Telephone and face to face methods of assessment of veteran's community reintegration yield equivalent results
© Resnik et al; licensee BioMed Central Ltd. 2011
Received: 23 March 2011
Accepted: 25 June 2011
Published: 25 June 2011
The Community Reintegration of Service Members (CRIS) is a new measure of community reintegration developed to measure veterans' participation in life roles. It consists of three sub-scales: Extent of Participation (Extent), Perceived Limitations with Participation (Perceived), and Satisfaction with Participation (Satisfaction). Testing of the CRIS measure to date has utilized in-person administration. Administration of the CRIS measure by telephone, if equivalent to in-person administration, would be desirable to lower cost and decrease administrative burden. The purpose of this study was to test the equivalence of telephone and in-person modes of CRIS administration.
A convenience sample of 102 subjects (76% male, 24% female; mean age = 49 years, standard deviation = 8.3) was randomly assigned to receive either a telephone interview at Visit 1 and an in-person interview at Visit 2, or an in-person interview at Visit 1 and a telephone interview at Visit 2. Both visits were conducted within one week. Intraclass correlation coefficients, ICC (2,1), were used to evaluate correspondence between modes for both item scores and summary scores. ANOVAs with mode order as a covariate were used to test for the presence of an ordering effect.
ICCs (95%CI) for the subscales were 0.92 (0.88-0.94) for Extent, 0.85 (0.80-0.90) for Perceived, and 0.89 (0.84-0.93) for Satisfaction. No ordering effect was observed.
Telephone administration of the CRIS measure yielded equivalent results to in-person administration. Telephone administration of the CRIS may enable lower costs of administration and greater adoption.
More than 2 million U.S. troops have been deployed in recent conflicts in Iraq and Afghanistan (Operation Enduring Freedom/Operation Iraqi Freedom [OEF/OIF]). The toll of these wars is high, with 31,800 troops wounded (as of May 2010) and an estimated 790,000 expected to seek disability benefits for service-related health problems. Returning service members have been reported to face a wide range of problems in returning to community life including psychological problems, mild traumatic brain injury, marital and financial difficulty, problems with alcohol or substance abuse, and motor vehicle accidents [2–5].
A recent survey found that more than half (52%) of OEF/OIF Veterans had problems controlling anger, 49% reported that their participation in community activities had been affected, and 42% reported problems getting along with an intimate partner. A quarter of returning Veterans reported problems in employment and almost as many (20%) reported legal problems.
It is a Department of Veterans Affairs (VA) priority to help these OEF/OIF Veterans return to full participation in community life roles. Thus, measurement of community reintegration is needed to track Veteran health and social functioning and assess the impact of treatment and policy. The Community Reintegration of Service Members (CRIS) is a new measure of community reintegration developed with VA funding to measure participation in life roles as defined by the International Classification of Functioning, Disability and Health (ICF).
Items on the CRIS cover 9 aspects, called chapters, in the taxonomy of Activities and Participation as described by the ICF: (1) Learning and Applying Knowledge, (2) General Tasks and Demands, (3) Communication, (4) Mobility, (5) Self-care, (6) Domestic Life, (7) Interpersonal Relationships, (8) Major Life Areas, and (9) Community, Social and Civic Life. The CRIS's three scales measure three dimensions: (1) objective and (2) subjective aspects of participation as well as (3) satisfaction with participation. Items from the CRIS measure are shown in Additional File 1, Appendix A. The Extent of Participation scale asks the respondent to indicate how often he or she experiences or participates in specific activities. The Perceived Limitations in Participation scale asks the respondent to indicate his or her perceived limitations in participation. Lastly, the Satisfaction with Participation scale asks the respondent to indicate the degree of satisfaction with different aspects of participation. In designing the CRIS fixed form scales, we included only those items that demonstrated intraclass correlation coefficients (ICCs) > 0.6 in our pilot same-mode test-retest reliability studies.
Previous research showed that the three fixed form CRIS scales demonstrated strong reliability, conceptual integrity and construct validity [7, 8]. These findings suggest that the CRIS measure possesses strong psychometric properties and support its use as a standardized assessment measure for the monitoring of community reintegration outcomes of Veterans and wounded warriors from recent conflicts.
All testing of the CRIS measure prior to this study utilized in-person survey administration. However, administration of the CRIS measure by telephone would expand the utility of the CRIS by lowering the cost and decreasing the burden of administration, ultimately increasing the likelihood of the measure's adoption. Telephone surveys do not require travel, are not affected by geographic distribution of subjects, and are easily monitored for quality. Thus, they may be a more economical means of conducting interviews. That said, we were concerned, based on the prior literature, that telephone and in-person administration might yield varying results due to: (a) the CRIS's complex response format, which could be confusing by telephone administration; (b) the cognitive demands of completing the survey by telephone [12–14]; and (c) greater potential for social desirability bias in in-person interviews [15, 16]. Previous studies have reported an ordering effect in repeat administration of quality of life measures using telephone versus mail administration and telephone versus web administration, and recommend that mixing of questionnaire modes be avoided when gathering certain types of data [17, 19]. Thus, we examined potential ordering effects in our analyses.
No prior studies have examined the effect of interview mode, or the effect of mode ordering on the responses of subjects to questions related to their community reintegration. Thus, the overall purpose of this study was to test the equivalence of mode of survey administration of the CRIS measure. Specifically, we examined concurrent criterion validity of the telephone administration of the CRIS, examined whether patient responses to the CRIS measure varied by mode of survey administration (telephone or in-person); and examined whether or not order of survey mode administration (telephone or in-person) was associated with differences in score means and variances. We hypothesized that 1) CRIS scores derived from the telephone administration would be equivalent to those derived through in-person administration and 2) order of survey mode administration would not influence CRIS scores.
A convenience sample of 102 subjects from the Providence VA Medical Center (PVAMC) was recruited. The Institutional Review Board of the PVAMC approved the research study.
Prior to full-scale study implementation, the interview script was modified to facilitate telephone administration and refined based on experiences during pilot testing with 5 subjects. After completion of the pilot testing, prospective subjects who expressed an interest in study participation were scheduled for an in-person visit with a research assistant whose sole function was to recruit, schedule and consent subjects. After the consent was completed, subjects were randomly assigned to one of two groups and scheduled for interviews. The first group was administered the telephone interview in the first session followed by an in-person interview in a second session. The second group was administered the in-person interview in the first session followed by the telephone interview in a second session. The two data collection sessions for each participant took place within one week. To minimize the possibility of social desirability bias in the telephone-first group, all interviews were conducted by a second research assistant who had not been involved in the recruitment, initial scheduling or consent process.
Table 1. Demographics by Randomization Group (N = 102)
Characteristic | In-Person followed by Telephone (n = 50) | Telephone followed by In-Person (n = 52) | Total (n = 102)
 | Mean (SD) Range | Mean (SD) Range | Mean (SD) Range
CRIS Extent of Limitations | 50.1 (8.1) 28-65 | 51.3 (7.3) 25-63 | 50.7 (7.6) 25-65
CRIS Perceived Limitations | 51.2 (10.0) 26-70 | 51.2 (8.8) 29-70 | 51.2 (9.3) 26-70
CRIS Satisfaction with Participation | 51.7 (10.0) 25-70 | 53.0 (9.4) 24-69 | 52.4 (9.5) 24-70
Age (years) | 50.0 (8.6) 24-59 | 49.3 (8.1) 23-59 | 49.6 (8.3) 23-59
Live with children under 18
Not working due to disability
Retired, not working
Retired and Working
Less than 15K
15K to 25K
25K to 35K
35k to 50k
50K to 75K
Mental Illness Diagnosis
Alcohol/Drug abuse Diagnosis
We compared characteristics of the two groups: telephone administration first and in-person administration first, using t-tests for continuous variables and chi-square tests for categorical variables. We used intraclass correlation coefficients, ICC (2,1), to evaluate correspondence between modes for both item scores and summary scores. We used the Shrout & Fleiss (type 2,1) intraclass correlation coefficient, a two-way random effects single measure reliability, where the target and the number of measurements on each target are random effects, and the unit of analysis is the individual measurement instead of the mean of measurements. ICCs above 0.5 were considered as an indication of moderate consistency between modes. Items with ICCs lower than 0.5 were inspected for content. Box plots of mean score difference between mode, stratified by type of first interview mode (telephone or in-person), were used to visually display possible modal or ordering effect. Finally, ANOVAs on summary scores with mode order as a covariate were used as a statistical test for presence of any ordering effect.
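The ICC (2,1) described above can be illustrated with a minimal sketch. It follows the Shrout & Fleiss two-way random effects, single-measure formula, treating each subject as a target and each administration mode as a "judge". The function name `icc_2_1` and the example scores are hypothetical and are not taken from the study data or the authors' analysis code.

```python
def icc_2_1(scores):
    """Shrout & Fleiss ICC(2,1) for a table of scores.

    scores: list of rows, one row per subject (target); each row holds
    one score per mode (judge), e.g. [in_person_score, telephone_score].
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of modes
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    jms = ss_cols / (k - 1)                 # between-modes mean square
    ems = ss_err / ((n - 1) * (k - 1))      # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical summary scores for five subjects: [in-person, telephone].
example = [[48, 50], [55, 53], [61, 60], [42, 45], [50, 49]]
print(round(icc_2_1(example), 3))
```

Because the between-modes mean square appears in the denominator, a systematic shift between telephone and in-person scores lowers ICC (2,1), which is why this form is suited to checking agreement rather than only consistency between modes.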
One hundred and two subjects were randomized into two groups. Subjects in Group 1 were administered the CRIS measure in-person at Visit 1 and by telephone at Visit 2, and subjects in Group 2 were administered the CRIS measure by telephone at Visit 1 and in-person at Visit 2. Table 1 shows the characteristics of the subjects by group. No statistically significant differences between groups were observed for any of the characteristics shown in Table 1.
Consistency of CRIS Scale Scores by Mode of Administration (Telephone and In-person)
Extent of Participation
Satisfaction with Participation
Items in CRIS Scales with ICCs below 0.5
How often did you engage in risky behavior?
How often were you able to do several things in a row, such as following directions or doing several tasks one after the other?
How often did you fulfill all of the duties of your job?
I remembered what I read.
I got along with people at work.
I was limited in following directions.
I was limited in keeping track of my daily tasks and activities.
Others expressed distress while being a passenger in my car.
I was limited in doing volunteer activities.
How satisfied were you with your job performance?
Results of ANOVAs of summary scores examining differences between mode of administration and order of interview mode
Order × Mode
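One way to picture the ordering-effect check is to compute each subject's score difference between modes and ask whether the average difference depends on which mode came first. The sketch below is an illustration only: it uses a permutation test on difference scores rather than the ANOVA reported in this study, and the function names, default settings and example numbers are all hypothetical.

```python
import random
from statistics import mean


def mode_differences(pairs):
    """Per-subject difference: in-person score minus telephone score."""
    return [in_person - phone for in_person, phone in pairs]


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the proportion of random relabelings whose absolute mean
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:   # tolerance for float rounding
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical (in-person, telephone) summary scores by randomization group.
in_person_first = [(50, 51), (47, 46), (58, 57), (43, 45), (52, 52)]
telephone_first = [(49, 50), (55, 53), (61, 61), (44, 43), (51, 52)]

diff_a = mode_differences(in_person_first)
diff_b = mode_differences(telephone_first)
print(mean(diff_a), mean(diff_b), permutation_p_value(diff_a, diff_b))
```

A large p-value from such a check would be consistent with no ordering effect, and mean differences near zero in both groups would be consistent with no mode effect. Neither result would establish equivalence by itself.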
This study tested the comparability of telephone and in-person modes of administration of a new measure of community reintegration for veterans, called the CRIS. We found, based upon ICCs ranging from 0.85 to 0.92, that summary scores for the three CRIS subscales were largely comparable between modes. The cut-point for acceptable reliability coefficients varies by field of study, with different values acceptable for different applications. Generally speaking, ICCs above 0.85 are considered acceptable for making decisions about individuals. Nunnally recommends a minimum reliability of 0.70 for use of a scale in research and 0.90 for use in clinical practice. As a point of reference, only two of the widely used scales of the SF-36 have reliabilities above 0.90.
To confirm that our sample size of 102 persons was adequate, we conducted post-hoc power calculations. For the reliability analysis, we estimated that we achieved power of 80% to detect an ICC of 0.9 under the alternative hypothesis (which is the approximate value for CRIS subscale ICCs), when the ICC under the null hypothesis is 0.81, using an F-test with alpha = 0.05, and two samples of 50 persons each.
We found that 141/151 (93%) of items had ICCs of 0.5 or above, indicating moderate reliability at the item level. However, we did note that 10 of 151 CRIS items (< 7%) had ICCs below 0.5, indicating potential non-equivalence of telephone and in-person administration modes for these items. These items included ones about working, risk taking, and multitasking. These findings should be interpreted cautiously because confidence intervals for the ICC estimates in the current study were wide, and the upper bound of the confidence limits for all items exceeded 0.5. Three items with ICC point values below 0.5 were questions about participation in work or work situations. We believe that these items had very large confidence intervals due to the low percentage of respondents who were working (37%) and the smaller number of subjects who answered each of these questions.
The CRIS scales utilize a complex response format consisting of 7-point Likert-like response scales. There are multiple types of response scales in the measure, each with differing categories of responses (See Additional File 2, Appendix B for response scales). Prior research on telephone versus in-person administration reports both advantages and disadvantages of each mode as well as equivalence between modes. De Vaus suggests that in-person interviews may be preferable for surveys of complex questions with multiple response categories because telephone respondents may have difficulty remembering multiple categories when they answer questions with a large number of response categories. While telephone respondents may have response cards mailed to them in advance of an interview, for practical purposes this is less than optimal because it requires advance planning and assumes that respondents refer to the cards appropriately during the interview. Because of this, we did not mail response cards in this study. In contrast, in-person respondents have a visual aid, in the form of the response scale displayed in front of them as they answer each item, as well as an interviewer who can respond to facial expressions suggesting confusion and who can point to the appropriate response display while explaining the item.
Telephone respondents have been reported to be less patient with interviews and to avoid conversation that may lengthen the interview. Some data suggest that telephone interviews are generally completed more quickly than equivalent in-person interviews. Telephone respondents are in an uncontrolled environment and may be distracted during interviews by things in their environment, or they may be multi-tasking at home by watching TV, cooking or even interacting with others while responding to the interviewer. Thus, they may be less likely to exert the mental effort to answer questions carefully. A respondent answering a long survey may lose motivation, become fatigued and/or lose focus and be unable to sustain the mental effort needed to carefully consider and answer survey questions. When these things occur, the respondent may be more likely to respond in a manner that they believe would seem acceptable or reasonable to the interviewer. Non-verbal cues provided through face-to-face interviewing could potentially enhance the motivation of subjects, keeping them more engaged and thus more likely to respond carefully. Furthermore, the more controlled environment of a face-to-face interview can minimize distractions. While we had no way to monitor a telephone respondent's behavior (i.e. potential distractions from multi-tasking), our results suggest that the potential effect on survey responses was negligible.
While in-person respondents may be motivated by the development of greater rapport and enhanced task performance, the presence of an interviewer may create other biases. Face-to-face interviews may be more biased due to respondents' desire to express socially acceptable characteristics, and may be influenced by the gender and other observable characteristics of the interviewer. Previous research suggests that social desirability bias is more likely to occur when questions relate to sensitive topics such as sexuality, drug use and risk taking behavior; topics that are included in the CRIS.
Greater physical distance between the respondent and the interviewer may provide a greater sense of safety and lead to responses that are more candid. Thus, one would expect that face-to-face interviews would diminish social distance and lead to greater social desirability bias in survey responses because the respondent is observed directly by the interviewer, to whose non-verbal signs of approval or disapproval, in the form of facial expression or body language, the respondent can react. This expectation is supported by reports suggesting that the greater anonymity associated with telephone surveys yields more candid reports of risky or socially disapproved behavior [25, 26]. However, other researchers have reported the opposite effect, indicating that respondents to in-person interviews were more likely to report vulnerabilities, such as disability, than respondents to telephone interviews [13, 27]. It is possible that potential social desirability bias related to sensitive behavior might impact several of the CRIS items, particularly those related to risky behavior and frequency of sexual activities.
While it is possible that the lower ICC values of the items related to risk taking behavior and driving safety that we observed in this study might be attributable to social desirability bias, we do not believe that this was the case. If social desirability were a factor, we might expect that subjects would report higher functioning (i.e. higher scores) during the in-person interview as compared to the telephone interview. We would also have expected to find a lower ICC value for the item related to frequency of sexual relations. Our examination of the raw data shows that the mean of the responses to the question, "How often did you engage in risky behavior?" was lower (mean = 6.1, sd = 1.6) for the in-person administration than it was for the telephone administration (mean = 6.5; sd = 1.2). The means of the responses to the item, "Others expressed distress while being a passenger in my car," were nearly identical: 5.6 (sd 1.5) for the in-person administration and 5.6 (sd 1.4) for the telephone administration. None of these differences were statistically significant. Thus, we believe that the lower ICCs resulted from the wide confidence intervals around the point estimate, rather than differences between modes of administration.
There were five additional items with ICCs below 0.5. Because these items related to multitasking, remembering what was read, keeping track of daily tasks and activities, and limitations in volunteer work, we would not have expected them to be particularly affected by social desirability bias. Examination of the raw data (not shown) shows nearly identical mean scores for the groups, suggesting that the lower ICC values were not a substantial concern, and reflected a lack of precision around the estimates in this sample. Additional research is necessary to confirm this finding.
Our study design limits inferences about whether or not potential differences in item responses between modes were attributable to the mode of survey administration or to the actual test-retest reliability of the item. Test-retest reliability is not an inherent property of a measurement instrument, but can vary by population. However, prior research using repeat administration of the in-person CRIS in a very similar sample showed that all items had ICCs of > 0.6. Further research testing equivalence of mode of administration is needed to confirm our current findings.
In conclusion, there appears to be good potential to use the CRIS fixed form measure by telephone administration. The overall scores were comparable between modes, and ICC values for the total scores and for 93% of items indicated acceptable reliability. Since publication of the original article describing CRIS development, the author has received multiple inquiries regarding use of the CRIS measure for research, surveillance and clinical assessment of Veterans. Based upon this research, we believe that use of telephone administration is justified by the overall score equivalence, increased convenience and lower cost of this mode of administration.
Linda Resnik, PT, PhD is a Research Health Scientist at the Providence VA Medical Center and Associate Professor (Research) in the Department of Community Health, Brown University, Providence, RI
Melissa A. Clark, PhD is Associate Professor, Department of Community Health and Obstetrics and Gynecology, Brown University
Matthew Borgia, BS is a graduate student in the Department of Biostatistics, Brown University
Acknowledgements and Funding
This research and the time and effort of all authors were supported by the Department of Veterans Affairs, Veterans Health Administration, Office of Research and Development HSR&D DHI-07-144.
The authors would like to acknowledge Regina Lynch and Pam Steager for their assistance with subject recruitment and data collection.
- Iraq Index. [http://www.brookings.edu/saban/iraq-index.aspx]
- Institute of Medicine: Returning Home from Iraq and Afghanistan: Preliminary Assessment of Readjustment Needs of Veterans, Service Members, and Their Families. 2010, Washington, DC: National Academies Press
- Hoge CW, Auchterlonie JL, Milliken CS: Mental health problems, use of mental health services, and attrition from military service after returning from deployment to Iraq or Afghanistan. JAMA. 2006, 295 (9): 1023-1032. 10.1001/jama.295.9.1023.
- Milliken C, Auchterlonie J, Hoge C: Longitudinal assessment of mental health problems among active and reserve component soldiers returning from the Iraq war. JAMA. 2007, 298: 2141-2148. 10.1001/jama.298.18.2141.
- Sayer NA, Chiros CE, Sigford B, Scott S, Clothier B, Pickett T, Lew HL: Characteristics and rehabilitation outcomes among patients with blast and other injuries sustained during the Global War on Terror. Arch Phys Med Rehabil. 2008, 89 (1): 163-170. 10.1016/j.apmr.2007.05.025.
- Sayer N, Noorbaloochi S, Frazier P, Carlson K, Gravely A, Murdoch M: Reintegration problems and treatment interests among Iraq and Afghanistan combat veterans receiving VA medical care. Psychiatr Serv. 2010, 61 (6): 589-597. 10.1176/appi.ps.61.6.589.
- Resnik L, Plow M, Jette A: Development of the CRIS: A Measure of Community Reintegration of Injured Services Members. Journal of Rehabilitation Research and Development. 2009, 46 (4): 469-480. 10.1682/JRRD.2008.07.0082.
- Resnik L, Gray M, Borgia M: Measurement of community reintegration in sample of severely wounded servicemembers. J Rehabil Res Dev. 2011, 48 (2): 89-102. 10.1682/JRRD.2010.04.0070.
- Weeks M, Kulka R, Lessler J, Whitmore R: Personal versus Telephone Surveys For Collecting Household Health Data at the Local Level. American Journal of Public Health. 1983, 73 (12): 1389-1394. 10.2105/AJPH.73.12.1389.
- Warner J, Bermna J, Weyant J, Ciarlo J: Assessing mental health program effectiveness: a comparison of three client follow-up methods. Evaluation Review. 1983, 7: 635-658. 10.1177/0193841X8300700503.
- De Vaus DA: Surveys in social research. 1995, St. Leonards, NSW: Allen & Unwin, 4
- Schuman H, Presser S: Questions and answers in attitude surveys: experiments on question form, wording, and context. 1981, New York: Academic Press
- Holbrook A, Green M, Krosnick J: Telephone Versus Face-to-Face Interviewing of National Probability Samples with Long Questionnaires. Public Opinion Quarterly. 2003, 67: 79-125. 10.1086/346010.
- Krosnick J, Narayan S, Smith W, (Eds): Satisficing in surveys: Initial evidence. 1996, San Francisco: Jossey-Bass
- Drolet A, Morris M: Rapport in Conflict Resolution: Accounting for How Face-to-Face Contact Fosters Mutual Cooperation in Mixed-Motive Conflicts. Journal of Experimental Social Psychology. 2000, 36: 26-50. 10.1006/jesp.1999.1395.
- Tourangeau R, Smith T: Asking Sensitive questions: The impact of data collection, question format, and question context. Public Opinion Quarterly. 1996, 69: 275-304.
- Hays RD, Kim S, Spritzer K, Kaplan R, Tally S, Feeny D, Liu H, Fryback D: Effects of Mode and Order of Administration on Generic Health-Related Quality of Life Scores. Value in Health. 2009, 12 (6): 1035-1039. 10.1111/j.1524-4733.2009.00566.x.
- Greene J, Wiitala W: Telephone and Web: Mixed-Mode Challenge. Health Services Research. 2007, 43 (1): 230-248. 10.1111/j.1475-6773.2007.00747.x.
- Laungenahusen M, Lange S, Maier C, Schaub C, Trampisch H, Endres H: BMC Medical Research Methodology. 2007, 7 (50).
- Shrout PE, Fleiss JL: Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979, 86 (2): 420-428.
- Winer EA, Stewart BJ: Assessing Individuals. 1984, Boston, MA: Little Brown
- Nunnally JC: Psychometric Theory. 1978, New York, New York: McGraw-Hill
- Streiner D, Norman G: Health Measurement Scales: a practical guide to their development and use. 2003, New York: Oxford University Press
- Donner A, Eljaszw M: Sample size requirements for reliability studies. Statistics in medicine. 1987, 6 (4): 441-448. 10.1002/sim.4780060404.
- McQueen D: Comparison of results of personal interviews and telephone surveys of behavior related to risk of AIDS: Advantages of telephone techniques. Health survey research methods, Rockville, MD. 1989
- Hochstim J: A critical comparison of three strategies of collecting data from households. J Clin Epidemiol. 1998, 51 (11): 961-967. 10.1016/S0895-4356(98)00087-0.
- Aneshensel CS, Frerichs RR, Clark VA, Yokopenic P: Telephone versus in-person surveys of community health status. Am J Public Health. 1982, 72 (9): 1017-1021. 10.2105/AJPH.72.9.1017.
- Rothstein J, Echternach J: Primer on Measurement: An Introductory Guide to Measurement Issues. 1993, Alexandria, VA: American Physical Therapy Association, 83-84.
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/98/prepub