Re-evaluating a vision-related quality of life questionnaire with item response theory (IRT) and differential item functioning (DIF) analyses
© van Nispen et al; licensee BioMed Central Ltd. 2011
Received: 16 September 2010
Accepted: 2 September 2011
Published: 2 September 2011
For the Low Vision Quality Of Life questionnaire (LVQOL) it is unknown whether the psychometric properties are satisfactory when an item response theory (IRT) perspective is considered. This study evaluates some essential psychometric properties of the LVQOL questionnaire in an IRT model, and investigates differential item functioning (DIF).
Cross-sectional data were used from an observational study among visually-impaired patients (n = 296). Calibration was performed for every dimension of the LVQOL in the graded response model. Item goodness-of-fit was assessed with the S-X² test. DIF was assessed on relevant background variables (i.e. age, gender, visual acuity, eye condition, rehabilitation type and administration type) with likelihood-ratio tests for DIF. The magnitude of DIF was interpreted by assessing the largest difference in expected scores between subgroups. Measurement precision was assessed by presenting test information curves; reliability with the index of subject separation.
All items of the LVQOL dimensions fitted the model. There was significant DIF on several items. For two items the maximum difference between expected scores exceeded one point, and DIF was found on multiple relevant background variables. Item 1 'Vision in general' from the "Adjustment" dimension and item 24 'Using tools' from the "Reading and fine work" dimension were removed. Test information was highest for the "Reading and fine work" dimension. Indices for subject separation ranged from 0.83 to 0.94.
The items of the LVQOL showed satisfactory item fit to the graded response model; however, two items were removed because of DIF. The adapted LVQOL with 21 items is DIF-free and therefore seems highly appropriate for use in heterogeneous populations of visually impaired patients.
Keywords: Visual impairment; Vision-related quality of life; Item response theory; Graded response model; Differential item functioning
The detrimental effects of living with vision loss caused by irreversible eye conditions (such as age-related macular degeneration or diabetic retinopathy) are well reported. Research in low vision has primarily focused on older adult populations, because of the increased prevalence of age-related eye conditions in older age [2–8]. Those studies used several vision-related quality-of-life questionnaires, which allow assessment of the disability experienced in daily life [9, 10]. In their review, de Boer et al. reported that the original Low Vision Quality Of Life questionnaire (LVQOL) was one of the best for use in patients with low vision [11, 12]; its items are mainly related to difficulties people have in performing certain activities due to their visual disability. In a few studies within the framework of classical test theory, de Boer et al. translated and further validated the Dutch version of the LVQOL [13, 14]. In two subsequent studies on the longitudinal outcomes of low vision rehabilitation, additional comments on the validity of the LVQOL were made using item response theory (IRT); however, a calibration process was not performed [5, 15]. In these studies, which were performed on the data previously used by de Boer et al. [13, 14], it was concluded that on the dimension "Reading and fine work", the item invariance assumption did not hold over time. This lack of item invariance might not have arisen if the items had been calibrated in an IRT model beforehand.
Nowadays, IRT models are recommended for evaluating patient-reported outcomes; some questionnaires have been re-evaluated using the Rasch model [9, 16–19], which is considered a special case of an IRT model. IRT models represent a collection of statistical models for item analysis in questionnaires that measure a latent construct, in this case vision-related quality of life, and for estimating individual scores for the construct, based on responses to the items. Another IRT model is the graded response model (GRM), which is a cumulative probability model. Although the Rasch model has favorable measurement properties, such as statistical sufficiency and specific objectivity, it is often too restrictive, especially for existing tests (developed in the classical test theory framework). For evaluative purposes, less constrained models such as the GRM often give a more realistic reflection of the data compared to Rasch or partial credit models. Furthermore, studies on the cognitive processing involved in choosing response options suggest that the GRM is most appropriate for Likert-type items [21–23]. Another advantage of the GRM is that although a normal distribution of the latent variable is assumed, the model is quite robust to slight deviations from normality [24, 25].
In an IRT calibration process some steps need to be taken, such as assessing item fit and differential item functioning (DIF). A large proportion of items with DIF is a severe threat to the construct validity of a test and thus to the ability to draw conclusions based on the test scores. Variables that potentially lead to DIF are demographic variables. A DIF analysis makes it possible to examine the relationship between item responses and another variable, such as gender or age group, conditional on a measure of the latent construct, i.e. vision-related quality-of-life. Disease-related variables may also lead to DIF, e.g. items may be interpreted differently by patients with different eye conditions, but with a similar disability level. Although vision-related quality-of-life questionnaires measure at the disability level, items could be problematic to patients in different ways due to differences in visual impairment, such as visual acuity or field loss. A DIF analysis could indicate whether there should be separate calibrations for populations with specific eye conditions or demographic variables.
Since the LVQOL has not yet been calibrated, it remains unknown whether the items appropriately fit an IRT model. Therefore, the present study evaluates some essential psychometric properties of the LVQOL, including assessing item goodness-of-fit and DIF between subgroups.
Design and participants
Cross-sectional data were obtained from a longitudinal observational study among visually impaired older patients on the vision-related quality-of-life effects of two types of low-vision rehabilitation (optometric service and multidisciplinary rehabilitation service) [4, 10]. Consecutive patients (n = 357) were recruited from the ophthalmology departments of four hospitals in the Netherlands between July 2000 and January 2003. The eligibility requirements for inclusion in the study were referral to either the optometrist or the multidisciplinary low-vision service by an ophthalmologist, age over 50 years, no previous contact with low-vision rehabilitation services, irreversible vision loss, adequate understanding of the Dutch language, and adequate cognitive abilities. Patients who met the inclusion criteria were informed about the study and were invited to participate. Of the eligible patients, 17.1% did not participate. Baseline data were available for 296 visually impaired patients. Written consent was obtained from all participants. The study protocol was approved by the Medical Ethics Committee of the VU University Medical Center, and the study was conducted according to the principles of the Declaration of Helsinki.
Demographic variables and other characteristics (e.g. age, gender and main cause of vision loss) were taken from the patients' hospital charts. Rehabilitation type was either the optometric or the multidisciplinary service. Distance visual acuity was assessed for all participants by their ophthalmologist by projection and with habitual correction for both eyes separately. To enable meaningful computations, decimal visual acuity values were transformed to logMAR values, where higher values represent more vision loss (i.e. lower visual acuity).
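The transformation to logMAR can be illustrated with the standard definition logMAR = log10(1 / decimal acuity). The following is a minimal sketch only; the function name `decimal_to_logmar` is hypothetical and is not taken from the study.

```python
import math

def decimal_to_logmar(decimal_va: float) -> float:
    """Convert decimal visual acuity to logMAR (standard definition)."""
    if decimal_va <= 0:
        raise ValueError("decimal visual acuity must be positive")
    return math.log10(1.0 / decimal_va)

# Higher logMAR values correspond to lower decimal acuity (more vision loss).
for va in (1.0, 0.5, 0.3, 0.1):
    print(f"decimal {va:.1f} -> logMAR {decimal_to_logmar(va):.2f}")
```

Under this definition a decimal acuity of 0.3 corresponds to a logMAR value of about 0.52, which is the cut-off used for the visual acuity subgroups in the DIF analyses.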
The LVQOL was previously forward and backward translated by two different native speakers on separate occasions, and the few dissimilarities were resolved. In the present study, the Dutch version of the LVQOL was re-evaluated. The questionnaire was in large print and was completed by the patients either independently or with assistance from others. The 25 items on the LVQOL are mainly related to difficulties people have in performing certain activities due to their visual disability, rated on a 6-point Likert-type scale: 0 "No problem" to 5 "Not able because of vision". In our previous study two items were removed from the questionnaire; therefore, this report is based on 23 items.
Validation and statistical analyses
Assessing dimensionality and local independence
Unidimensionality is a critical assumption of IRT. It refers to whether a person's response to an item that measures a construct is accounted for by the level on that trait, and not by other factors. In a previous study, dimensionality of the LVQOL was investigated on baseline data of the low-vision rehabilitation effect study. In summary, an exploratory factor analysis on polychoric correlations with Promax rotation was carried out in Mplus version 3.13. The model parameters were estimated applying weighted least squares with mean and variance correction (WLSMV). Item 5 "Problems reading street name signs" and item 25 "Problems doing household tasks" had low factor loadings and made interpretation of the factors confusing (both items loaded almost equally on two factors). After removing items 5 and 25, the factor analysis yielded four dimensions: "Mobility", "Reading and fine work", "Adjustment" and "Basic aspects" (explained variance 75%). The root mean-square residual, which is an index of global model fit, was satisfactory (0.03), and factor loadings were all higher than 0.40. The Cronbach's alpha values for these dimensions were 0.84, 0.90, 0.82 and 0.93, respectively.
To further prepare for the IRT analyses, we assessed local independence of items by inspecting the residual correlation matrix for possible excess correlation among items. Local dependence could arise from items with a similar content or wording. The residual correlation was largest between items 17 "Reading large print" and 24 "Using tools" (-0.11); the other residual correlations were never higher than 0.09 and were therefore not considered to be a problem. The psychometric properties of the LVQOL dimensions were further assessed with an IRT model.
In the present study, we used the GRM, which is a generalization of the two-parameter logistic model, to evaluate the LVQOL.
It is assumed that the prior distribution of the person parameter (θ_s) is standard normal (mean 0; SD 1). The item parameters were estimated in MULTILOG by the method of marginal maximum likelihood. Subsequently, posterior estimates of θ_s can be obtained.
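The model equation is not reproduced in this text. In Samejima's standard formulation, which is assumed here, the probability of responding in category j or higher is a two-parameter logistic function of the person parameter with an item discrimination parameter and ordered threshold parameters, and the category probabilities are differences of adjacent cumulative probabilities. A minimal sketch with made-up parameter values (not estimates from this study):

```python
import math

def grm_category_probs(theta, alpha, betas):
    """Category probabilities for one item under the graded response model.

    theta : person parameter
    alpha : item discrimination
    betas : ordered thresholds (m thresholds give m + 1 categories)
    """
    # Cumulative probabilities P(X >= j), padded with P(X >= 0) = 1
    # and P(X >= m + 1) = 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-alpha * (theta - b))) for b in betas]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(betas) + 1)]

# Hypothetical item with six response categories (0-5), hence five thresholds.
probs = grm_category_probs(theta=0.0, alpha=1.8, betas=[-2.0, -1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
print(round(sum(probs), 6))  # category probabilities sum to 1
```

With six response categories each item has one discrimination parameter and five thresholds, which is why the uniform DIF test described later uses five degrees of freedom.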
Even after unidimensionality and local independence have been investigated, some items may remain that do not fit the GRM. Applications of IRT implicitly assume that the model is correct; that is, expected item scores should increase monotonically and the item response model should reflect the data accurately. Although a certain amount of misfit is inherent to every model, considerable misfit should be avoided. Item fit can be examined by comparing model predictions (expectations) and observed data. By using item tests, decisions can be made as to whether it is necessary to remove any items. Therefore, item goodness-of-fit was investigated with an item test by Bjorner et al., which is implemented in SAS [31, 32]. This item test is a generalization of the item test for dichotomous response categories that was developed by Orlando and Thissen and is known as the S-X² test [33, 34]. Items were considered as misfitting the model if p < 0.01.
Examining DIF is important in the investigation of the equivalence of items across subgroups differing in background characteristics [28, 35]. We investigated DIF on the subgroup variables age (arbitrarily dichotomized at > 75 versus ≤ 75 years), gender (male versus female), main cause of vision loss in the best eye (age-related macular degeneration versus other eye conditions), rehabilitation type (optometrist versus multidisciplinary service), logMAR visual acuity level (≥ 0.52, low vision/blindness, versus < 0.52, mild vision loss), and type of administration (self-reported versus assisted by a significant other who filled out the questionnaire together with the patient). Two types of DIF were investigated. Uniform DIF indicates that the item bias is in the same direction at all levels of the disability continuum, where one subgroup seems to have a consistently higher or lower likelihood to respond favorably to an item compared to its counterpart. In contrast to items with dichotomous response categories, for polytomous items this may vary for every threshold parameter β_ij, i.e. without affecting the discrimination parameter α_i. Non-uniform DIF indicates dissimilarity in α_i between subgroups, conditional on the disability level, which reflects subgroup by ability interaction. DIF analyses were performed with software for the computation of statistics involved in IRT likelihood-ratio tests for DIF (IRTLRDIF) by Thissen [36, 37]. This approach tests the null hypothesis that α_i is equal for two subgroups (absence of non-uniform DIF), yielding a chi-square (G²) statistic with one degree of freedom, and the null hypothesis that the β_ij are equal between subgroups (absence of uniform DIF), using five degrees of freedom. IRTLRDIF is based on a hierarchical structure, which means that β_ij is tested for uniform DIF only if the test for α_i is not significant. To correct for multiple testing, a p-value < 0.01 was taken to indicate statistically significant DIF.
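To illustrate how a likelihood-ratio statistic is compared with the 0.01 criterion, the sketch below computes the upper-tail chi-square probability for integer degrees of freedom using only the Python standard library. The helper `chi2_sf` is hypothetical and is not part of IRTLRDIF; the example value G²(5) = 18.1 is the uniform DIF statistic reported for item 19 in the Results.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite correction series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total

ALPHA = 0.01      # significance level used in the study
g2, df = 18.1, 5  # uniform DIF test: five thresholds, five degrees of freedom
p = chi2_sf(g2, df)
print(f"G2({df}) = {g2}: p = {p:.4f}, DIF flagged: {p < ALPHA}")
```

The same function with df = 1 applies to the test of equal discrimination parameters (non-uniform DIF).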
To gain more insight into DIF items (particularly to examine the magnitude of DIF between subgroups), we calculated differences in expected scores for those subgroups. The magnitude of DIF was presented as the maximum difference between expected scores. When DIF cannot be resolved, a solution would be to separately estimate item parameters for subgroups; those parameters can subsequently be used to estimate the person parameter (θ_s). Another solution is to remove the item. In the present study, items were removed if the magnitude of DIF was large (a difference of > 1 point between expected scores on the item), if there was DIF on more than one subgroup variable, or if DIF was present on a relatively large part of the disability continuum. After removing DIF items, the dimensions of the LVQOL were re-calibrated and DIF analyses were repeated to see whether DIF on other items would resolve. Subsequently, 'test information' was presented for the dimensions of the LVQOL. Test information refers to the range of the underlying construct over which (a dimension of) a test is most useful to distinguish between respondents. Therefore, information represents the reliability or measurement precision. The inverse of the square root of the information function is equivalent to the standard error (SE) of θ_s. Test information for the separate dimensions of the LVQOL was analyzed in MULTILOG and the corresponding curves presented. Finally, the reliability coefficient was calculated for θ_s of the separate LVQOL dimensions (index of subject separation).
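The two quantities described above, the maximum difference in expected item scores between subgroups and the standard error derived from test information, can be sketched as follows. This is an illustration under the standard GRM formulas with hypothetical parameter values; it does not reproduce the MULTILOG or IRTLRDIF computations of the study.

```python
import math

def cumulative_probs(theta, alpha, betas):
    """P(X >= j) for j = 1..m under the graded response model."""
    return [1.0 / (1.0 + math.exp(-alpha * (theta - b))) for b in betas]

def expected_score(theta, alpha, betas):
    """Expected item score; for categories 0..m it equals the sum of P(X >= j)."""
    return sum(cumulative_probs(theta, alpha, betas))

def max_expected_score_difference(params_a, params_b, grid):
    """Largest absolute gap between two expected-score curves on a theta grid."""
    return max(abs(expected_score(t, *params_a) - expected_score(t, *params_b))
               for t in grid)

def item_information(theta, alpha, betas):
    """GRM item information: sum over categories of (dP_k/dtheta)^2 / P_k."""
    cum = [1.0] + cumulative_probs(theta, alpha, betas) + [0.0]
    info = 0.0
    for k in range(len(betas) + 1):
        p_k = cum[k] - cum[k + 1]
        slope = alpha * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        if p_k > 0:
            info += slope ** 2 / p_k
    return info

grid = [x / 10 for x in range(-40, 41)]  # theta from -4 to 4

# Hypothetical subgroup-specific parameters (alpha, thresholds) for one item:
# equal discrimination, thresholds shifted by one unit (uniform DIF).
group_a = (1.5, [-1.5, -0.5, 0.5, 1.5, 2.5])
group_b = (1.5, [-0.5, 0.5, 1.5, 2.5, 3.5])
gap = max_expected_score_difference(group_a, group_b, grid)
print(f"maximum expected-score difference: {gap:.2f}")

# Test information is the sum of item informations; SE = 1 / sqrt(information).
items = [group_a, (2.0, [-1.0, 0.0, 1.0, 2.0, 3.0])]
for theta in (-2.0, 0.0, 2.0):
    info = sum(item_information(theta, a, b) for a, b in items)
    print(f"theta = {theta:+.1f}: information = {info:.2f}, SE = {1 / math.sqrt(info):.2f}")
```

In this hypothetical example the gap stays just below the 1-point criterion, so the item would not be removed on magnitude alone.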
Patient characteristics (N = 296)
Age: < 75 years
LogMAR visual acuity*: < 0.52 mild vision loss; ≥ 0.52 low vision/blindness
Main cause of vision loss in best eye*: age-related macular degeneration; other eye conditions
Item non-response and goodness-of-fit
The item non-response was 4.1% for "Basic aspects" (60 missing responses for 5 items); 4.8% for "Mobility" (71 missing responses for 5 items); 4.1% for "Adjustment" (61 missing responses for 5 items); and 4.8% for "Reading and fine work" (113 missing responses for 8 items). The total item non-response for the LVQOL was 4.5%. All items of the four separate LVQOL dimensions fit the GRM.
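The reported percentages can be checked with simple arithmetic, assuming (this is an assumption, as the denominator is not stated explicitly) that non-response is the number of missing responses divided by the number of items times the 296 patients:

```python
N_PATIENTS = 296

# (missing responses, number of items) per dimension, as reported in the text
dimensions = {
    "Basic aspects": (60, 5),
    "Mobility": (71, 5),
    "Adjustment": (61, 5),
    "Reading and fine work": (113, 8),
}

def nonresponse_pct(missing, n_items, n_patients=N_PATIENTS):
    """Missing responses as a percentage of all possible item responses."""
    return 100.0 * missing / (n_items * n_patients)

for name, (missing, n_items) in dimensions.items():
    print(f"{name}: {nonresponse_pct(missing, n_items):.1f}%")

total_missing = sum(m for m, _ in dimensions.values())
total_items = sum(k for _, k in dimensions.values())
print(f"Total ({total_items} items): {nonresponse_pct(total_missing, total_items):.1f}%")
```

Under this assumption the computed values agree with the percentages reported above.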
Differential item functioning
Items with DIF between subgroups of relevant variables
Eyes getting tired; Moderate vision loss; Seeing steps or curbs*; Moderate vision loss; Vision in general; Vision in general; Unhappy situation in life†; Reading and fine work
Although most subgroups were comparable on most characteristics, differences were found between the logMAR visual acuity subgroups, where patients with low vision/blindness significantly more often received assistance from someone to fill out the questionnaire (68%) than patients with mild vision loss (52%; p = 0.006). In addition, significantly fewer patients who went to the optometric service needed assistance with filling out the questionnaire (56%) compared to those who received multidisciplinary rehabilitation (70%; p = 0.012). Relatively more patients with age-related macular degeneration were in the 75+ age category (89%) than patients with other eye conditions (59%; p < 0.001).
Re-calibration after removing items
Item parameter estimates and fit statistics
Item content per dimension: Seeing moving objects; Eyes getting tired; Seeing the television; Glare (dazzled by lights); Getting right amount light; Night vision inside house; Seeing steps or curbs; Getting around outdoors; Crossing a road with traffic; Understand eye condition; Unhappy situation in life; Frustration with doing tasks; Visiting friends and family
Reading and fine work: Reading large print; Reading letters and mail; Finding out the time; Reading own handwriting
DIF analyses were repeated for "Adjustment" without item 1 on the subgroup variable gender. DIF for item 12 resolved at the p < 0.01 level. DIF analyses were repeated for "Reading and fine work" without item 24 on the subgroup variable eye condition. Uniform DIF remained for item 19 (G²(5) = 18.1; p < 0.01) between patients with age-related macular degeneration and patients with other eye conditions. However, the difference in expected scores remained small. Consequently, item 19 was not removed from this dimension.
Finally, the indices of subject separation were high for all dimensions: "Reading and fine work" (0.94); "Mobility" (0.91); "Basic aspects" (0.86); and "Adjustment" (0.83).
The purpose of this study was to assess some essential psychometric properties of the LVQOL using an IRT model. Special attention was paid to investigating DIF on relevant background variables. All items of the four LVQOL dimensions fit the GRM, including after two items were removed because of DIF. DIF was found on five items between subgroups of gender, visual acuity, administration modes and eye conditions. However, only item 1 'Vision in general' of the "Adjustment" dimension and item 24 'Using tools' of the "Reading and fine work" dimension were considered to be a problem. Item 1 had DIF between the administration mode subgroups and gender subgroups, where the difference in expected item scores remained relatively large along a large part of the disability continuum. Patients who self-administered the questionnaire gave lower responses to this item, conditional on their disability level, than patients who were assisted by a significant other, which was often a relative or spouse (91.3%; n = 183). Wolffsohn et al. found that patients who were assisted by someone reported higher disability levels measured with the LVQOL; they concluded that the subgroup which was assisted with administration had more vision loss and reduced contrast sensitivity than the self-report subgroup, but also suggested that the difference might reflect a negative bias introduced by the patient's relative. An earlier study in which the psychometric quality of the Vision-related quality of life Core Measure was assessed in the same visually impaired patient group reported similar results with DIF present on two items. Patients who were assisted had significantly more vision loss (mean logMAR Visual Acuity 0.74; SD 0.43) than patients who self-reported (mean 0.56; SD 0.90); this may explain why patients who were assisted scored higher on the item, conditional on their disability level.
As Wolffsohn et al. suggested, another plausible explanation is the nature of the relationship between the patient and the significant other who assisted with administration. The significant other may have (unconsciously) conveyed his/her personal opinion, or the patient's perception of the characteristics of the significant other may have prompted a socially-desirable response. Furthermore, DIF on item 1 'Vision in general' between women and men was caused by a lack of responses in the highest response category.
There was a higher response to item 24 'Using tools' (e.g. using a hammer or threading a needle) of the "Reading and fine work" dimension by women than by men, conditional on the disability level. Because the difference in expected item scores was sufficiently large, and along a relatively large part of the disability continuum, it was decided to remove item 24.
A consequence of removing a differentially functioning item is that the psychometric quality of the measurement of the underlying construct improves, i.e. of vision-related quality of life and in particular of the "Adjustment" and "Reading and fine work" dimensions. The four and seven remaining items on those dimensions, respectively, fit the GRM and DIF resolved for item 12 'Unhappy with situation in life'. Item 19 'Reading labels' continued to have DIF, but the difference in expected scores was small. The decision to remove an item with DIF is usually based on the difference in logits. A problem with polytomous item responses is that the difference in logits may vary for every threshold parameter, making the magnitude of DIF difficult to assess. Therefore, the difference in expected item scores was perceived as a helpful interpretation of the DIF magnitude. Another consequence of improvement of the dimensions "Reading and fine work" and "Adjustment" might be that item invariance across occasions can be assumed. However, after removing item 24 'Using tools', the assumption of item parameter invariance across time points could still not be maintained for the "Reading and fine work" dimension (data not shown). Consequently, further investigation and confirmation in other longitudinal studies may be necessary. In contrast, after removing item 1 'Vision in general', item invariance was assured across occasions for the "Adjustment" dimension, indicating that the outcome on this dimension can be appropriately assessed. A limitation of the present study may be that the subsets on which DIF was investigated were rather small (N < 100 in two subsets). Differences in patient characteristics found between subsets may have been caused by limited numbers of patients.
Finally, the test information curves provided insight into the separate dimensions of vision-related quality-of-life. The "Reading and fine work" and "Mobility" dimensions were most informative for differentiating between patients' disability levels in terms of vision-related quality-of-life.
The items of the LVQOL showed satisfactory item fit to the GRM; however, two items were removed because of DIF. The adapted (Dutch) LVQOL with 21 items is 'DIF-free' when relevant subgroups are considered, which means that the psychometric quality of the questionnaire has improved. Consequently, the LVQOL seems highly appropriate for use in heterogeneous populations of visually impaired patients.
RMAVN (PhD) is a psychologist and epidemiologist and has a special interest in the measurement of quality-of-life in the field of low vision. She received the Quality of Care Fellowship (200-2012) from the EMGO+ Institute for Health and Care Research. DLK (PhD) is a statistician and is specialized in psychometrics and specifically in item response theory. ML (PhD) is a human movement scientist, epidemiologist and a former occupational therapist and researcher in the field of low vision. GHMBVR (PhD) is a professor of ophthalmology and holds a chair in the field of low vision.
List of abbreviations
DIF: Differential item functioning
GRM: Graded response model
IRT: Item response theory
IRTLRDIF: Item response theory likelihood-ratio tests for differential item functioning
LVQOL: Low Vision Quality Of Life questionnaire
WLSMV: Weighted least squares with mean and variance correction.
Financial support was provided by: ZonMw-Inzicht (Netherlands Organisation for Health Research and Development-Insight Society, The Hague, Grant no. 943-03-017).
1. Stelmack J: Quality of life of low-vision patients and outcomes of low-vision rehabilitation. Optom Vis Sci. 2001, 78: 335-342. 10.1097/00006324-200105000-00017.
2. Klaver CC, Wolfs RC, Vingerling JR, Hofman A, de Jong PT: Age-specific prevalence and causes of blindness and visual impairment in an older population: the Rotterdam Study. Arch Ophthalmol. 1998, 116: 653-658.
3. McCabe P, Nason F, Demers TP, Friedman D, Seddon JM: Evaluating the effectiveness of a vision rehabilitation intervention using an objective and subjective measure of functional performance. Ophthalmic Epidemiol. 2000, 7: 259-270. 10.1076/opep.7.4.259.4173.
4. de Boer MR, Twisk J, Moll AC, Volker-Dieben HJM, de Vet HCW, van Rens GHMB: Outcomes of low vision services using optometric and multidisciplinary approaches: a non-randomized comparison. Ophthalmic Physiol Opt. 2006, 26: 535-544. 10.1111/j.1475-1313.2006.00424.x.
5. van Nispen RMA, Knol DL, Langelaan M, de Boer MR, Terwee CB, van Rens GHMB: Applying multilevel item response theory to vision-related quality of life in Dutch visually impaired elderly. Optom Vis Sci. 2007, 84: 710-720. 10.1097/OPX.0b013e31813375b8.
6. Birk T, Hickl S, Wahl HW, Miller D, Kämmerer A, Holz F, Becker S, Völcker HE: Development and pilot evaluation of a psychosocial intervention program for patients with age-related macular degeneration. Gerontologist. 2004, 44: 836-843. 10.1093/geront/44.6.836.
7. Reeves BC, Harper RA, Russell WB: Enhanced low vision rehabilitation for people with age related macular degeneration: a randomised controlled trial. Br J Ophthalmol. 2004, 88: 1443-1449. 10.1136/bjo.2003.037457.
8. Hinds A, Sinclair A, Park J, Suttie A, Paterson H, Macdonald M: Impact of an interdisciplinary low vision service on the quality of life of low vision patients. Br J Ophthalmol. 2003, 87: 1391-1396. 10.1136/bjo.87.11.1391.
9. Finger R, Fleckenstein M, Holz F, Scholl H: Quality of life in age-related macular degeneration: a review of available vision-specific psychometric tools. Qual Life Res. 2008, 17: 559-574. 10.1007/s11136-008-9327-4.
10. van Nispen RMA, de Boer MR, van Rens GHMB: Additional psychometric information and vision-specific questionnaires are available for age-related macular degeneration. Qual Life Res. 2009, 18: 65-69. 10.1007/s11136-008-9425-3.
11. de Boer MR, Moll AC, de Vet HCW, Terwee CB, Volker-Dieben HJM, van Rens GHMB: Psychometric properties of vision-related quality of life questionnaires: a systematic review. Ophthalmic Physiol Opt. 2004, 24: 257-273. 10.1111/j.1475-1313.2004.00187.x.
12. Wolffsohn JS, Cochrane AL: Design of the low vision quality-of-life questionnaire (LVQOL) and measuring the outcome of low-vision rehabilitation. Am J Ophthalmol. 2000, 130: 793-802. 10.1016/S0002-9394(00)00610-3.
13. de Boer MR, de Vet HCW, Terwee CB, Moll AC, Volker-Dieben HJM, van Rens GHMB: Changes to the subscales of two vision-related quality of life questionnaires are proposed. J Clin Epidemiol. 2005, 58: 1260-1268. 10.1016/j.jclinepi.2005.04.007.
14. de Boer MR, Terwee CB, de Vet HCW, Moll AC, Volker-Dieben HJM, van Rens GHMB: Evaluation of cross-sectional and longitudinal construct validity of two vision-related quality of life questionnaires: the LVQOL and VCM1. Qual Life Res. 2006, 15: 233-248. 10.1007/s11136-005-1524-9.
15. van Nispen RMA, Knol DL, Neve JJ, van Rens GHMB: A multilevel item response theory model was investigated for longitudinal vision-related quality of life data. J Clin Epidemiol. 2010, 63: 321-330. 10.1016/j.jclinepi.2009.06.012.
16. Reeve BB, Hays RD, Chang C-H, Perfetto EM: Applying item response theory to enhance health outcomes assessment. Qual Life Res. 2007, 16: 1-3.
17. Massof RW: An interval-scaled scoring algorithm for visual function questionnaires. Optom Vis Sci. 2007, 84: 689-704.
18. Langelaan M, van Nispen RMA, Knol DL, Moll AC, de Boer MR, Wouters B, van Rens GHMB: Visual Functioning Questionnaire: reevaluation of psychometric properties for a group of working-age adults. Optom Vis Sci. 2007, 84: 775-784. 10.1097/OPX.0b013e3181334b98.
19. Lamoureux E, Pesudovs K, Pallant J, Rees G, Hassell JB, Caudle LE, Keeffe JE: An evaluation of the 10-item Vision Core Measure 1 (VCM1) scale (the Core Module of the Vision-related Quality of Life scale) using Rasch analysis. Ophthalmic Epidemiol. 2008, 15: 224-233. 10.1080/09286580802256559.
20. Embretson S, Reise S: Item response theory for psychologists. 2000, Mahwah, NJ: Erlbaum
21. Tutz G: Sequential item response models with an ordered response. Brit J Math Stat Psychol. 1990, 43: 39-55. 10.1111/j.2044-8317.1990.tb00925.x.
22. van Engelenburg G: On psychometric models for polytomous items with ordered categories within the framework of item response theory. 1997, University of Amsterdam, the Netherlands
23. Akkermans LMW: Studies on statistical models for polytomously scored test items. 1998, University of Twente, the Netherlands
24. Skrondal A, Rabe-Hesketh S: Generalized latent variable modeling: multilevel, longitudinal, and structural equation models. 2004, London, UK: Chapman & Hall, 113-
25. Sass DA, Schmitt TA, Walker CM: Estimating non-normal latent trait distributions within item response theory using true and estimated item parameters. Appl Meas Educat. 2008, 21: 65-88. 10.1080/08957340701796415.
26. Orlando Edelen M, Reeve BB: Applying item response theory (IRT) modeling to questionnaire development, evaluation, and refinement. Qual Life Res. 2007, 16: 5-18. 10.1007/s11136-007-9198-0.
27. Crane P, van Belle G, Larson E: Test bias in a cognitive test: differential item functioning in the CASI. Stat Med. 2004, 23: 241-256. 10.1002/sim.1713.
28. Teresi J, Fleishman J: Differential item functioning and health assessment. Qual Life Res. 2007, 16: 33-42. 10.1007/s11136-007-9184-6.
29. Reeve BB, Hays RD, Bjorner JB, Cook KF, Crane PK, Teresi JA, Thissen D, Revicki DA, Weiss DJ, Hambleton RK, Liu H, Gershon R, Reise SP, Lai J, Cella D: Psychometric evaluation and calibration of health-related quality of life item banks. Plans for the Patient-Reported Outcomes Measurement Information System (PROMIS). Med Care. 2007, 45: S22-S31. 10.1097/01.mlr.0000250483.85507.04.
30. Samejima F: Estimation of latent ability using response pattern of graded scores. 1969, Psychometric Monograph Supplement No 17: Richmond, VA: William Byrd Press
31. Thissen D: MULTILOG™ User's guide. Multiple, categorical item analysis and test scoring using item response theory. 1991, Chicago: Scientific Software Inc.
32. Bjorner J, Christensen K, Orlando M, Thissen D: Testing the fit of item response theory models for patient reported outcomes. 2005, International Society for Quality of Life Research meeting abstracts. The QLR journal, P-151, Abstract #1676, http://www.isoqol.org/2005ConfAbstracts.pdf
33. Orlando M, Thissen D: Likelihood-based item-fit indices for dichotomous item response theory models. Appl Psychol Meas. 2000, 24: 50-64. 10.1177/01466216000241003.
34. Orlando M, Thissen D: Further examination of the performance of S-X², an item fit index for dichotomous item response theory models. Appl Psychol Meas. 2003, 27: 289-298. 10.1177/0146621603027004004.
35. Teresi J, Ocepek-Welikson K, Kleinman M, Cook KF, Crane PK, Gibbons LE, Morales LS, Orlando-Edelen M, Cella D: Evaluating measurement equivalence using item response theory log-likelihood ratio (IRTLR) method to assess differential item functioning (DIF): applications (with illustrations) to measures of physical functioning ability and general distress. Qual Life Res. 2007, 16: 43-68. 10.1007/s11136-007-9186-4.
36. Thissen D: IRTLRDIF v.2.0b: Software for the computation of the statistics involved in item response theory likelihood-ratio tests for differential item functioning. 2001, Chapel Hill, NC: L.L. Thurstone Psychometric Laboratory, University of North Carolina at Chapel Hill
37. Thissen D: IRTLRDIF software. Accessed 14 Sep 2010, http://www.unc.edu/~dthissen/dl.html
38. Langer M, Hill C, Thissen D, Burwinkle T, Varni J, DeWalt D: Item response theory detected differential item functioning between healthy and ill children in quality-of-life measures. J Clin Epidemiol. 2008, 61: 268-276. 10.1016/j.jclinepi.2007.05.002.
39. Gustafsson J: The Rasch model for dichotomous items: Theory, applications and a computer program. 1977, (Internal Rep No. 63) Institute of Education, University of Goteborg
40. Wolffsohn JS, Cochrane AL, Watt NA: Implementation methods for vision related quality of life questionnaires. Br J Ophthalmol. 2000, 84: 1035-1040. 10.1136/bjo.84.9.1035.
41. van Nispen RMA, Knol DL, Mokkink LB, Comijs HC, Deeg DJH, van Rens GHMB: Vision-related quality of life Core Measure (VCM1) showed low-impact differential item functioning between groups with different administration modes. J Clin Epidemiol. 2010, 63: 1232-1241. 10.1016/j.jclinepi.2009.12.010.
42. Schwartz N, Strack F, Hippler H, Bishop G: The impact of administration mode on response effects in survey measurement. Appl Cognitive Psychol. 1991, 5: 193-212. 10.1002/acp.2350050304.
43. Raju NS, Oshima TC: Two prophecy formulas for assessing the reliability of item response theory-based ability estimates. Educat Psychol Meas. 2005, 65: 361-. 10.1177/0013164404267289.
44. Samejima F: Estimation of reliability coefficients using the test information function and its modifications. Appl Psychol Meas. 1994, 18: 229-. 10.1177/014662169401800304.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/125/prepub