The impact of disease-related impairments on disability and health-related quality of life: a systematic review

Background To investigate the interchangeability of measures of disability and health-related quality of life (HRQL) by comparing their associations patterns with disease-related impairment measures in patients with a variety of conditions. Methods A systematic literature search of MEDLINE, EMBASE, Web of Science and a hand search of reference lists through January 2006. Studies were included if they reported associations patterns between impairment and disability and between impairment and HRQL. Correlation coefficients were transformed to Fisher's z effect size (ES(z)). Weighted averages were reported as pooled ES(z) with 95% confidence intervals (CI). Results The relationship between impairment and disability was stronger (pooled ES(z) = 0.69; 95% CI, 0.66 – 0.72) than between impairment and HRQL (pooled ES(z) = 0.38; 95% CI, 0.36 – 0.41). The physical component score (pooled ES(z) = 0.43; 95% CI, 0.39 – 0.47) and disease-specific HRQL (pooled ES(z) = 0.46; 95% CI, 0.40 – 0.51) were stronger associated with impairments than the mental component score (pooled ES(z) = 0.28; 95% CI, 0.20 – 0.36) and generic HRQL (pooled ES(z) = 0.36; 95% CI, 0.33 – 0.39). Conclusion This study shows measures of disability and different HRQL domains were not equally related to impairment. Patient's impairments are better reflected in disability measures, than in HRQL instruments. There are many outcomes of interest and precisely defining them and measuring them will improve assessing the impact of new interventions.


Background
Choosing an outcome measure for use in clinical research is a complex process. If the outcomes are chosen inappropriately, a study may provide unreliable results [1]. Frequently used outcomes measures are mortality rates, number of events (recurrent myocard infarction) or disease activity (lesion load on MRI).
Besides biological measures other levels of clinical measurement can be considered in clinical studies: impairments, disability and health-related quality of life (HRQL). Impairments are the direct organic manifestations of the disease such as consciousness and paresis [2,3]. Disability can be defined as limitations in carrying out activities of daily living, such as self-care, mobility and activities inside or outside the home [2,3]. HRQL refers to a broad spectrum of consequences of disease. Although this concept also includes elements of impairments and disabilities, it has a strong focus on patient's social functioning and perceived health status and well being [4,5]. Hence, impairment measures are closely related to the patient's disorder, whereas the other outcome measures focus on the patient's level of functional health. Each subsequent level of clinical measurement is increasingly less disease-specific and more relevant to the patient.
Recently, patient-relevant outcomes in terms of disability and HRQL have become more important [6]. Although both functional outcomes reflect the consequences of diseases on personal level, they are conceptually different and not synonymous as is often thought by clinical researchers. Consequently, the decision to use one of these measures can have important implications for the interpretation of the study results. The objective of this systematic review is to investigate the interchangeability of measures of disability and HRQL by comparing their association patterns with disease-related impairment measures.

Study selection
A set of explicit criteria, composed by three investigators (NW, RdH, MV), were used for selection of the literature. Articles were included if they focused on methodological or metric aspects of patient-based outcomes (for example, methods of evaluating such measures, psychometric or clinimetric assessment of measures, comparative studies of measures).
Studies were included in which both the association between impairment and disability, and between impair-ment and HRQL were calculated by means of correlation coefficients or other association measures, regardless of the disease with the exception of psychiatric disorders. Hence, to improve the comparability between the correlation patterns between the different health concepts, only studies were included in which the three types of health outcomes were assessed in the same patient population. The measurement instruments were questionnaires or observation lists. When a multi-scale questionnaire focuses on more than one concept we only included the sub-scale for the health domain in question. In case of double publication we selected the first publication in time. Excluded were studies which valuated HRQL in terms of utilities [7] or composite scores of conceptual different measures.

Data extraction
All data were independently abstracted by two investigators (NW, RdH) through use of a list of the criteria composed by three investigators. After study selection, measures were categorized according the definitions of the WHO International Classification of Impairments, Disabilities, and Handicap version [2] and the revised International Classification of Functioning [3]. Outcome measures were classified into one of the following categories: impairment, disability, and HRQL. The latter was further categorized into generic HRQL, disease-specific HRQL, mental component of HRQL and physical component of HRQL. When there was doubt on the classification of the health domain a third reviewer (MV) was consulted.
A first draft synthesizing the data was produced by the first author of this review (NW) and extensively criticized by two other authors. The following information was abstracted: author, year of publication, sample size, disease, type of impairment and disability scales, distinction between generic or disease-specific and mental or physical HRQL, and the associations with impairment. Studies reporting repeated measures only baseline data was abstracted. Disagreements in abstracting were solved by discussion.
ES(z) = 1/2 log e [1+r/ 1-r] were the asymptotic variance was: v ES(z) = 1/n-3 Mean ES(z) were calculated for studies reporting multiple correlations between the concepts studied [9]. The ES(z) were combined using a fixed or random effects model, when appropriate. The weight for each study was inversely proportional to the square root of the number of patients included in the study. The methods for obtaining the pooled ES(z) and its 95% confidence interval (CI) are described in detail elsewhere [10]. We labeled the strength of the ES 0.20 as small, 0.50 as medium and 0.80 as large [11]. The pooled ES(z) were converted back to a correlation coefficient [12]: in which we labeled the strength of the association; absolute values of r, 0.00-0.19 is regarded as very weak, 0.20-0.39 as weak, 0.40-0.59 as moderate, 0.60-0.79 as strong and 0.80-1.00 as very strong correlation [13]. The backward transformed correlation coefficients were also expressed in variance explained (r 2 ). The square of the correlation coefficient tells us what proportion of the variance of one variable can be explained by the linear correlation with the other.
Statistical heterogeneity was measured using the Cochrane Q statistics; p ≤ 0.10 was considered representative of significant statistical heterogeneity. To establish the effect of heterogeneity between studies on meta-analysis conclusions, we performed several subgroup analyses: impairment in relation to generic or disease-specific HRQL and impairment versus the physical or mental aspects of HRQL.
We constructed normal quantile plots to assess whether the ES(z) estimates are (approximately) normal distributed and to search for publication bias [14]. With regard to the presence of publication bias we also used the Spearman's rho rank correlation method [12] to test the relationship between the standardized ES(z) and sample size (p < 0.05 indicates significant publication bias) and a failsafe number approach. A fail-safe number is the number of non-significant, unpublished, or missing studies that would need to be added to the meta-analysis in order to change the results of the analysis. If this number is relative large to the number of observed studies, the observed results, even with some publication bias, can be treated as a reliable estimate of the true effect [15]. All analyses were performed in Meta-Win, version 2.0.

Study characteristics
Our initial search yielded 434 potentially literature citations ( Figure 1). Of these, 125 were excluded based on title (70 were investigating a psychiatric disorder, 44 were review articles, nine articles measured caregiver burden, one article reported data about a composite score and one reported HRQL in terms of utilities). Abstracts from 309 articles were retrieved and additional 188 articles were excluded (134 studies did not measure all three health domains, 49 did not report concrete associations between the health domains, three were investigating a psychiatric disorder, one reported data about a composite score and one reported HRQL in terms of utilities), leaving 121 articles for full review. Of these, 90 were excluded (44 did not report association patterns, 34 articles measured not all three domains, in five articles impairment measures were no scale score, four articles used composite scores, one reported utility based HRQL, one concerned a proxy measure and one a double publication).
A total of 31 studies, with an average sample size of 401 (range 38 -6497) persons, were found to conform to our inclusion criteria   (Table 1). Seven studies reported data on outcome measures in patients with arthritis, six on Multiple Sclerosis, three on Parkinson's disease, two on stroke, two on spinal cord injury, four on a variety of joint diseases (knee, patella, elbow, shoulder), one cervical dystonia, one diabetic neuropathy, one facial nerve paralysis, one AIDS, one chronic dysphonia, one chronic obstructive pulmonary disease, and one on psoriasis.
The studies included a wide range of disease-related impairment measures, as well as a variety of disability measures (  [37,42] the mental health dimension was considered as mental HRQL. Other scales that were considered as a mental component of HRQL were the psychosocial dimension of the Sickness Impact Profile (SIP) [21,22,34], psychological well-being and the General Health Questionnaire (GHQ) [26], the mental dimension of the Multiple Sclerosis HRQL scale (MSQL) [36], the social well-being dimension of the Facial Disability Index (FDI) [28], the Overall Rating of Life satisfaction (ORLS) [20], and the emotional subscale of the Voice Handicap Index (VHI) [43].
Overall the studies included 86 correlations between impairment and disability and 104 correlations between impairment and HRQL. Five studies reported only one single correlation [24,25,28,33,44], for the remaining studies mean ES(z) were calculated for each health domain. Table 2 presents effect sizes of the 31 studies included. Thirteen studies (42%) reported large ES(z) between impairment and disability (≥ 0.81), whereas 81% of the ES(z) related to HRQL was small to moderate (≤ 0.50). The ES(z) between impairment and the different types of HRQL were on average small to moderate, except for disease-specific HRQL; 50% of the studies reported moderate to large ES(z) (0.51 -0.80).

Data synthesis
Review of the normal quantile plots (Figure 2 and 3) showed that one study [46] substantially deviated from normality. Removal of this study revealed statistical homogeneity of the datasets for the association between impairment and disability (Q statistics, p = 0.34), and between impairment and HRQL (Q statistics, p = 0.15).

Publication bias
Review of normal quantile plots showed no clear indications for publication bias in both the analyses for the relationship impairment and disability and for impairment and HRQL (Figure 2 and 3). In addition, publication bias was not evident when the Spearman's rho rank correlation method was used (both p > 0.50). The fail-safe calculation by Rosenthal's method retrieved for both analyses a large number of studies (n = 27077/n = 9755) needed to be added in order to change the results of the meta-analysis.

Discussion
In this study we investigated the impact of impairments on patients' functional health in a spectrum of diseases. The results of the meta-analysis show that impairments explain 36% of the variance of the disability scores and 13% of the variance of HRQL scores. Impairment scales account for 17% of the physical dimension of HRQL, but can explain no more than 7% of the psychological aspects of HRQL [47].
Due to the great variability in disease-related impairment scales and disability and HRQL instruments used in patient groups with a wide range of disorders, we expected heterogeneous data sets. Interestingly, after removal of Summary description of the studies included and excluded in the meta-analysis Figure 1 Summary description of the studies included and excluded in the meta-analysis.   one study which deviated from normality, homogeneity could be demonstrated in all analyses, except when we investigated the relationship between impairment and the mental component of HRQL. Possibly, this type of HRQL is less clearly defined compared to the other types of functional outcomes.
Treatments, such as new drugs or new surgical or radiological interventions, are aimed at reducing mortality and morbidity. If these treatments reduce impairments, the beneficial effects on functional health will be far more better detectable in the patient's level of disability than in his or her level of HRQL. Moreover, if the outcome is assessed in terms of HRQL, treatment effects will be more visible in the physical component than in the mental components, and will be better reflected in the diseasespecific measures than in the generic HRQL measures. Therefore, outcome measures at the level of functioning of patients should be selected carefully when clinical studies are designed. It clearly does matter on which level the functional outcome measurement is chosen.
No doubt, HRQL scales represent an important measure from a patient's point of view, but have serious disadvantages in many clinical efficacy studies. The extent to which patients fulfill social roles and participate in their environment cannot be observed directly and depends not only on their functional ability, but also on personal traits, social circumstances and societal barriers. In particular, HRQL scores, which reflect the patient's perception of the consequences of disease, depend on numerous additional, usually psychosocial, factors other than the disease itself. This explains why the correlation between impairments and HRQL scores is weak, especially when emphasis is laid on subjective psychological scores of these scales. The subjective HRQL scores therefore are often beyond the influence of the physician. In the primary assessment of new drugs or new surgical procedures, there  29 Beaton, 1996 [18] 0.40 0. 28 Lindeboom, 1998 [30] 0.23 0. 21 Vinik, 2005 [44] 0.66 0.59 Kahn, 2001 [28] 0.87 0.78 0.78 Cleary, 1993 [20] 0.60 0.38 0. 38 Speyer, 2004 [43] 1.05 0.55 0. 44 Hu, 2005 [27] 0.46 0.45 0.29 0.60 Zachariae, 2002 [46] 0.44 0.40 are many outcomes of interest and precisely defining them and measuring them will improve assessing the impact of new interventions. In contrast, disability (mobility, self-care) seem to be a more appropriate primary end point for assessing the functional health status of patients. The ability to perform basic and complex activities of daily life is, as we showed, not only better related to the disease process itself, but is also observable and predictive of dependence [48], and forms an essential aspect of the patient's HRQL [49], an aspect which is amenable for treatment.
A shortcoming of this meta-analysis might be that there are probably more studies than the 31 we found. Our search strategy was based on the ICIDH and ICF terms and few researchers distinguish between the terms impairment, disability and HRQL. If they distinguished between these terms correlations between impairment and disability or between impairment and HRQL were given rarely for all three domains together. Another point of discussion may be the concern for publication bias. However, the normal quantile plots, Spearman's rank correlation coefficient method and fail-safe calculation method do not give an indication for this type of bias. Finally and unfortunately, we were not able to perform subgroup analysis based on the methodological quality of the studies. This because the psychometric and clinimetric articles analyzed used a hotchpotch of research designs which made it impossible to formulate unequivocal methodological criteria in terms of, for example, appropriateness of Normal quantile plot for the ES(z) between impairment and disability Figure 2 Normal quantile plot for the ES(z) between impairment and disability. The plot compares two distributions by plotting there quantiles against each other. One of the distributions is the standard normal distribution (X as) while the other is the distribution one wishes to compare it against (here the Y as is defined as ES(z) i /√ i ). The deviations from linearity indicate deviations in normality. Publication bias will tend to leave strange gaps in the plot or lead to strange curves.
the design chosen, sample size consideration, blind assessment of outcome parameters, statistical adjustment for confounders, or loss to follow-up [50].

Conclusion
New drugs or new surgical or radiological treatments are developed from a pathophysiologic perspective and are primarily directed at reducing impairments. Traditionally, biological and impairment measures are preferred. Increasingly, it becomes evident that this type of outcome is not always relevant for patients [21,51]. Our study shows that patient's impairments are clearly better reflected in disability measures, than in HRQL instruments. The relatively low association between impair-ment and HRQL is found in patients with various disorders, therefore we conclude that to assess the efficacy of a new treatment, disability is an important functional endpoint in clinical studies.