Measuring health-related quality of life in cervical cancer patients: a systematic review of the most used questionnaires and their validity

Background: Data on health-related quality of life (HRQoL) are paramount for shared and evidence-based decision-making. Since an overview of cervical cancer HRQoL tools and their validity appears to be lacking, we performed a systematic review of the use of disease-specific HRQoL instruments in cervical cancer patients and of their psychometric properties, to identify the most suitable cervical cancer-specific HRQoL tool.
Methods: We searched PubMed, EMBASE and PsycINFO from inception up to 18 October 2016 for studies on quality of life in cervical cancer patients. Data extraction and HRQoL tool identification were performed by two independent reviewers. Validation studies of the identified cervical cancer-specific HRQoL tools were retrieved and their psychometric properties were assessed using the COSMIN checklist. All cervical cancer-specific HRQoL instruments in use were scored and ranked according to their psychometric properties.
Results: We included 156 studies (20,690 patients) and identified 31 HRQoL tools. The EORTC QLQ-CX24 (35 studies; 5,556 patients) and FACT-Cx (22 studies; 4,224 patients) were the only cervical cancer-specific tools. The EORTC QLQ-CX24 had 4 out of 9 positively rated psychometric properties: internal consistency, content validity, construct validity, and agreement. Criterion validity, reliability, and interpretability scored doubtful. Responsiveness and floor and ceiling effects were not reported. The FACT-Cx had 2 out of 9 positively rated psychometric properties: internal consistency and agreement. Content validity, reliability, and interpretability scored doubtful, while criterion and construct validity scored negative. Responsiveness and floor and ceiling effects were not reported.
Conclusion: The validity of the often-used EORTC QLQ-CX24 questionnaire for cervical cancer patients remains uncertain, as 5 out of 9 psychometric properties were doubtful or not reported in current literature. Cervical cancer-specific HRQoL tools should therefore always be used in conjunction with validated generic cancer HRQoL tools until proper validity has been proven, or a more valid tool has been developed.
Electronic supplementary material: The online version of this article (doi:10.1186/s12874-016-0289-x) contains supplementary material, which is available to authorized users.


Background
Treatment of cervical cancer, the fourth most common cancer in women, consists of surgery and/or (chemo-)radiotherapy, based on the FIGO stage of the disease [1,2]. Depending on the treatment, different side effects can occur, such as bladder, bowel, and vaginal dysfunction, lymphedema, and lymphocysts [3,4]. These side effects, together with the emotional and social impact of the disease, influence a patient's health-related quality of life (HRQoL), even when survival is extended. The acceptance of medical treatments is critically dependent on these HRQoL consequences, making HRQoL one of the most important parameters in the evaluation of medical treatments.
Quality of life is a complex, multidimensional construct with a range of conceptual definitions, and is often evaluated using HRQoL tools. There is general agreement that multidimensional HRQoL assessment should at least include physical, social and psychological/emotional functioning and well-being [5]. The validity and suitability of such HRQoL tools are represented by their psychometric properties. Psychometric properties indicate whether a measurement tool is free of error (reliability), assesses what it is intended to measure (validity), and is able to detect change in an individual over time (responsiveness), as well as the degree to which one can assign qualitative meaning to quantitative scores (interpretability) [6]. Using an instrument whose psychometric properties have been evaluated and found to be good enables the user to draw more robust and substantial conclusions. Since the psychometric properties of a measurement tool can differ per target population, it is recommended that they are evaluated in that specific target population.
Others have studied the psychometrics and appropriateness of HRQoL tools in gynaecologic oncology in general [7][8][9]. However, both a clear overview of the available disease-specific HRQoL tools for cervical cancer patients and an appraisal of their psychometric properties are lacking. Vistad et al. [10] reported on the impact of cervical cancer on HRQoL, and critically appraised a number of studies regarding HRQoL measurement in this population. However, they did not report or evaluate the psychometric properties of the HRQoL instruments for cervical cancer. Furthermore, FIGO stage is important in the treatment of cervical cancer, and can result in different side effects influencing relevant aspects of HRQoL [2][3][4][11]. As such, the validity, reliability, and responsiveness of an HRQoL tool can differ per disease stage. For a valid and patient-centered evaluation of health status, it is important that the HRQoL tool measures aspects of health status that are important to patients with cervical cancer, and that the measurement characteristics are adequate for the specific patient population. Thus, we hypothesized that cervical cancer-specific HRQoL tools will provide the most valid and patient-centered evaluation of health status.
The aim of this study is therefore to provide an overview of the HRQoL tools used in cervical cancer patients, to identify cervical cancer-specific HRQoL tools, and to assess their psychometric properties. This allows for an evidence-based choice of a cervical cancer-specific HRQoL tool in both clinical practice and clinical trials.

Data source and search
We systematically searched EMBASE, PubMed, and PsycINFO from inception up to 18 October 2016 for studies on quality of life assessment in cervical cancer patients. The search strategy combined synonyms for cervical cancer, questionnaires and quality of life; see Additional file 1 for the complete search strategy. All citations were imported into the bibliographic database of EndNote X5 (Thomson Reuters, New York, NY, USA).

Study selection
After retrieving all the records in EndNote, duplicates were removed and records were screened on title and abstract for relevance by two independent reviewers (M.E.S. and C.T.). Inclusion criteria for full text assessment were: (1) assessing quality of life in (2) patients with cervical cancer using (3) an HRQoL tool [5], and (4) availability of a full text (5) peer reviewed article. For example, studies focusing on a single quality of life domain were excluded, as the concept of HRQoL is multidimensional. Whenever a full text questionnaire was not available, the corresponding authors were contacted for a copy in order to assess whether the questionnaire met the HRQoL definition. In case of disagreement, a third reviewer was consulted (M.M.R.). All studies that were included from the systematic review were documented as supplemental references and carry the prefix 's', followed by the respective reference number.

Data extraction, synthesis and analysis
Two independent reviewers (M.E.S. and C.T.) extracted the following data: HRQoL tool, number of cervical cancer patients, and their respective FIGO stage (Additional file 2). An overview was made of all used HRQoL tools with the number of included patients. A distinction was made between HRQoL tools for the following domains: generic HRQoL, HRQoL for cancer in general, cervical cancer-specific HRQoL, other cancer-specific HRQoL, and other non-cancer but disease or symptom specific HRQoL tools. Depending on the data presentation, we defined the following stages as early stage cervical cancer: stage I; or stage IA, IB and IIA; or stage IA1 + 2, IB1 and IIA1.
When two or more HRQoL tools were used in one study, each HRQoL tool was included either in combination or as a separate tool, based on its use. Thus, the reported number of studies and/or patients can exceed the overall number of studies and/or patients included from the systematic search. In case of disagreement, a third reviewer was consulted (M.M.R.).

Psychometric property assessment
As we hypothesized that cervical cancer-specific HRQoL tools will provide the most valid and patient-centered evaluation of health status, we only assessed psychometric properties of the identified cervical cancer-specific HRQoL tools. Psychometric property assessment of the cervical cancer-specific HRQoL tools was based on all available studies in which one or more psychometric properties of the tool were assessed and reported for cervical cancer patients. These studies were identified through the references of studies that were already included after the first search for HRQoL tools used in cervical cancer patients and by searching the official website of the specific HRQoL tool. Furthermore, we searched EMBASE, PubMed, and PsycINFO using a search strategy that combined synonyms and terms for cervical cancer, validation studies/psychometrics and quality of life (Additional file 3). We also performed a reference and related article search.
The psychometric properties were assessed according to the COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments) criteria published by Terwee et al. [12], including content validity, internal consistency, criterion validity, construct validity, reproducibility (agreement, reliability), responsiveness, floor and ceiling effects, and interpretability. A scoring model was used in which each psychometric property was given a positive (+), doubtful (?), or negative (-) rating [13]. If more than one validation study assessed the same psychometric property, the best rating was used, as recommended by the COSMIN protocol. Unfortunately, there are no methods available to pool results on psychometric property testing from different validation studies while taking their underlying methodological quality (weight) into account. We therefore reported all ratings in order to provide a clear overview of the best rating and the variation between validation studies for each psychometric property. If no information was found on a psychometric property, it was not assessable and was scored with an "X". See Additional file 4 for the definition of the psychometric properties and their scoring criteria.
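The "best rating" rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ordering "+" > "?" > "-" > "X" is our assumption (the text only says the best rating was used), the function names are hypothetical, and the example ratings are invented rather than taken from the validation studies.

```python
# Assumed ordering of ratings, best to worst: "+" > "?" > "-" > "X".
# The text states that the best rating was used; the exact ordering
# between "?" and "-" is an assumption made for this sketch.
RANK = {"+": 3, "?": 2, "-": 1, "X": 0}


def best_rating(ratings):
    """Return the best rating for one property across validation studies."""
    if not ratings:
        return "X"  # no study reported the property: not assessable
    return max(ratings, key=lambda r: RANK[r])


def summarize(study_ratings):
    """Map each property to (best rating, all ratings).

    Keeping all ratings next to the best one preserves the variation
    between validation studies, as the text describes.
    """
    return {prop: (best_rating(rs), list(rs)) for prop, rs in study_ratings.items()}


# Hypothetical ratings from several validation studies (illustrative only).
example = {
    "internal consistency": ["?", "+", "+"],
    "criterion validity": ["?", "-"],
    "responsiveness": [],
}

summary = summarize(example)
for prop, (best, all_ratings) in summary.items():
    print(prop, best, all_ratings)
```

Reporting the full list next to the best rating avoids hiding disagreement between studies, which matters here because the ratings cannot be pooled.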
The ratings were not used for a total sum score per HRQoL tool, as each individual psychometric property can have its own weight regarding the quality and the suitability of the cervical cancer-specific HRQoL tools [12].

Results
Figure 1 provides an overview of the literature search and study selection. Our literature search yielded 2184 unique records, of which 320 remained after screening titles and abstracts. The full text of these studies was reviewed for eligibility. Studies were excluded for the following reasons: not including an HRQoL assessment (89), duplicates (23), non-cervical cancer patients (21), validation study (13), review (12), cost-effectiveness study (4), and no full text copy of the questionnaire available (2). This yielded 156 studies (20,690 patients) using 31 different HRQoL tools. See Additional file 5 with the supplemental references for a list of all included studies and their respective reference number, with the prefix 's'.
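As a quick consistency check, the study-flow numbers reported above can be added up. The snippet below uses only the counts stated in the text; the variable names are ours.

```python
# Counts taken from the study selection described in the text.
full_text_assessed = 320

exclusions = {
    "no HRQoL assessment": 89,
    "duplicates": 23,
    "non-cervical cancer patients": 21,
    "validation study": 13,
    "review": 12,
    "cost-effectiveness study": 4,
    "no full text copy of the questionnaire": 2,
}

total_excluded = sum(exclusions.values())
included = full_text_assessed - total_excluded

print(f"excluded: {total_excluded}, included: {included}")
```

The exclusions sum to 164, which leaves the 156 included studies reported in the text.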
The two cervical cancer-specific HRQoL tools used were the EORTC QLQ-CX24 and FACT-Cx, which were used in 35 and 22 studies including 5,556 and 4,224 patients, respectively. In 3 studies (n = 574 patients) the EORTC QLQ-CX24 was combined with another, mostly more generic, HRQoL tool (Table 2); the QLQ-C30 core questionnaire did not count as a combination. The FACT-Cx was combined with another HRQoL tool in 8 studies (n = 1,915 patients); the FACT-G core questionnaire did not count as a combination. The proportion of early FIGO stage cervical cancer patients was 68% for the EORTC QLQ-CX24, as compared to 29% for the FACT-Cx.
Table 3 shows the psychometric properties of the EORTC QLQ-CX24 and FACT-Cx. For the EORTC QLQ-CX24 and FACT-Cx, 7 and 3 validation studies were available, respectively, with study populations ranging from 100 to 860 patients.

EORTC QLQ-CX24
There is positive evidence regarding content validity, construct validity, internal consistency, and agreement: the items in the questionnaire were selected through an extensive process involving both patients and investigators; the scores of the tool were related to treatment status as hypothesized prior to the study; Cronbach's α was above 0.70; and the test-retest showed an ICC between 0.85 and 0.89. The criterion validity and reliability are uncertain, as a questionable reference standard was used to validate the tool's score and/or inappropriate statistical methods were used. These methods included calculating a Cohen's d, Kruskal-Wallis, Mann-Whitney, Wilcoxon, Student's t-test, or ANOVA with a p-value between the subgroups to prove that patients in subgroups, such as early and advanced FIGO stage or treatment status, could be distinguished from each other based on the tool's score. The interpretability of the EORTC QLQ-CX24 score is limited, as a minimal important change was not defined. Responsiveness and floor and ceiling effects were also not assessed in any of the validation studies. The scoring model resulted in 4 positive, 3 doubtful, and 2 not assessable psychometric properties for the EORTC QLQ-CX24, out of a maximum of 9.
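For readers less familiar with the internal consistency criterion, Cronbach's α for a scale of k items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score). The sketch below implements that standard formula with the Python standard library. The response data are hypothetical and are not taken from any validation study; only the 0.70 cut-off comes from the text.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 in the denominator) throughout.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*responses))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of five respondents on a three-item subscale.
answers = [
    [1, 2, 2],
    [2, 2, 3],
    [3, 3, 3],
    [3, 4, 4],
    [4, 4, 4],
]

alpha = cronbach_alpha(answers)
print(f"alpha = {alpha:.2f}, meets 0.70 cut-off: {alpha > 0.70}")
```

A high α only shows that the items of a subscale move together; it says nothing about whether the subscale measures the intended construct, which is why the other properties are rated separately.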

FACT-Cx
There is positive evidence regarding internal consistency and agreement, as items per (sub)scale were correlated with a Cronbach's α above 0.70 and the test-retest showed an ICC between 0.68 and 0.84. However, there is no evidence available regarding content validity, i.e. it is unclear how the questionnaire was built and how the specific questions were selected. The reliability of the FACT-Cx is uncertain as, again, inappropriate statistical methods were used to prove that patients in subgroups, such as early and advanced FIGO stage or treatment status, could be distinguished from each other based on the tool's score. The interpretability of the FACT-Cx score is limited, as a minimal important change was not defined. Responsiveness and floor and ceiling effects were also not assessed in any of the validation studies. The criterion and construct validity were limited, as the correlation α with the studied reference standard (SF-36) was below 0.70 and fewer than 75% of the theoretically derived hypotheses on how the questionnaire scores would relate to other measures were confirmed.
The scoring model resulted in 2 positive, 3 doubtful, 2 not assessable, and 2 negative psychometrics for the FACT-Cx, out of a maximum of 9.
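The counts for both tools can be reproduced from the per-property ratings reported in this review. The snippet below is a simple tally for illustration; the dictionary layout is our own, and the counts are descriptive only, not a sum score.

```python
from collections import Counter

# Per-property ratings as reported in the text:
# "+" positive, "?" doubtful, "-" negative, "X" not reported.
ratings = {
    "EORTC QLQ-CX24": {
        "content validity": "+",
        "internal consistency": "+",
        "criterion validity": "?",
        "construct validity": "+",
        "agreement": "+",
        "reliability": "?",
        "responsiveness": "X",
        "floor and ceiling effects": "X",
        "interpretability": "?",
    },
    "FACT-Cx": {
        "content validity": "?",
        "internal consistency": "+",
        "criterion validity": "-",
        "construct validity": "-",
        "agreement": "+",
        "reliability": "?",
        "responsiveness": "X",
        "floor and ceiling effects": "X",
        "interpretability": "?",
    },
}

# Count how often each rating occurs per tool.
tallies = {tool: Counter(props.values()) for tool, props in ratings.items()}

for tool, tally in tallies.items():
    print(tool, dict(tally))
```

Comparing the two dictionaries property by property is more informative than comparing the tallies, since each property can carry a different weight depending on the intended use of the tool.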
All validation studies included early and advanced stage cervical cancer patients, and patients in these subgroups could be distinguished based on their overall scores. However, the psychometric properties per subgroup were not reported and were thus not assessable for either the EORTC QLQ-CX24 or the FACT-Cx.

Discussion
The FACT-Cx and EORTC QLQ-CX24 were the identified cervical cancer-specific HRQoL tools, which were used in 22 and 35 out of 156 studies, respectively. The EORTC QLQ-CX24 appears to be the most used and a more appropriate tool to assess HRQoL in cervical cancer patients. However, its validity is uncertain since 5 out of 9 psychometric properties are doubtful or not reported in current literature. For example, no correlation was found between the performance of the tool and a reference standard, the minimal important change that should be detected was not defined, and floor and ceiling effects were not reported. The validity of the FACT-Cx is even more uncertain, as 7 out of 9 psychometric properties were doubtful, negative, or not reported at all. Problems similar to those with the EORTC QLQ-CX24 were reported: no correlation was found between the performance of the tool and a reference standard, the minimal important change that should be detected was not defined, and floor and ceiling effects were not reported. For the FACT-Cx, however, there was also no description of how the questionnaire and its items were selected, hypotheses regarding the scores were not confirmed, and it remained unclear whether repeated measurements over a longer period of time can detect a (relevant) change in quality of life. Thus, the EORTC QLQ-CX24 has been more thoroughly assessed regarding its psychometric properties and scored better (both regarding the number of positively rated psychometric properties and the score per psychometric property) when compared to the FACT-Cx.
To our knowledge, this is the first systematic review that identified the different HRQoL tools that have been used to assess quality of life in patients with cervical cancer. We used the COSMIN checklist for a thorough evaluation of the quality of the two cervical cancer-specific HRQoL tools. We have provided evidence that the EORTC QLQ-CX24 is the most appropriate and valid cervical cancer-specific HRQoL tool for a patient-centered evaluation of health status of cervical cancer patients in general.
Our study has a few limitations. First, one possible HRQoL tool, the SES-QOL, could not be included as there was no full text copy of the tool available. Despite repeated requests, we did not receive a full text copy and had to exclude this tool from our results, as we could not assess whether it meets the HRQoL definition. On the other hand, as the SES-QOL is not a cervical cancer-specific HRQoL tool, it would not have influenced our final conclusion. Second, almost all psychometric properties depend strongly on the methods and design of the validation study. For instance, a property such as construct validity could be rated positively if only one hypothesis was tested and confirmed (>75% confirmed), but negatively if another hypothesis was tested and rejected (<75% confirmed). The rating of an HRQoL tool's validity could therefore also reflect the design of the validation study rather than the actual validity. Regardless, the uncertainty surrounding the validity of cervical cancer-specific HRQoL tools remains, and more evidence is needed to reduce this uncertainty.
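The sensitivity of the construct validity rating to study design can be made concrete with a small sketch. The function below is a simplification for illustration: it applies only the 75% rule mentioned above, treats exactly 75% as positive (an assumption on our part), ignores the "doubtful" rating for flawed designs, and uses a hypothetical function name.

```python
def construct_validity_rating(confirmed, tested, threshold=0.75):
    """Rate construct validity from the share of confirmed hypotheses.

    Simplified sketch: "+" if at least `threshold` of the tested hypotheses
    were confirmed, "-" otherwise, "X" if no hypotheses were tested.
    Treating exactly 75% as positive is an assumption of this sketch.
    """
    if tested == 0:
        return "X"
    return "+" if confirmed / tested >= threshold else "-"


# A study testing a single hypothesis flips between the extremes.
print(construct_validity_rating(1, 1))  # one hypothesis, confirmed
print(construct_validity_rating(0, 1))  # one hypothesis, rejected

# A study testing many hypotheses is far less sensitive to a single result.
print(construct_validity_rating(9, 10))
print(construct_validity_rating(8, 10))
```

With one hypothesis the rating rests entirely on a single result, whereas with ten hypotheses one additional rejection does not change it, which is the design dependence described in the limitation above.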
Do note that the scoring model that we applied cannot be used to calculate an 'overall' score as the weight of each individual psychometric property can differ per specific design, application and/or study population, e.g. construct validity, reliability for discriminating between different (sub)groups, and responsiveness for the evaluation of treatment effects [12]. Thus, an external comparison on validity across HRQoL tools is only possible per psychometric property and not on an 'overall' score.
Third, the psychometric properties in the identified studies were often not reported, incorrect or unclear, and the terminology and definitions differed from those proposed by Terwee et al. [12]. Regardless, this is the most up-to-date overview of the currently available literature, and it should be noted that the absence of evidence on the above-mentioned psychometric properties, whether due to unreported data or inappropriate design/methods, is not to be confused with evidence of their absence.
Based on our results, the use of only a cervical cancer-specific HRQoL tool is not preferred, since its validity remains to be proven. We therefore recommend always using a well-validated generic HRQoL tool as well, to ascertain the most valid and patient-centered evaluation of health status, both in clinical practice and clinical trials. In addition, by using already well-validated generic HRQoL tools in combination with one of the cervical cancer-specific tools, researchers will be able to properly assess the psychometric properties of the FACT-Cx and EORTC QLQ-CX24. Another option could be to develop a new and more valid HRQoL tool. However, this may be redundant, as there remains uncertainty regarding the validity of the already available HRQoL tools, and the absence of evidence on their validity should not be confused with evidence that valid tools are absent. For both the validation of current HRQoL tools and the development of a new tool, we recommend using an established protocol, such as the quality criteria presented by Terwee et al. [12]. A data presentation with an assessment of psychometric properties for both early and advanced stage cervical cancer is warranted, as their treatment and subsequent possible side effects differ distinctly [14][15][16][17][18][19][20][21].

Conclusion
The validity of the often-used EORTC QLQ-CX24 questionnaire for cervical cancer patients remains uncertain, since 5 out of 9 psychometric properties were doubtful or not reported in current literature. Cervical cancer-specific HRQoL tools should therefore always be used in conjunction with validated generic cancer HRQoL tools until proper validity has been proven, or a more valid tool has been developed.