Impact of comorbidity assessment methods to predict non-cancer mortality risk in cancer patients: a retrospective observational study using the National Health Insurance Service claims-based data in Korea

Background: Cancer patients’ prognoses are complicated by comorbidities. Prognostic prediction models with inappropriate comorbidity adjustments yield biased survival estimates. However, an appropriate claims-based comorbidity risk assessment method remains unclear. This study aimed to compare methods used to capture comorbidities from claims data and predict non-cancer mortality risks among cancer patients.
Methods: Data were obtained from the National Health Insurance Service-National Sample Cohort database in Korea; 2979 cancer patients diagnosed in 2006 were considered. Claims-based Charlson Comorbidity Index was evaluated according to the various assessment methods: different periods in washout window, lookback, and claim types. The prevalence of comorbidities and associated non-cancer mortality risks were compared. The Cox proportional hazards models considering left-truncation were used to estimate the non-cancer mortality risks.
Results: The prevalence of peptic ulcer, the most common comorbidity, ranged from 1.5 to 31.0%, and the proportion of patients with ≥1 comorbidity ranged from 4.5 to 58.4%, depending on the assessment methods. Outpatient claims captured 96.9% of patients with chronic obstructive pulmonary disease; however, they captured only 65.2% of patients with myocardial infarction. The different assessment methods affected non-cancer mortality risks; for example, the hazard ratios for patients with moderate comorbidity (CCI 3–4) varied from 1.0 (95% CI: 0.6–1.6) to 5.0 (95% CI: 2.7–9.3). Inpatient claims resulted in relatively higher estimates reflective of disease severity.
Conclusions: The prevalence of comorbidities and associated non-cancer mortality risks varied considerably by the assessment methods. Researchers should understand the complexity of comorbidity assessments in claims-based risk assessment and select an optimal approach.
Supplementary Information The online version contains supplementary material available at 10.1186/s12874-021-01257-2.


Keywords: Comorbidity, Cancer, Claims data, Charlson comorbidity index, Non-cancer, Mortality, Prognosis prediction

Background
Advances in cancer diagnosis and treatment have extended the life expectancy of cancer patients and increased the number of cancer survivors. However, these survivors now face an increased risk of non-cancer-related death [1] due to comorbidities and complications associated with cancer treatments [2,3].
The prognosis prediction tools that incorporate comorbidity may be useful in facilitating clinical decisions and understanding how the patient's comorbidity affects survival outcomes further [3,4]. Therefore, the assessment of comorbid conditions and their impact on non-cancer mortality risks are critical to cancer prognostication. Further, predictive models not adjusted for comorbidities may yield biased survival estimates. Because many cancer registries do not record patients' comorbid conditions before a cancer diagnosis, administrative claims data are often considered data sources for comorbid conditions [5]. Indeed, comorbid conditions assessment from the claims data can also be used for health service planning or population health monitoring. Population-based claims data can primarily provide a more representative and comprehensive picture of the health status of cancer patients.
The Charlson Comorbidity Index (CCI) is a widely used metric that was developed to account for 19 comorbid conditions in medical records [6]. Several diagnostic coding algorithms, which use the International Statistical Classification of Diseases and Related Health Problems (ICD) codes, have been developed to extract information on CCI conditions from claims-based health care data. In 1992, Deyo et al. adapted the CCI to the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), using inpatient claims only [7]. Subsequently, the diagnostic codes were adapted to ICD-10 [8][9][10]. Of these, the diagnostic coding system devised by Quan et al. demonstrated a superior median discriminative ability to predict the overall mortality risk [11]. The ongoing shift toward delivering health care services in outpatient settings has led to increases in the prevalence of comorbidities observed therein. In 2000, Klabunde et al. developed a comorbidity index that accounted for comorbid conditions in outpatient claims and simultaneously used the rule-out algorithm to prevent the up-coding of these claims [12].
Most studies measured comorbid conditions based on practical considerations such as convenience, experience, and data availability rather than an empirical evaluation of comorbidity risks [19,20]. Further, no previous studies conducted a systematic assessment of the impact of different comorbidity assessment methods on the estimates of non-cancer mortality risks among cancer survivors. This study demonstrated the potential issues of ascertainment periods and claim types in capturing comorbid conditions using the administrative claims data. Moreover, we aimed to evaluate the effect of different comorbidity assessment methods on the prevalence of comorbidities and their associated non-cancer mortality risk.

Data sources
We used de-identified secondary data from a population-based sample cohort of 1,000,000 participants established by the National Health Insurance Service (NHIS) in Korea [24]. A representative sample cohort, comprising 2% of the total eligible Korean population in 2006, was selected randomly and followed until 2015 (10 years). The database contains data on demographics, medical aid, medical bills, medical treatments, and prescriptions, which were retrospectively collected from 2002 until 2015. The database is linked to the mortality and cause-of-death statistics provided by the Korea National Statistical Office, with follow-up through December 31, 2015.

Study population
We identified cancer patients diagnosed in 2006 using ICD-10 codes corresponding to malignant neoplasms (C00-C97). Patients with prior cancer history were excluded. The final study cohort included patients who had been continuously enrolled in health insurance for at least 3 years (2002-2005) before cancer diagnosis, to ensure comparability of comorbid condition measurements across different lookback periods (1, 2, and 3 years).

Measurement
We identified the Charlson comorbid conditions using the ICD-10 diagnostic coding system proposed by Quan et al. [8] and applied the rule-out algorithm developed by Klabunde et al. [12]. Although Klabunde et al. used a 1-month washout window period to prevent the upcoding of outpatient claims [12], different washout window periods of 0, 30, and 90 days were additionally evaluated and compared in this study. We further considered various lookback periods (1, 2, and 3 years) and claim types (inpatient claims only, outpatient claims only, and either inpatient or outpatient claims) [see Additional file 1]. A total of 27 different comorbidity assessment methods, each a combination of a washout window period, a lookback period, and a claim type, were compared to estimate comorbidity prevalence and its impact on non-cancer mortality.
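The 27 assessment methods follow from crossing three washout windows, three lookback periods, and three claim types. A minimal Python sketch of that grid is given below, together with a hypothetical helper, `claim_counts`, that decides whether a single claim contributes to a comorbidity under one method. The helper rests on assumptions not stated in this section: the lookback period is measured back from the diagnosis date, a year is taken as 365 days, and the washout window excludes claims dated within the given number of days immediately before diagnosis. The rule-out algorithm of Klabunde et al. is not reproduced here.

```python
from datetime import date, timedelta
from itertools import product

WASHOUT_DAYS = (0, 30, 90)                         # 0 corresponds to "No WP"
LOOKBACK_YEARS = (1, 2, 3)
CLAIM_TYPES = ("inpatient", "outpatient", "either")

# Every combination of washout window, lookback period, and claim type.
METHODS = list(product(WASHOUT_DAYS, LOOKBACK_YEARS, CLAIM_TYPES))


def claim_counts(claim_date, claim_type, diagnosis_date,
                 washout_days, lookback_years, claim_types):
    """Return True if a claim is eligible under one assessment method.

    Assumed window (illustrative only): claims dated on or after
    diagnosis_date - lookback and strictly before
    diagnosis_date - washout are eligible.
    """
    if claim_types != "either" and claim_type != claim_types:
        return False
    window_start = diagnosis_date - timedelta(days=365 * lookback_years)
    window_end = diagnosis_date - timedelta(days=washout_days)
    return window_start <= claim_date < window_end


if __name__ == "__main__":
    print(len(METHODS))  # number of assessment methods in the grid
    dx = date(2006, 6, 1)
    # A claim 10 days before diagnosis counts only when no washout is applied.
    print(claim_counts(date(2006, 5, 22), "outpatient", dx, 0, 2, "either"))
    print(claim_counts(date(2006, 5, 22), "outpatient", dx, 30, 2, "either"))
```

Under these assumptions, a claim recorded shortly before diagnosis is counted with No WP but dropped by the 30-day and 90-day windows, which is the mechanism the washout window is intended to provide.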

Statistical analysis
We estimated non-cancer mortality using the Cox proportional hazards model, accounting for left-truncated and right-censored data. The hazard ratios (HRs) and 95% confidence intervals (95% CIs) of non-cancer mortality were estimated. Although cancer survival studies typically use the time since cancer diagnosis, this study used patient age as the timescale to describe the impacts of comorbidities on the non-cancer mortality risk, given the left-truncated nature of the data. The left truncation occurs because patients entered the study at the time of cancer diagnosis, rather than at the start of the timeline (i.e., birth) [2]. The CCI was calculated by summing the weights of individual comorbidities derived by Charlson et al. in 1987 [6]. The scores were grouped into four categories: 0, 1-2 (mild), 3-4 (moderate), and ≥ 5 (severe); patients with a score of 0 were set as the reference group in the analysis [25]. Statistical analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA) and RStudio version 1.0.136 (R Project for Statistical Computing, Vienna, Austria). P values < 0.05 were considered statistically significant.
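The scoring and the age-based risk set described above can be illustrated with a short Python sketch. It is not the study's SAS or R code. The weight table is an illustrative subset of the 1987 Charlson weights (dementia, AIDS/HIV, and malignancy are left out here by assumption), hierarchy rules between mild and more severe forms of the same disease are not handled, and the function names `cci_score`, `cci_category`, and `in_risk_set` are hypothetical.

```python
# Illustrative subset of the 1987 Charlson weights; which conditions the
# study actually scored is an assumption of this sketch.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1,
    "cerebrovascular_disease": 1,
    "chronic_pulmonary_disease": 1,
    "rheumatic_disease": 1,
    "peptic_ulcer_disease": 1,
    "mild_liver_disease": 1,
    "diabetes_without_chronic_complications": 1,
    "diabetes_with_chronic_complications": 2,
    "hemiplegia_or_paraplegia": 2,
    "renal_disease": 2,
    "moderate_or_severe_liver_disease": 3,
}


def cci_score(conditions):
    """Sum the weights of the distinct conditions identified for a patient."""
    return sum(CHARLSON_WEIGHTS[c] for c in set(conditions))


def cci_category(score):
    """Map a CCI score to the four groups used in the analysis."""
    if score == 0:
        return "0"
    if score <= 2:
        return "1-2 (mild)"
    if score <= 4:
        return "3-4 (moderate)"
    return ">=5 (severe)"


def in_risk_set(entry_age, exit_age, event_age):
    """Left truncation on the age timescale.

    A patient contributes to the risk set at event_age only if they had
    already entered the study (age at cancer diagnosis) and had not yet
    died or been censored.
    """
    return entry_age < event_age <= exit_age


if __name__ == "__main__":
    score = cci_score(["peptic_ulcer_disease", "moderate_or_severe_liver_disease"])
    print(score, cci_category(score))
    # Diagnosed at 57.4, followed to 66.0: at risk at age 60, not at age 50.
    print(in_risk_set(57.4, 66.0, 60.0), in_risk_set(57.4, 66.0, 50.0))
```

With age as the timescale, a patient diagnosed at 57.4 years is compared only with deaths occurring after that age, which is what distinguishes the left-truncated analysis from one that treats every patient as at risk from birth.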

Demographic characteristics of cancer patients in Korea
The demographic characteristics of 2979 cancer patients (50.8% men) diagnosed in 2006 are presented in Table 1. The patients' mean age was 57.4 (standard deviation, 15.4) years. Among male patients, the most prevalent malignancy was gastric cancer (21.7%), followed by colorectal (15.3%) and liver cancers (12.1%). Among female patients, 31.3% were diagnosed with sex-specific cancers (ICD-10: C50-C63), in which the detailed ICD codes were masked. During the study period, 41.6% of male and 26.6% of female patients had died.

Comorbidity prevalence determined using assessment methods
The comorbidity prevalence that resulted from using different methods is presented (Fig. 1, Table S1). The results show peptic ulcer disease (19.1%), chronic pulmonary disease (16.3%), and mild liver disease (9.5%) as the most common conditions affecting cancer patients in general (the prevalence rates were calculated based on either inpatient or outpatient claims, a 30-day washout window, and 2-year lookback period).
The comorbidity prevalence estimates based on a 2-year lookback period and either inpatient or outpatient claims are presented with different washout window periods in Fig. 1a. The prevalence increased considerably when a washout window period was not used (No WP). However, the change in prevalence from a 90-day to a 30-day washout window was small relative to the change from a 30-day washout window to No WP. The peptic ulcer prevalence increased by 8.3 percentage points when the washout window period was not applied (from 30-day to No WP: 19.1 to 27.4%), whereas shortening the washout window period from 90 days to 30 days resulted in an increase of only 2.7 percentage points (from 90-day to 30-day: 16.4 to 19.1%).
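The increases quoted for peptic ulcer are absolute differences between two prevalence estimates. The short Python sketch below recomputes them from the figures in the text and also shows the corresponding relative change for comparison. The function name `prevalence_change` is hypothetical, and the relative changes are derived here rather than reported in the study.

```python
def prevalence_change(before_pct, after_pct):
    """Return (absolute change in percentage points, relative change in %)."""
    absolute = round(after_pct - before_pct, 1)
    relative = round(100 * (after_pct - before_pct) / before_pct, 1)
    return absolute, relative


if __name__ == "__main__":
    # Peptic ulcer prevalence, 2-year lookback, inpatient or outpatient claims.
    print(prevalence_change(19.1, 27.4))  # 30-day washout window -> No WP
    print(prevalence_change(16.4, 19.1))  # 90-day -> 30-day washout window
```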
The impact of the lookback period on the comorbidity prevalence estimates, with a 30-day washout window and either inpatient or outpatient claims, is shown in Fig. 1b. The prevalence of peptic ulcer disease increased by up to 10.0 percentage points (from 1-year to 3-year: 12.9 to 22.9%). The difference in prevalence between the 1- and 2-year lookback periods was larger than the difference between the 2- and 3-year lookback periods for all conditions except congestive heart failure and rheumatic disease.
With a 30-day washout window and a 2-year lookback, the majority of comorbidities were captured from outpatient claims. An analysis of inpatient claims revealed that only 2.8% of patients had peptic ulcer disease; in contrast, an analysis of either inpatient or outpatient claims revealed that 19.1% of patients had this disease (Fig. 1c). Furthermore, the prevalence of comorbidities observed in inpatient claims increased sharply when No WP was applied but not when a more extended lookback period was used. Specifically, the prevalence of peptic ulcer disease changed from 2.2% (90-day washout window) to 10.2% (No WP) when observed over a 2-year lookback period. However, an increase in lookback from 1 to 3 years resulted in a maximum difference of only 1.4 percentage points [see Additional file 2].
Changes in the number of comorbid conditions across the different comorbidity assessment methods were also compared (Fig. 2). With the longest ascertainment period, at least one comorbidity was identified in 58.4% of the patients, whereas with the shortest ascertainment period, a comorbidity was identified in only 4.5% of the patients. Although analyses using 30-day and 90-day washout window periods yielded relatively comparable estimates of the total number of conditions, a sharp increase was observed with No WP, particularly when the analysis included only inpatient claims. Figure 3 illustrates the differences in the distribution of claim types per comorbid condition. When inpatient claims were not used to measure comorbidity, 34.8 and 22.2% of patients with myocardial infarction and moderate or severe liver disease, respectively, were missed. In contrast, less than 5% of patients with chronic pulmonary disease (3.1%), rheumatic disease (4.9%), peptic ulcer disease (4.9%), and diabetes with chronic complications (4.4%) were missed when using outpatient claims only.

Impact of Charlson comorbidity on non-cancer mortality
The estimated HRs and impacts of each comorbid condition changed according to the use of different combinations of the washout window period, lookback period, and claim types [see Additional file 3]. In the analyses of either inpatient or outpatient claims with a 2-year lookback, the highest risk of non-cancer mortality was associated with moderate or severe liver disease. The HR increased from 5.5 to 9.9 as the washout window period increased. Myocardial infarction captured with No WP showed significant variations in HRs. However, the HRs associated with diabetes without chronic complications showed fewer variations.
The HRs for non-cancer mortality ranged from 1.0 (90-day washout window, 2-year lookback, and outpatient claims only) to 3.0 (90-day washout window, 1-year lookback, and inpatient claims only) among cancer patients with CCI scores of 1-2 (Table 2). Among those with CCI scores of 3-4, the HRs ranged from 1.0 (90-day washout window, 2-year lookback, and outpatient claims only) to 5.0 (30-day washout window, 1-year lookback, and inpatient claims). For those with CCI scores of ≥5, the HRs ranged from 3.7 (No WP, 1-year lookback, and outpatient claims) to 8.0 (No WP, 3-year lookback, and inpatient claims). Using either inpatient or outpatient claims, the HRs decreased gradually as both the washout window and lookback period increased. However, the analysis based on inpatient claims only increased HRs associated with CCI score of 1-2 (mild)

Discussion
The CCI has been used to measure comorbidity and adjust for associated risks in survival models based on various data sources, including clinical trials, prospective and retrospective cohort studies, and claims data [26][27][28][29][30][31][32][33]. Recently, population-based health care claims data have been used more frequently in studies of health-related outcomes, as these data could yield generalizable results. However, it has remained unclear how comorbid conditions in cancer patients should be measured and accounted for when modelling non-cancer mortality based on claims data. This study is the first to systematically compare various comorbidity assessment methods and evaluate their impact on non-cancer mortality risk estimates using cancer patients' health care claims data. The effects of different washout window periods, lookback periods, and claim types on comorbidity prevalence and associated non-cancer mortality risk were presented. In the absence of a washout window period, a substantial increase in the prevalence of comorbidities was observed, highlighting the critical role of the washout window period in preventing the up-coding of claims. Cancer patients may frequently visit the hospital right before a cancer diagnosis, and some diagnostic codes applied to the medical examination may be recorded for administrative purposes. Likewise, complications related to cancer and its treatment should be differentiated from comorbidities, as these do not represent the patient's general health before a cancer diagnosis.
Regarding the lookback period, 1 year might be insufficient to account for rare comorbid conditions. In contrast, a 3-year lookback might conservatively capture comorbidities. The comorbidities captured in inpatient claims may represent long periods of hospitalization and hence be associated with a higher risk of non-cancer mortality. Such differentiation in the analytical approach might better account for disease severity.
This study assessed the impact of comorbidity on the estimates of non-cancer mortality. The cancer-specific mortality risk could not be evaluated because the NHIS-NSC database lacks information about the cancer stage, which has been shown to affect the cancer mortality risk strongly. Analyses that account for the cancer stage to clarify the association between comorbidity and cancer-specific mortality risk [3] remain for future studies. Nevertheless, previous studies have shown that the number and severity of comorbidities strongly influence non-cancer mortality risk, with a relatively lesser effect on cancer-specific mortality [2,4,34]. The NHIS data confidentiality policy masked some comorbid conditions, including dementia and AIDS/HIV. Therefore, the present study could not evaluate the impacts of these conditions on non-cancer mortality risk, which remains a limitation.

Conclusions
The study findings suggest that the estimates of comorbidity prevalence and its impact on non-cancer mortality risk vary considerably depending on the assessment method used. These discrepancies demonstrate that selecting an optimal approach is critical to an accurate prognostication of cancer patients' mortality. Researchers should understand the complexity of comorbidity assessments using claims data and select the assessment method with caution.