
Suitability of administrative claims databases for bariatric surgery research – is the glass half-full or half-empty?

Abstract

Background

Claims databases are generally considered inadequate for obesity research due to suboptimal capture of body mass index (BMI) measurements. This might not be true for bariatric surgery because of reimbursement requirements and changes in coding systems. We assessed the availability and validity of claims-based weight-related diagnosis codes among bariatric surgery patients.

Methods

We identified three nested retrospective cohorts of adult bariatric surgery patients who underwent adjustable gastric banding, Roux-en-Y gastric bypass, or sleeve gastrectomy between January 1, 2011 and June 30, 2018 using different components of OptumLabs® Data Warehouse, which contains linked de-identified claims and electronic health records (EHRs). We measured the availability of claims-based weight-related diagnosis codes in the 6-month preoperative and 1-year postoperative periods in the main cohort identified in the claims data. We created two claims-based algorithms to classify the presence of severe obesity (a commonly used cohort selection criterion) and categorize BMI (a commonly used baseline confounder or postoperative outcome). We evaluated their performance by estimating sensitivity, specificity, positive predictive value, negative predictive value, and weighted kappa in two sub-cohorts using EHR-based BMI measurements as the reference.

Results

Among the 29,357 eligible patients identified using claims only, 28,828 (98.2%) had preoperative weight-related diagnosis codes, either granular indicating BMI ranges or nonspecific denoting obesity status. Among the 27,407 patients with granular preoperative codes, 12,346 (45.0%) had granular codes and 9355 (34.1%) had nonspecific codes in the 1-year postoperative period. Among the 3045 patients with both preoperative claims-based diagnosis codes and EHR-based BMI measurements, the severe obesity classification algorithm had a sensitivity of 100%, a specificity of 71%, a positive predictive value of 100%, and a negative predictive value of 78%. The BMI categorization algorithm had good validity in categorizing the last available preoperative or postoperative BMI measurements (weighted kappa [95% confidence interval]: preoperative 0.78 [0.76, 0.79]; postoperative 0.84 [0.80, 0.87]).

Conclusions

Claims-based weight-related diagnosis codes had excellent validity before and after bariatric surgical operation but suboptimal availability after operation. Claims databases can be used for bariatric surgery studies of non-weight-related effectiveness and safety outcomes that are well-captured.


Background

Bariatric surgery is the most effective treatment for severe obesity, a risk factor for many health conditions including cardiovascular diseases and death [1]. Patients who undergo bariatric surgery can achieve effective weight loss and remission of many comorbidities [2, 3]. However, between 2011 and 2018, only 1% of adults with severe obesity in the United States received bariatric surgery in a given year [4, 5]. With the persistent increase in the prevalence of obesity and considerable shift in the type of bariatric surgical operations performed over the last decade [5], it is important to evaluate the long-term comparative effectiveness and safety of different operations.

Administrative claims databases are an important real-world data source in comparative effectiveness and safety research. These databases often provide large and demographically diverse study populations at a fraction of the cost compared to other data sources [6]. Claims databases also capture most, if not all, medically attended events including hospitalizations and procedures performed. However, claims databases are generally considered inadequate for obesity-related research due to the lack of body mass index (BMI) measurements and the underuse and poor validity of weight-related diagnosis codes [7,8,9,10]. This limitation may not necessarily apply to bariatric surgery research because most health insurers in the United States require surgical facilities to receive approval to perform a given bariatric operation (a.k.a., “prior authorization”). This process involves documentation of eligibility, including having a BMI measurement ≥40 kg/m2, or a BMI measurement ≥35 kg/m2 with at least 1 obesity-related co-morbidity, which are typically converted into diagnosis codes in the patient’s medical record and reimbursement claims [11,12,13]. In addition, the specific International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) weight-related diagnosis codes denoting BMI ranges became available in 2006, with a subset of diagnosis codes indicating BMI ≥40 kg/m2 becoming effective in January 2011. The more granular ICD-10-CM codes became effective in October 2015. These coding changes and the prior authorization requirements may considerably improve the availability and validity of weight-related diagnosis codes in claims databases among bariatric surgery patients.

In this study, we evaluated the availability and validity of weight-related diagnosis codes before and after bariatric surgical operations in a large claims database linked to an electronic health record (EHR) database with actual BMI measurements.

Methods

Data source

This study used data from the OptumLabs® Data Warehouse (OLDW), which contains linked de-identified administrative claims data for commercially insured and Medicare Advantage enrollees, and de-identified EHR data that has been normalized and standardized into a single database. As of May 2019, the database contains longitudinal health information on over 200 million lives, 137 million in claims, 88 million in the EHR, and 26 million in the linked component since 2007 from a diverse mixture of ages, ethnicities, and geographical regions across the United States [14]. The claims data component includes physician, pharmacy, and facility claims submitted for reimbursement for covered members. Both paid and denied claims are included in the database and analysis, except for pharmacy claims where only paid claims are included in the analysis. The EHR component includes clinical diagnoses, procedures, prescriptions, clinical notes, laboratory results, and vital signs (including BMI) recorded as part of routine clinical practice.

Study populations

We created 3 nested study cohorts using different components of OLDW to evaluate the availability (Cohort 1) and validity (Cohorts 2 and 3) of claims-based weight-related diagnosis codes before and after the bariatric surgical operation (Additional file 1 eFigure 1). The study was approved by the Harvard Pilgrim Health Care institutional review board with an exemption and waiver of individual patient consent.

Cohort 1

Using the claims data, we identified a retrospective cohort of patients aged 18 years or older who underwent adjustable gastric banding (AGB), Roux-en-Y gastric bypass (RYGB), or sleeve gastrectomy (SG) between January 1, 2011 and June 30, 2018. Eligible patients had continuous health plan enrollment with medical and pharmacy benefits during the 6-month period preceding the index bariatric operation, which could occur in an inpatient or ambulatory care setting. To minimize the inclusion of patients with non-obesity indications, we excluded patients who had any major bariatric operation, revisional procedures, or gastrointestinal malignancy in the 6-month preoperative period, as well as patients who had an emergency department encounter or a diagnosis of gastrointestinal ulcers on the day of the index operation. We further excluded patients who had multiple conflicting bariatric operation procedure codes on the day of the index operation. The cohort was identified using ICD-9-CM (prior to October 1, 2015) and ICD-10-CM (on or after October 1, 2015) diagnosis and procedure codes; Current Procedural Terminology, Fourth Edition (CPT-4®); and the Healthcare Common Procedure Coding System. We used this cohort to evaluate the availability of claims-based weight-related diagnosis codes before and after the bariatric operation.

Cohorts 2 and 3

Cohort 2 consisted of the subset of patients in Cohort 1 who had ≥1 preoperative claims-based weight-related diagnosis code with the last available code being granular (e.g., V85.30 or Z68.30 indicating BMI between 30.0–30.9 kg/m2) and ≥1 EHR-based BMI measurement recorded within ±30 days of the granular code during the 6-month preoperative period (including the index operation day). We used this cohort to evaluate the performance of our claims-based severe obesity and BMI categorization algorithms (defined below) in the preoperative period. Cohort 3 consisted of the subset of patients in Cohort 2 whose last available claims-based postoperative weight-related diagnosis was a granular code with ≥1 EHR-based BMI measurement recorded within ±30 days of this diagnosis code during the 1-year postoperative period. We used Cohort 3 to evaluate the performance of our claims-based algorithms in the postoperative period.

Development of claims-based algorithms for severe obesity and BMI categorization

We created 2 claims-based algorithms using weight-related diagnosis codes (Additional file 1 eTable 1): a severe obesity classification algorithm and a BMI categorization algorithm. The severe obesity classification algorithm classified patients as having “severe obesity” if they had ≥1 claims-based weight-related diagnosis code indicating BMI ≥35 kg/m2 any time during the 6-month preoperative period. In bariatric surgery research, this algorithm can be used as an important cohort selection criterion to identify patients with severe obesity as the treatment indication.
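The classification rule above can be sketched in a few lines of Python. This is an illustration only, not the study's SAS code: the claim layout (date, code tuples), the function name, and the 183-day approximation of the 6-month window are assumptions, and only ICD-10-CM Z68 codes are listed here, whereas the study's full code list (including ICD-9-CM codes) is given in Additional file 1 eTable 1.

```python
from datetime import date, timedelta

# Assumption: ICD-10-CM Z68 codes only; the study's complete list
# (including ICD-9-CM V85 codes) is in Additional file 1 eTable 1.
SEVERE_OBESITY_CODES = {
    "Z68.35", "Z68.36", "Z68.37", "Z68.38", "Z68.39",  # BMI 35.0-39.9
    "Z68.41", "Z68.42", "Z68.43", "Z68.44", "Z68.45",  # BMI >= 40.0
}


def has_severe_obesity(claims, index_date, lookback_days=183):
    """Return True if any claim in the preoperative window carries a
    diagnosis code indicating BMI >= 35 kg/m2.

    claims: iterable of (service_date, diagnosis_code) tuples
            (hypothetical layout, not the OLDW schema).
    lookback_days: 183 days is an assumed stand-in for "6 months".
    """
    window_start = index_date - timedelta(days=lookback_days)
    return any(
        window_start <= service_date <= index_date
        and code in SEVERE_OBESITY_CODES
        for service_date, code in claims
    )
```

A single qualifying code anywhere in the window is enough to flag the patient, which mirrors the "≥1 code any time" wording of the rule.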

The BMI categorization algorithm classified a patient’s BMI into 1 of the 10 levels as indicated by their last available weight-related diagnosis codes separately during the 6-month preoperative and 1-year postoperative periods (BMI levels, kg/m2: ≤19.9, 20.0–24.9, 25.0–29.9, 30.0–34.9, 35.0–39.9, 40.0–44.9, 45.0–49.9, 50.0–59.9, 60.0–69.9, and ≥70.0). This algorithm can be used to measure the last available preoperative BMI, which is an important covariate for comparative effectiveness research on bariatric surgery as preoperative BMI may be associated both with operation choice and risks of many health outcomes. The algorithm can also measure the last available BMI measurement within a defined postoperative follow-up period (e.g., 1 year in this study) for weight-related outcome assessment.
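As a rough sketch of how granular codes could be collapsed into the 10 levels, the Python below maps ICD-10-CM Z68 adult BMI codes to a level and then picks the most recent granular code in a period. The function names and the claim layout are hypothetical, ICD-9-CM codes are omitted for brevity, and Z68.1 is treated as the lowest level; the mapping actually used in the study is in Additional file 1 eTable 1.

```python
from datetime import date

# The 10 BMI levels (kg/m2) named in the text, in ascending order.
BMI_LEVELS = ["<=19.9", "20.0-24.9", "25.0-29.9", "30.0-34.9", "35.0-39.9",
              "40.0-44.9", "45.0-49.9", "50.0-59.9", "60.0-69.9", ">=70.0"]


def z68_to_level(code):
    """Map a granular ICD-10-CM Z68 adult BMI code to an index into
    BMI_LEVELS, or return None for any other code."""
    if code == "Z68.1":
        return 0
    if not code.startswith("Z68.") or len(code) != 6 or not code[4:].isdigit():
        return None
    two_digit = int(code[4:])
    if 20 <= two_digit <= 39:       # Z68.20-Z68.39: 1-unit bands, BMI 20.0-39.9
        return 1 + (two_digit - 20) // 5
    if 41 <= two_digit <= 45:       # Z68.41-Z68.45: BMI 40.0 and above
        return two_digit - 36
    return None                     # e.g. pediatric Z68.5x codes


def last_available_level(claims):
    """Return the BMI level label of the most recent granular code,
    or None if the period contains no granular code.

    claims: iterable of (service_date, diagnosis_code) tuples
            (hypothetical layout).
    """
    granular = [(d, z68_to_level(c)) for d, c in claims
                if z68_to_level(c) is not None]
    if not granular:
        return None
    latest = max(granular, key=lambda pair: pair[0])
    return BMI_LEVELS[latest[1]]
```

Calling `last_available_level` once on the preoperative claims and once on the postoperative claims gives the two separate classifications described above. The sketch does not resolve ties between different codes recorded on the same day.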

Validation of claims-based algorithms for severe obesity and BMI categorization

We used the EHR-based BMI measurements recorded during an encounter to validate the claims-based algorithms. We classified patients as having severe obesity if they had ≥1 EHR-based BMI measurement ≥35 kg/m2 any time during the 6-month preoperative period. For BMI categorization, we classified a patient’s most proximate EHR-based BMI measurement recorded within ±30 days of the last available claims-based diagnosis code in the 6-month preoperative period (for preoperative analyses) and the last available EHR-based BMI measurement in the 1-year postoperative period (for postoperative analyses), separately, into 1 of the 10 levels described above.
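The reference-standard side can be illustrated with a short Python sketch that bins a numeric BMI into the same 10 levels and selects the measurement closest in time to a diagnosis code. The function names and data layout are hypothetical, and the cut-points assume that values such as 19.95 fall in the lowest level, which the text does not specify.

```python
from datetime import date


def bmi_to_level(bmi):
    """Index (0-9) of the 10-level category for a numeric BMI in kg/m2.
    Assumption: each level includes its lower bound (e.g. 35.0 -> level 4)."""
    upper_bounds = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 60.0, 70.0]
    for level, bound in enumerate(upper_bounds):
        if bmi < bound:
            return level
    return len(upper_bounds)  # BMI >= 70.0


def most_proximate_bmi(measurements, code_date, max_gap_days=30):
    """Return the BMI measured closest to code_date within
    +/- max_gap_days, or None if no measurement qualifies.

    measurements: iterable of (measurement_date, bmi) tuples
                  (hypothetical layout).
    """
    eligible = [(abs((d - code_date).days), bmi)
                for d, bmi in measurements
                if abs((d - code_date).days) <= max_gap_days]
    if not eligible:
        return None
    return min(eligible, key=lambda pair: pair[0])[1]
```

Patients for whom `most_proximate_bmi` returns None would fall outside the validation cohorts, which is why the ±30-day restriction shrinks the validation sample.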

Statistical analyses

Availability and predictors of weight-related diagnosis codes during the preoperative and postoperative periods

We described the presence of weight-related ICD-9-CM and ICD-10-CM diagnosis codes occurring any time in the 6-month preoperative period and the 1-year postoperative period, separately, in Cohort 1. We also performed the analysis by operation type, calendar year, and coding era (before October 1, 2015 for the ICD-9-CM era; October 1, 2015 and later for the ICD-10-CM era). We assessed factors associated with the presence of preoperative and postoperative claims-based weight-related diagnosis codes, separately, using logistic regression models. Factors selected a priori included demographic characteristics, region of residence, calendar year, coding era, type of index bariatric operation, care setting of index operation, and medical history measured in the 6-month preoperative period (including the Charlson-Elixhauser comorbidity index score [15], individual comorbid conditions, and prior hospital admissions). The Charlson-Elixhauser comorbidity index score was originally developed to predict mortality risk in older patients [15]; we used the score as a proxy for general health status.

Performance of the severe obesity classification algorithm during the preoperative period

We assessed the performance of the severe obesity classification algorithm using sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) within Cohort 2. The sensitivity was calculated as the proportion of patients accurately classified as having severe obesity based on claims-based diagnosis code (i.e., true positives) among those classified as such based on their EHR-based BMI measurement. The specificity was calculated as the proportion of patients accurately classified as not having severe obesity based on claims-based diagnosis codes (i.e., true negatives) among those whose EHR-based BMI measurement indicated as such. The PPV was calculated as the proportion of true positives among patients classified as having severe obesity based on their claims-based diagnosis code. The NPV was calculated as the proportion of true negatives among patients classified as not having severe obesity based on diagnosis code.
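The four measures defined above follow directly from a 2×2 cross-classification of the claims-based result against the EHR-based reference. The Python below is a minimal sketch of that arithmetic; the function name and the input layout (one pair of booleans per patient) are illustrative assumptions.

```python
def diagnostic_metrics(pairs):
    """Compute sensitivity, specificity, PPV, and NPV.

    pairs: iterable of (claims_positive, ehr_positive) booleans,
           one pair per patient; the EHR value is the reference.
    """
    tp = fp = fn = tn = 0
    for claims_pos, ehr_pos in pairs:
        if claims_pos and ehr_pos:
            tp += 1          # true positive
        elif claims_pos and not ehr_pos:
            fp += 1          # false positive
        elif not claims_pos and ehr_pos:
            fn += 1          # false negative
        else:
            tn += 1          # true negative

    def ratio(numerator, denominator):
        # Undefined when the denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # among EHR-positive patients
        "specificity": ratio(tn, tn + fp),  # among EHR-negative patients
        "ppv": ratio(tp, tp + fp),          # among claims-positive patients
        "npv": ratio(tn, tn + fn),          # among claims-negative patients
    }
```

Because nearly all bariatric surgery patients have severe obesity, the EHR-negative and claims-negative groups are small, so specificity and NPV rest on far fewer patients than sensitivity and PPV.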

Performance of the BMI categorization algorithm during the preoperative and postoperative periods

We evaluated the performance of the BMI categorization algorithm separately in the 6-month preoperative period using Cohort 2 and in the 1-year postoperative period using Cohort 3. In both preoperative and postoperative periods, we assessed the concordance between the last available claims-based weight-related diagnosis code and its most proximate EHR-based BMI measurement recorded within ±30 days of the claims-based diagnosis code by estimating the weighted Cohen’s kappa. As a variation of the Cohen’s kappa, a measure of the degree of agreement, the weighted kappa assigns weights for partial agreement according to their distance from the perfect agreement [16]. The weighted kappa ranges from −1 to 1 with negative values possible but unlikely in practice. In general, kappa values >0.75 are considered excellent, 0.40–0.75 are considered fair to good, and <0.40 are considered poor agreement [17]. In both preoperative and postoperative periods, we also estimated the sensitivity, specificity, PPV, and NPV within each level of the algorithm.
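To make the weighting idea concrete, the following Python sketch computes a weighted kappa for two sets of ordinal ratings. The text does not state which weighting scheme was used, so the default of linear weights (with quadratic weights as an option) is an assumption, as are the function name and the input layout.

```python
def weighted_kappa(ratings_a, ratings_b, n_levels, power=1):
    """Weighted Cohen's kappa for ordinal ratings coded 0..n_levels-1.

    power=1 gives linear weights, power=2 quadratic weights
    (assumption: the scheme used in the study is not stated).
    """
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b) or n_levels < 2:
        raise ValueError("need two equal-length, non-empty rating lists")

    # Observed joint proportions.
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels))
           for j in range(n_levels)]

    agree_observed = agree_expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            # Full credit on the diagonal, partial credit that shrinks
            # with the distance between the two categories.
            weight = 1.0 - (abs(i - j) / (n_levels - 1)) ** power
            agree_observed += weight * observed[i][j]
            agree_expected += weight * row[i] * col[j]

    if agree_expected == 1.0:
        return float("nan")  # undefined when both raters use one category
    return (agree_observed - agree_expected) / (1.0 - agree_expected)
```

For the analyses described here, `ratings_a` would hold the claims-based level and `ratings_b` the EHR-based level for each patient, with `n_levels=10`.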

Sensitivity analyses

We examined a different severe obesity classification algorithm using BMI ≥40 kg/m2 as the cutoff. We also varied the BMI categorization algorithm by (1) using larger BMI intervals (5-level BMI categories, kg/m2: ≤29.9, 30.0–39.9, 40.0–49.9, 50.0–59.9, ≥60.0; 4-level categories: underweight ≤19.9, normal 20.0–24.9, overweight 25.0–29.9, obese ≥30.0), and (2) adding nonspecific weight-related diagnosis codes (e.g., 278.00/E66.9 [unspecified obesity], 278.01/E66.01 [morbid obesity], 278.03/E66.2 [obesity hypoventilation syndrome], E66.09 [other obesity due to excess calories], E66.1 [drug-induced obesity], and E66.8 [other obesity] for obese) and assessed their performance during the preoperative and postoperative periods (Additional file 1 eTable 2). In addition, we examined the impact of the proximity restriction between the claims-based weight-related diagnosis code and the EHR-based BMI measurement on their concordance in the preoperative and postoperative periods. We also separately evaluated the performance of the BMI categorization algorithm for the last available BMI during the 6-month and 2-year postoperative periods. We performed all analyses with SAS Enterprise Guide 7.13 for Windows (SAS Institute, Cary, NC).

Results

Population characteristics

Cohort 1 included 29,357 patients, with 2941 (10.0%) having AGB, 9445 (32.2%) having RYGB, and 16,971 (57.8%) having SG. Table 1 shows their baseline characteristics. The population was largely female (75.5%) and white (67.0%) with a mean age of 47.0 years. The most prevalent comorbid conditions were hypertension (68.8%), gastroesophageal reflux disease (62.7%), and dyslipidemia (55.3%).

Table 1 Baseline characteristics of 29,357 patients who received a bariatric surgical operation, 2011–2018 (Cohort 1)

Cohort 2 included 3045 patients from Cohort 1 who had both claims-based weight-related diagnosis codes (with the last preoperative code being granular) and EHR-based BMI measurements in the 6-month preoperative period; 196 (6.4%) had AGB, 1055 (34.6%) had RYGB, and 1794 (58.9%) had SG. Compared to Cohort 1, the average age was slightly higher (47.6 years), and slightly more patients had hypertension (69.5%) and dyslipidemia (56.4%) in Cohort 2. On the index operation day, 77.6% had both claims-based diagnosis codes and EHR-based BMI measurements.

Cohort 3 included 511 patients from Cohort 2 who had granular last available claims-based weight-related diagnosis codes in the 1-year postoperative period with ≥1 EHR-based BMI measurement in the ±30 days of the diagnosis code, with 31 (6.1%) having AGB, 190 (37.2%) having RYGB, and 290 (56.8%) having SG. Compared to Cohorts 1 and 2, the average age was higher (48.9 years) in Cohort 3, more patients had hypertension (71.8%) and dyslipidemia (58.1%), and fewer had non-alcoholic fatty liver disease (22.5%) or diagnosis codes indicating smoking (1.8%). On average, patients had their first weight-related diagnosis code around 57 days after index operation and last available diagnosis code 159 days before the end of 1-year follow-up.

Presence of weight-related diagnosis codes

6-month preoperative period

Most of the patients in Cohort 1 had ≥1 claims-based weight-related diagnosis code, with 27,407 (93.4%) having granular codes, 1421 (4.8%) having nonspecific codes, and 529 (1.8%) having none. The prevalence of patients without a weight-related diagnosis code decreased from 3.4% in 2011 to 1.6% in 2018, while the presence of granular codes increased from 86.5% in 2011 to 97.1% in 2018 (Figure 1). The granular diagnosis codes were more prevalent in the ICD-10-CM era than the ICD-9-CM era (96.8% versus 91.1%). Similar increasing trends were observed across operation types, with higher prevalence of granular diagnosis codes observed in SG patients (Additional file 1 eFigures 2 & 3).

Fig. 1 Presence of claims-based weight-related ICD-9-CM or ICD-10-CM diagnosis codes during the 6-month preoperative period for bariatric surgery patients in 2011–2018

1-year postoperative period

Among the 27,407 patients with granular weight-related diagnosis codes in the 6-month preoperative period in Cohort 1, 12,346 (45.0%) had granular codes, 9355 (34.1%) had nonspecific codes, and 5706 (20.8%) did not have any codes in the first postoperative year (Fig. 2). The distribution of diagnosis codes was similar among patients receiving different types of operation.

Fig. 2 Presence of claims-based weight-related ICD-9-CM or ICD-10-CM diagnosis codes during the first postoperative year. Left panel: among all patients who underwent one of the three main bariatric surgical operations in 2011–2018; Middle panel: among patients who had weight-related diagnosis codes during the 6-month preoperative period; Right panel: among patients who had granular weight-related diagnosis codes during the 6-month preoperative period. Granular codes are diagnosis codes denoting narrow body mass index (BMI) ranges (e.g., V85.30 or Z68.30 indicating BMI between 30.0 and 30.9 kg/m2); Nonspecific codes are diagnosis codes denoting broad BMI ranges or obesity status

Factors associated with the presence of weight-related diagnosis codes

6-month preoperative period

Compared to patients with claims-based weight-related diagnosis codes, those without codes were more likely to be male, Asian, older, have more hospital stays before operation, or receive the operation in an ambulatory care setting in Cohort 1 (Table 2). Among patients who had weight-related diagnosis codes, those with granular codes (e.g., V85.30) were more likely to have SG, be covered by Medicare Advantage plans, or have the operation in an inpatient setting or recent years (Additional file 1 eTable 3).

Table 2 Determinants of missing weight-related diagnosis codes during the 6-month preoperative period, 2011–2018 (Cohort 1)

1-year postoperative period

Compared to patients with claims-based weight-related diagnosis codes, those without codes were more likely to receive AGB, be younger, be male, be commercially insured, or lack preoperative weight-related diagnosis codes in Cohort 1 (Additional file 1 eTable 4). Among patients who had weight-related diagnosis codes in the postoperative year, those having granular codes were more likely to be older, be covered by Medicare Advantage plans, have comorbid conditions, receive SG, or have the operation in an inpatient setting or recent years (Additional file 1 eTable 5).

Performance of the claims-based algorithms

6-month preoperative period

In Cohort 2, the severe obesity classification algorithm (i.e., presence of BMI ≥35 kg/m2) in the 6-month preoperative period had a sensitivity of 100%, a specificity of 71%, a PPV of 100%, and an NPV of 78% (Additional file 1 eTable 6). When classifying the last available preoperative weight-related diagnosis code into 10 levels, the BMI categorization algorithm had a weighted kappa of 0.78 (95% confidence interval 0.76, 0.79). The specificity and NPV were high for all BMI levels; the sensitivity and PPV were above 60% for most BMI levels over 35 kg/m2 (e.g., BMI 35.0–39.9, sensitivity 64%, specificity 97%, PPV 81%, NPV 93%; 40.0–44.9, sensitivity 76%, specificity 87%, PPV 71%, NPV 90%) and lowest for BMI between 30.0 and 34.9 kg/m2 (sensitivity 30%) (Table 3).

Table 3 Validation results for the BMI categorization algorithm in the 6-month preoperative (Cohort 2) and 1-year postoperative periods (Cohort 3)

1-year postoperative period

In Cohort 3, the BMI categorization algorithm had a weighted kappa of 0.84 (95% confidence interval 0.80, 0.87). The specificity and NPV were high for all BMI levels while the sensitivity was above 70% and the PPV was above 60% for most BMI levels (Table 3).

Sensitivity analyses

When varying the severe obesity classification algorithm to detect the presence of BMI ≥40 kg/m2 during the 6-month preoperative period, both the specificity and NPV increased (75 and 83%, respectively) while sensitivity and PPV dropped slightly (98 and 96%, respectively). Expanding the algorithms to include nonspecific weight-related diagnosis codes (e.g., 278.01) resulted in a meaningful decrease in specificity (Additional file 1 eTable 6).

The 5-level BMI categorization algorithm had similar concordance compared to the 10-level categorization, while the 4-level BMI categorization algorithm had great concordance with a weighted kappa above 0.90 for both the preoperative and postoperative periods (Table 3). Expanding the algorithms to include nonspecific weight-related diagnosis codes had minimal impact on their performance (Additional file 1 eTable 7). Relaxing the proximity requirement between the timing of the claims-based weight-related diagnosis codes and the EHR-based BMI measurements increased the size of the validation sample; this did not change their concordance during the 6-month preoperative period but reduced their concordance in the 1-year postoperative period (Additional file 1 eFigure 4). The BMI categorization algorithm for the last available BMI performed well in the 6-month and 2-year postoperative periods (Additional file 1 eTable 8).

Discussion

In a large administrative claims database, we found that nearly all bariatric surgery patients had preoperative weight-related diagnosis codes, while the presence of granular weight-related diagnosis codes increased substantially in both the preoperative and postoperative periods between 2011 and 2018. The claims-based algorithm for severe obesity, which classified patients as having severe obesity if they had a diagnosis code indicating BMI ≥35 kg/m2, had high sensitivity and PPV but only moderate specificity and NPV. The BMI categorization algorithm that categorized weight-related diagnosis codes into BMI levels had excellent concordance with the EHR-based BMI measurement, with high specificity, PPV, and NPV across all levels and higher sensitivity among higher levels of BMI.

The persistently high prevalence of claims-based weight-related diagnosis codes, including granular and nonspecific codes, in the preoperative period across the study years reflects the high adherence to the insurance reimbursement requirement [11,12,13]. The observed higher prevalence of weight-related diagnosis codes in the ICD-10-CM era than the ICD-9-CM era is consistent with previous data that focused on the claims-based diagnosis codes in the general population [10].

The BMI categorization algorithm had different sensitivities for BMI level 30.0–34.9 kg/m2 in the preoperative and postoperative periods (30% versus 84%). During the 6-month preoperative period, 70% of patients with an EHR-based BMI measurement between 30.0 and 34.9 kg/m2 had a granular weight-related diagnosis code indicating BMI ≥35 kg/m2. During the first postoperative year, only 15% of those with a BMI measurement between 30.0 and 34.9 kg/m2 had a diagnosis code indicating BMI ≥35 kg/m2. These patients with borderline BMI levels immediately before having a bariatric operation might have undergone preoperative weight loss as required by their insurance or encouraged by their clinical programs, as half of them had 1 or more BMI measurements ≥35 kg/m2 within the prior 30 days. These patients might also have been up-coded with a higher weight-related diagnosis code to meet the prior authorization requirement.

Claims databases for bariatric surgery research: a glass half-full or half-empty?

The high prevalence and validity of weight-related diagnosis codes before a bariatric operation in claims databases make it feasible to use these codes to capture a large proportion of eligible patients, especially when researchers impose additional eligibility criteria to exclude patients with non-obesity indications, as we did in our study. In addition, the high concordance between the claims-based BMI categorization algorithm and actual BMI measurement, along with its high validity, suggests that it is possible to use these preoperative weight-related diagnosis codes for baseline confounding control.

On the other hand, despite a considerable increase across years and high validity, the presence of weight-related diagnosis codes remained low in the first postoperative year, with around 80% of patients having any codes and around 60% having granular codes in 2017 and 2018. The suboptimal presence of weight-related diagnosis codes in the postoperative period makes it more challenging to use claims databases for weight-related effectiveness research. In addition, there could be differential coding in the postoperative period because patients with granular weight-related diagnosis codes were older and had more comorbid conditions (Additional file 1 eTable 5). These patients with granular diagnosis codes in the postoperative period may not be representative of the overall study population. For example, some of them may be preparing for a second-stage operation or have had inadequate weight loss from their index operation. It is thus important to weigh the internal validity and generalizability when using the postoperative weight-related diagnosis codes for weight-related effectiveness outcome research. In situations when all relevant factors contributing to the presence of postoperative granular diagnosis codes are measured, results from patients with granular codes could be generalized to the overall study population using appropriate statistical approaches, such as inverse probability weighting [18]. Taken together, our findings support the use of administrative claims data for bariatric surgery research of non-weight-related outcomes that are generally well-captured, such as rehospitalization, reoperation, venous thromboembolism, or remission of certain comorbidities including type 2 diabetes [19,20,21,22].
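The inverse probability weighting idea mentioned above can be illustrated with a deliberately simplified Python sketch. It was not part of this study's analyses. It assumes the probability of having a postoperative granular code depends only on a single categorical stratum (for example, an age group) and estimates that probability as the observed proportion within each stratum; a real analysis would typically model it with several covariates. All names are hypothetical.

```python
from collections import defaultdict


def ipw_mean(records):
    """Inverse-probability-weighted mean of an outcome that is only
    observed for some patients.

    records: iterable of (stratum, observed, outcome) tuples; outcome
             is ignored when observed is False (hypothetical layout).
    Assumption: P(observed) depends only on stratum. Strata with no
    observed patients contribute nothing, which a real analysis would
    need to address.
    """
    records = list(records)
    total = defaultdict(int)
    seen = defaultdict(int)
    for stratum, observed, _ in records:
        total[stratum] += 1
        if observed:
            seen[stratum] += 1

    weighted_sum = weight_sum = 0.0
    for stratum, observed, outcome in records:
        if observed:
            # Weight = 1 / estimated probability of being observed.
            weight = total[stratum] / seen[stratum]
            weighted_sum += weight * outcome
            weight_sum += weight
    return weighted_sum / weight_sum if weight_sum else float("nan")
```

In a toy population with 4 younger patients (1 observed, outcome 10) and 4 older patients (all observed, outcome 20), the unweighted mean among observed patients is 18, whereas the weighted mean is 15, because the single observed younger patient stands in for all 4.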

Strengths and limitations

This study used contemporary data from a large administrative claims database linked with EHR data to validate two claims-based weight-related algorithms. Prior studies focused on either claims-based algorithms in the general population [8, 10] or the broad four-level obesity classification algorithm for bariatric surgery patients in the preoperative period [23]. We evaluated the validity of these diagnosis codes during both the preoperative and postoperative periods, providing information for researchers who are interested in using administrative claims databases to study weight-related effectiveness outcomes. Our findings add to the knowledge base of the quality and suitability of administrative claims data, a real-world data source, for generation of real-world evidence in bariatric surgery research [24].

One limitation of our study is the small sample size for the postoperative period resulting from the proximity requirement on the EHR-based BMI measurement, which may limit the generalizability of our results. In sensitivity analyses where we relaxed the proximity requirement, the size of the validation sample increased but no substantial change was observed in the validity of postoperative weight-related diagnosis codes. Moreover, the linked EHR data were only available on a small subset of patients identified in claims who received care at healthcare service systems that contribute EHR data to OLDW, raising the possibility of unmeasured factors affecting our analyses and limiting the generalizability of our results.

Conclusions

Among bariatric surgery patients identified within administrative claims databases, the validity of weight-related diagnosis codes was excellent during the preoperative and postoperative periods. These findings support the use of administrative claims databases for bariatric surgery research in the absence of BMI measurements for non-weight-related effectiveness and safety outcomes that are generally well-captured in these databases. However, the availability of weight-related diagnosis codes was suboptimal during the postoperative period, making it more challenging to use claims databases for weight-related effectiveness research.

Availability of data and materials

The dataset from this study is held securely in a virtual workspace at OptumLabs. Data sharing agreements prohibit the authors from making the dataset publicly available. Access may be granted to those who meet pre-specified criteria for confidential access upon mutual agreements with OptumLabs. The analytic codes are available from the authors upon request, understanding that the programs may rely upon coding templates or macros that are unique to the data environment.

Abbreviations

AGB: Adjustable gastric banding

BMI: Body mass index

CI: Confidence interval

CPT-4®: Current Procedural Terminology, Fourth Edition

DVT: Deep vein thrombosis

EHR: Electronic health records

GERD: Gastroesophageal reflux disease

ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification

ICD-10-CM: International Classification of Diseases, Tenth Revision, Clinical Modification

IQR: Interquartile range

NAFLD: Non-alcoholic fatty liver disease

NPV: Negative predictive value

OLDW: OptumLabs® Data Warehouse

PCOS: Polycystic ovarian syndrome

PE: Pulmonary embolism

PPV: Positive predictive value

RYGB: Roux-en-Y gastric bypass

SD: Standard deviation

SG: Sleeve gastrectomy

References

1. Must A, Spadano J, Coakley EH, et al. The disease burden associated with overweight and obesity. JAMA. 1999;282(16):1523–9.

2. Maciejewski ML, Arterburn DE, Van Scoyoc L, et al. Bariatric surgery and long-term durability of weight loss. JAMA Surg. 2016;151(11):1046–55.

3. Smith BR, Schauer P, Nguyen NT. Surgical approaches to the treatment of obesity: bariatric surgery. Med Clin North Am. 2011;95(5):1009–30.

4. Hales CM, Fryar CD, Carroll MD, et al. Trends in obesity and severe obesity prevalence in US youth and adults by sex and age, 2007-2008 to 2015-2016. JAMA. 2018;319(16):1723–5.

5. English WJ, DeMaria EJ, Hutter MM, et al. American Society for Metabolic and Bariatric Surgery 2018 estimate of metabolic and bariatric procedures performed in the United States. Surg Obes Relat Dis. 2020;16(4):457–63.

6. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58(4):323–37.

7. Martin BJ, Chen G, Graham M, Quan H. Coding of obesity in administrative hospital discharge abstract data: accuracy and impact for future research studies. BMC Health Serv Res. 2014;14:70.

8. Lloyd JT, Blackwell SA, Wei II, et al. Validity of a claims-based diagnosis of obesity among Medicare beneficiaries. Eval Health Prof. 2015;38(4):508–17.

9. Peng M, Southern DA, Williamson T, Quan H. Under-coding of secondary conditions in coded hospital health data: impact of co-existing conditions, death status and number of codes in a record. Health Informatics J. 2017;23(4):260–7.

10. Ammann EM, Kalsekar I, Yoo A, Johnston SS. Validation of body mass index (BMI)-related ICD-9-CM and ICD-10-CM administrative diagnosis codes recorded in US claims data. Pharmacoepidemiol Drug Safety. 2018;27(10):1092–100.

11. Chapter 32; 150 Billing Requirements for Bariatric Surgery for Treatment of Morbid Obesity. Centers for Medicare & Medicaid Services. 2019. https://www.cms.gov/Regulations-and-Guidance/Guidance/Manuals/Downloads/clm104c32.pdf. Accessed 29 May 2019.

12. Bariatric Surgery for Treatment of Co-Morbid Conditions Related to Morbid Obesity (NCD 100.1) UnitedHealthcare® Medicare Advantage Policy Guideline. 2020. https://www.uhcprovider.com/content/dam/provider/docs/public/policies/medadv-guidelines/b/bariatric-surgery-treatment-morbid-obesity.pdf. Accessed 20 Mar 2020.

13. Bariatric Surgery. UnitedHealthcare® commercial medical policy. 2020. https://www.uhcprovider.com/content/dam/provider/docs/public/policies/comm-medical-drug/bariatric-surgery.pdf. Accessed 7 Aug 2020.

14. OptumLabs. OptumLabs and OptumLabs Data Warehouse (OLDW) Descriptions and Citation. Cambridge, MA: n.p., May 2019. PDF. Reproduced with permission from OptumLabs.

15. Gagne JJ, Glynn RJ, Avorn J, et al. A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol. 2011;64(7):749–59.

16. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70(4):213–20.

17. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York: Wiley; 1981.

18. Cole SR, Stuart EA. Generalizing evidence from randomized clinical trials to target populations: the ACTG 320 trial. Am J Epidemiol. 2010;172(1):107–15.

19. Arterburn D, Wellman R, Emiliano A, et al. Comparative effectiveness and safety of bariatric procedures for weight loss: a PCORnet cohort study. Ann Intern Med. 2018;169(11):741–50.

20. Panagiotou OA, Markozannes G, Adam GP, et al. Comparative effectiveness and safety of bariatric procedures in Medicare-eligible patients: a systematic review. JAMA Surgery. 2018;153(11):e183326.

21. Fisher DP, Johnson E, Haneuse S, et al. Association between bariatric surgery and macrovascular disease outcomes in patients with type 2 diabetes and severe obesity. JAMA. 2018;320(15):1570–82.

22. Sjöström L, Peltonen M, Jacobson P, et al. Association of bariatric surgery with long-term remission of type 2 diabetes and with microvascular and macrovascular complications. JAMA. 2014;311(22):2297–304.

23. Ammann EM, Kalsekar I, Yoo A, et al. Assessment of obesity prevalence and validity of obesity diagnoses coded in claims data for selected surgical populations: a retrospective, observational study. Medicine. 2019;98(29):e16438.

24. Framework for FDA’s Real-world Evidence Program. U.S. Food & Drug Administration. 2018. https://www.fda.gov/media/120060/download. Accessed 10 Jun 2019.

Acknowledgements

The authors wish to thank Drs. Qoua Her, Di Shu, Rui Wang, and Jenna Wong and Ms. Mia Gallagher for their helpful comments on this study.

Funding

This work was supported by a Department of Population Medicine faculty grant. Dr. Toh was partially supported by the National Institutes of Health [grant number U01EB023683] and Agency for Healthcare Research and Quality [grant number R01HS026214]. Dr. Lewis was supported in part by the National Institutes of Health [grant number R01DK112750]. None of the funding bodies played a role in the design of the study or collection, analysis, or interpretation of data, or in writing the manuscript.

Author information

Contributions

XL conceived the study. XL and ST designed the study. KHL, KC, and JFW commented on the analytical plan and development of claims-based algorithms. XL extracted and analyzed data. All authors interpreted data as a team and contributed to the development of this manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xiaojuan Li.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the Harvard Pilgrim Health Care institutional review board with an exemption and waiver of individual patient consent.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1:

eFigure 1. Identification of eligible bariatric surgery patients in 3 nested cohorts for evaluation of availability and validity of weight-related diagnosis codes in both the preoperative and postoperative periods.
eFigure 2. Presence of weight-related ICD-9-CM/ICD-10-CM diagnosis codes during the 6-month preoperative period for patients who underwent 1 of the 3 bariatric surgical operations in 2011–2018.
eFigure 3. Presence of weight-related ICD-9-CM/ICD-10-CM diagnosis codes during the 6-month preoperative period for patients who underwent 1 of the 3 bariatric surgical operations in the ICD-9-CM and ICD-10-CM coding era between 2011 and 2018.
eTable 1. Definition of the claims-based algorithms to classify morbid obesity and categorize the body mass index using ICD-9-CM and ICD-10-CM diagnosis codes.
eTable 2. Variation of the claims-based algorithms to classify morbid obesity and categorize the body mass index using ICD-9-CM and ICD-10-CM diagnosis codes.
eTable 3. Determinants of having granular weight-related diagnosis codes during the 6-month preoperative period in bariatric surgery patients, 2011–2018.
eTable 4. Determinants of missing weight-related diagnosis codes in the first postoperative year in bariatric surgery patients, 2011–2018.
eTable 5. Determinants of having granular weight-related diagnosis codes in the first postoperative year in bariatric surgery patients, 2011–2018.
eTable 6. Performance of the modified claims-based severe obesity classification algorithm in the 6-month preoperative period (Cohort 2).
eTable 7. Performance of the modified claims-based body mass index categorization algorithm in the 6-month preoperative (Cohort 2) and 1-year postoperative periods (Cohort 3).
eFigure 4. The sample size and estimated weighted kappa when relaxing the proximity restriction between the weight-related diagnosis codes and the body mass index (BMI) measurement in the electronic health records for categorization of the last available BMI in the first postoperative year with the BMI categorization algorithm.
eTable 8. Performance of the claims-based body mass index (BMI) categorization algorithm for the last available BMI in different postoperative periods.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Li, X., Lewis, K.H., Callaway, K. et al. Suitability of administrative claims databases for bariatric surgery research – is the glass half-full or half-empty?. BMC Med Res Methodol 20, 225 (2020). https://doi.org/10.1186/s12874-020-01106-8

Keywords

  • Bariatric surgery
  • Body mass index
  • Healthcare administrative claims
  • Predictive value of tests
  • Sensitivity and specificity
  • Validation study