Supplementing claims data with outpatient laboratory test results to improve confounding adjustment in effectiveness studies of lipid-lowering treatments



Adjusting for laboratory test results may result in better confounding control when added to administrative claims data in the study of treatment effects. However, missing values can arise through several mechanisms.


We studied the relationship between availability of outpatient lab test results, lab values, and patient and system characteristics in a large healthcare database using LDL, HDL, and HbA1c in a cohort of initiators of statins or Vytorin (ezetimibe & simvastatin) as examples.


Among 703,484 patients, 68% had at least one lab test performed in the 6 months before treatment. Performing an LDL test was negatively associated with several patient characteristics, including recent hospitalization (OR = 0.32, 95% CI: 0.29-0.34), MI (OR = 0.77, 95% CI: 0.69-0.85), or carotid revascularization (OR = 0.37, 95% CI: 0.25-0.53). Patient demographics, diagnoses, and procedures predicted well who would have a lab test performed (AUC = 0.89 to 0.93). Among those with test results available, claims data explained only 14% of the variation in lab values.


In a claims database linked with outpatient lab test results, we found that lab tests are performed selectively corresponding to current treatment guidelines. Poor ability to predict lab values and the high proportion of missingness reduce the added value of lab tests for effectiveness research in this setting.


Administrative health insurance claims databases provide comprehensive and longitudinal records of encounters with the health care system and of drug dispensing, but lack clinical detail. For example, while the performance of a lab test will generate a claim, the test result will not be available within the claims database. This shortcoming can be overcome by merging outpatient laboratory test results extracted from electronic medical records (EMR) systems with claims data. Adjusting for lab results may result in better confounding control when administrative claims data are used to study treatment effects of medical products.

A difficulty arises from the way in which lab tests are ordered and performed in the American health care system. EMR systems with outpatient lab results generally rely on major laboratory companies to supply lab results data; results for patients whose tests are conducted outside the large chains may go unrecorded in EMRs. In conducting comparative effectiveness research in claims data, pharmacoepidemiologists generally interpret the absence of a claim as the absence of that service/diagnosis, which can result in a covariate misclassification problem but not a missing data problem [1]. However, missing lab test results do not mean the test was not performed or the results are normal, and thus must be handled like other missing data. There is little guidance in the literature on the nature of the missingness of such laboratory information or whether missing lab test results can be adequately imputed.

Investigators have employed varying strategies to deal with this situation. One approach is to identify a subcohort of patients with complete information on the lab test results of interest. In their study of the effectiveness of statin therapy in reducing myocardial infarction rates, Seeger et al. required all patients to have a recorded LDL > 130 mg/dl [2]. This approach reduces the proportion of subjects with missing data, but that advantage comes at the cost of fewer subjects in the study and a final study population that may differ from the broader patient population in important characteristics. Furthermore, this approach is impractical for multiple unrelated lab test results, as complete cases may be few [3].

Another, less favorable approach is to include an indicator term for missing lab test data. In the case of lab results, missingness can imply three things: (1) the physician did not see a need to have a test ordered; (2) the patient chose not to have the test performed; or (3) the test was performed but not in a facility whose data fed back to the patient’s EMR (Figure 1). Though the implication of each of these cases is entirely different, they are indistinguishable in the data; as such, coding them simply as “missing” would lead to bias. Even if all the missingness were due to the third case, in which data are most plausibly missing completely at random, the use of a missing indicator term could still cause bias [4, 5].

Figure 1

Reasons for missing lab test results in a longitudinal healthcare utilization database linked to a lab test provider database*. * In the setting of a new user cohort study with a defined covariate assessment period before the first exposure and before follow-up

In order to further meaningful comparative effectiveness research, we must understand the selectiveness of missing lab test results and how missingness may be related to study outcomes. In this paper, we seek to describe the analytic issues encountered when lab results may not be available for many patients. As an example, we describe the characteristics associated with the absence of laboratory test results and the degree to which missingness and actual lab test values can be predicted based on patient and health plan characteristics in a population of patients initiating lipid-lowering therapy.


Database studies that combine claims and lab test results or other data from EMRs typically employ the claims as a “data backbone,” as claims data provide a longitudinal view of virtually all health care encounters and drug dispensings submitted for health insurance reimbursement. Increasingly, claims databases link data from large national lab test chains [6–8]. Though the chains service a large number of American patients, the resulting linked data may cover substantially less than 50% of outpatient lab tests, with coverage highly dependent on the region where the patient resides and the lab companies servicing that region. Figure 1 illustrates two levels of missingness that may arise in such situations. No claim will be recorded (Level 1) if a physician does not order a test, a patient receives a lab test in a hospital, or a patient does not get a test that was ordered. The result of a test that was performed may not be transmitted to the patient’s claims data (Level 2) if the insurer has not established a data exchange agreement with the laboratory provider. The likelihood of Level 2 missingness increases if there is no laboratory provider operating in the area that has a data exchange agreement with the insurer.
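The two levels of missingness can be made concrete with a small classification sketch. It is not the study's code: the function name and the label strings are hypothetical, and it assumes each patient-test pair carries two flags, one for a recorded claim and one for a result in the linked lab data.

```python
def classify_lab_status(has_claim: bool, has_result: bool) -> str:
    """Classify one patient-test pair by the two levels of missingness.

    The label strings are illustrative; the study does not define them.
    """
    if has_result:
        # A result can exist without a claim, e.g. under bundled payments.
        return "result available"
    if has_claim:
        # The test was billed, but no result reached the linked data.
        return "level 2 missing"
    # No claim: not ordered, done in hospital, or ordered but not performed.
    return "level 1 missing"
```

Note that the two flags alone cannot separate the different reasons behind Level 1 missingness, which is the point made in the text.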

Data sources

We employed longitudinal claims data from 14 Blue Cross and/or Blue Shield-licensed health plans of WellPoint across 14 US states, as represented in the HealthCore Integrated Research Database℠ (HIRD℠). HealthCore linked claims information to lab test results provided by two large national laboratory providers, for laboratory tests performed between January 1, 2005 and June 30, 2010 on patients represented in the HIRD system. The claims data contained information on drug dispensings, outpatient medical services, and hospitalizations including emergency room visits. All medical services were coded with up to 9 discharge diagnoses [1]. Individual laboratory test results were identified by LOINC codes and standardized across lab providers. This study was approved by the Brigham and Women’s Hospital Institutional Review Board and signed data use agreements were in place.

Study cohort and exposure

From the data available, we established a cohort of all incident users of any statin (simvastatin, pravastatin, lovastatin, atorvastatin, rosuvastatin), Vytorin (simvastatin plus ezetimibe), or ezetimibe who were 18 years or older at the start of treatment. Incident use was established by requiring at least 12 months of insurance coverage before treatment and no use of any lipid-lowering therapy in those 12 months. All covariate information was assessed in the longitudinal healthcare claims over a covariate assessment period (CAP) starting 6 months before treatment initiation and ending on the day of dispensing of the index drug. Follow-up for occurrence of MI started 1 month after initiation of lipid-lowering treatment, a conservative assumption to allow for the biologic action of the medication to occur (Figure 2) [9]. We categorized each medication on the index date into high- and low-intensity treatment based on its ability to lower LDL levels (Table 1) [10].
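The eligibility rule and time windows described above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the function and constant names are hypothetical, and months are approximated as fixed numbers of days (365, 182, and 30), which the text does not specify.

```python
from datetime import date, timedelta

WASHOUT_DAYS = 365  # assumption: 12 months approximated as 365 days
CAP_DAYS = 182      # assumption: 6 months approximated as 182 days
LAG_DAYS = 30       # assumption: 1 month approximated as 30 days


def is_incident_user(index_date, birth_date, enrollment_start, prior_lipid_rx_dates):
    """Return True if the patient meets the incident-user criteria."""
    washout_start = index_date - timedelta(days=WASHOUT_DAYS)
    # Age in whole years on the index date.
    age = index_date.year - birth_date.year - (
        (index_date.month, index_date.day) < (birth_date.month, birth_date.day)
    )
    # Continuous coverage is simplified to an enrollment start date.
    enrolled = enrollment_start <= washout_start
    no_prior_use = not any(
        washout_start <= d < index_date for d in prior_lipid_rx_dates
    )
    return age >= 18 and enrolled and no_prior_use


def study_windows(index_date):
    """Return (CAP start, CAP end, follow-up start) for one patient."""
    return (
        index_date - timedelta(days=CAP_DAYS),
        index_date,
        index_date + timedelta(days=LAG_DAYS),
    )
```

A production version would check for gaps in enrollment rather than a single start date; that detail is omitted here for brevity.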

Figure 2

Incident user cohort study*. * The 6-month covariate assessment period (CAP) precedes the initiation of treatment. During the CAP we identified patient characteristics, including lab tests performed and lab test results available. After treatment start followed a 1-month lag period before events were attributed to the treatment. The arrows between prescriptions (Rx), diagnoses (Dx) and lab tests denote the fact that the temporality of events within the CAP was not considered in this study

Table 1 Definitions for high vs. low intensity lipid-lowering therapy

We defined two subgroups of patients with chronic conditions. Rheumatoid arthritis (RA) was defined as at least two outpatient diagnoses of RA in the CAP, or one hospital discharge diagnosis of RA in the CAP, or one diagnosis of RA plus dispensing of a disease-modifying anti-rheumatic drug. Diabetes (DM) was defined as at least two outpatient diagnoses of DM in the CAP, or one hospital discharge diagnosis of DM in the CAP, or one diagnosis of DM plus an insulin or oral antidiabetic dispensing. Patients with rheumatoid arthritis and diabetes were identified as subgroups with chronic conditions because these patients were likely to receive more lab tests at regular intervals than the typical patient initiating a statin. Patients with rheumatoid arthritis are of further interest in that they may receive care primarily from a specialist rather than an internist and therefore may have different patterns of laboratory use.
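The subgroup definitions share one rule, which a short sketch makes explicit. The function and parameter names are hypothetical; the counts are assumed to have been tallied from claims within the CAP beforehand.

```python
def has_chronic_condition(n_outpatient_dx: int, n_inpatient_dx: int,
                          has_drug_dispensing: bool) -> bool:
    """Apply the claims-based rule described for RA and diabetes.

    Counts refer to diagnoses recorded during the CAP; the drug flag means
    a condition-specific dispensing (e.g. a DMARD for RA, insulin or an
    oral antidiabetic for diabetes).
    """
    n_any_dx = n_outpatient_dx + n_inpatient_dx
    return (
        n_outpatient_dx >= 2                         # two outpatient diagnoses
        or n_inpatient_dx >= 1                       # one discharge diagnosis
        or (n_any_dx >= 1 and has_drug_dispensing)   # one diagnosis plus drug
    )
```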

Patient characteristics and lab test results

Patient characteristics and potential confounders assessed during the 6-month CAP included age (18–40; 41–64; 65+), sex, state of residence, insurance plan type (Health Maintenance Organization, Medicare Advantage, Medicare Supplemental, Preferred Provider Organization, Indemnity, other), number of physician visits, number of cardiologist visits, number of different drugs used [11], hospitalization in the 30 days prior to treatment initiation, hospitalization more than 30 days before treatment initiation, number of days hospitalized, number of outpatient lab tests ordered, hypercholesterolemia, hypertension, heart failure, myocardial infarction, coronary revascularization, peripheral vascular disease, peripheral arterial revascularization, TIA/stroke, carotid revascularization, pre-diabetes, diabetes, arthritis, COPD, oxygen canister use, and obesity. Clinical covariates were assessed based on the presence of ICD-9 diagnosis codes (see Additional file 1: Appendix Table S1) in administrative claims during the CAP. In this exploratory analysis, we included a wide range of clinical covariates frequently measured in claims-based studies.

Within the 6-month covariate assessment period we identified all recorded outpatient lab test results for 23 commonly performed lab tests, including lipid tests, HbA1c, and others (see Additional file 1: Appendix Table S2). Additionally, we used CPT-4 codes to identify all labs for which charges were claimed during the CAP. We chose to include 23 lab tests to increase the probability that patients would have multiple lab tests performed and that we would be able to assess whether lab values were missing at the patient level or the test level.

In comparative effectiveness research, as in other areas of clinical epidemiology, missing data are both common and problematic. Imputation of missing values may increase precision and validity of effect estimates. The imputation literature recommends including not only pre-exposure patient characteristics and treatment information in the prediction of missing values but also information on the outcome status [12]. In our example, outcomes of interest were the incidence of myocardial infarction (assessed with a positive predictive value of 94% [13]); hospitalization for acute coronary syndrome (ACS) that included a coronary revascularization procedure; stroke; and death attributed to any cause (see Additional file 1: Appendix Table S3). Follow-up time started 1 month after initiation of a cholesterol-lowering drug (Figure 2). Patients were censored at the time of discontinuation of the index drug, any of the outcomes, disenrollment, or study end (June 30, 2010), whichever came first.
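The "whichever came first" censoring rule amounts to taking the earliest of the candidate dates. The sketch below is illustrative only: the function name and reason labels are hypothetical, while the study end date is the one stated in the text.

```python
from datetime import date

STUDY_END = date(2010, 6, 30)  # study end as stated in the text


def follow_up_end(discontinuation=None, outcome=None, disenrollment=None):
    """Return (end date, reason) for the earliest applicable event.

    Events that did not occur are passed as None. On ties, the event listed
    first below wins; that tie-break is an arbitrary choice of this sketch.
    """
    candidates = [
        (discontinuation, "discontinuation"),
        (outcome, "outcome"),
        (disenrollment, "disenrollment"),
        (STUDY_END, "study end"),
    ]
    observed = [(d, reason) for d, reason in candidates if d is not None]
    return min(observed, key=lambda pair: pair[0])
```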


In this analysis, ascertainment of performing a lab test refers only to tests performed in the outpatient setting. We determined the proportion of patients who had at least one such lab test performed out of the 23 study lab tests and then focused on 3 specific cardiovascular risk markers: LDL, HDL, and HbA1c [14]. In sensitivity analyses, we extended the 6-month covariate assessment period to 9 and to 12 months in an effort to capture more lab test results.

In order to quantify differential lab test performance and result availability, we computed the number of lab tests performed (as measured by the presence of CPT-4 codes) and the proportion of those with test results available in the linked database. We then cross-tabulated these data with patient and system characteristics.
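A minimal sketch of such a cross-tabulation follows. It assumes one record per patient with a grouping characteristic, a claim flag, and a result flag; the function name and record layout are hypothetical rather than taken from the study.

```python
from collections import defaultdict


def crosstab_availability(records):
    """Summarize test performance and result availability by group.

    records: iterable of (group, has_claim, has_result) tuples.
    Returns {group: (n, share with a test performed,
                     share of tested patients with a result)}.
    The last share is None when no patient in the group was tested.
    """
    counts = defaultdict(lambda: [0, 0, 0])  # n, performed, with result
    for group, has_claim, has_result in records:
        c = counts[group]
        c[0] += 1
        if has_claim:
            c[1] += 1
            if has_result:
                c[2] += 1
    return {
        g: (n, performed / n, (with_result / performed) if performed else None)
        for g, (n, performed, with_result) in counts.items()
    }
```

Results recorded without a matching claim are ignored in this sketch; a fuller tabulation would count them separately.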

For each of the LDL, HDL, and HbA1c cardiovascular disease risk markers, any factors associated with a completed test were identified in a multivariate logistic regression that predicted whether the outpatient lab test was performed, as a function of the patient and system characteristics described above plus statin/Vytorin exposure and cardiovascular outcome status. We then determined overall sensitivity and specificity for the predicted probabilities of test performance and model c-statistics.
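As an illustration of the discrimination measures named above, the following sketch computes a c-statistic and sensitivity/specificity from predicted probabilities. It is not the study's code: the function names are hypothetical, and the 0.5 classification threshold is an assumption, since the text does not report the cut-off used.

```python
def c_statistic(y_true, y_score):
    """Probability that a random positive scores above a random negative.

    Ties count as 0.5. The pairwise loop is quadratic and is meant only
    for small illustrative inputs.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Return (sensitivity, specificity) at the given threshold.

    Assumes both classes are present; threshold=0.5 is an assumed default.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted = s >= threshold
        if y == 1 and predicted:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)
```

Here `y_true` would be the indicator of an outpatient test being performed and `y_score` the fitted probabilities from the logistic model.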

To explore the performance of imputation strategies, we fit linear regression models among the patients who had lab test results available to predict the actual LDL, HDL, and HbA1c values. In instances where patients received multiple tests, we used the value from the last test. We assumed normal distributions of test results as reasonable approximations, although the data were slightly skewed. We express the proportion of explained variation as the observed R² from the linear regression models.

Lastly, we investigated the relationship between the completion of a lab test, the availability of test results in our database, and whether the test results themselves differed between study exposure groups stratified by RA and diabetes.


Over the study period we identified 703,484 patients who met the study eligibility criteria and initiated lipid-lowering therapy with statins, ezetimibe, or a combination of both. Among those, 68% had a recorded charge for at least one of the 23 study lab tests in the 6 months before treatment (Table 2). This proportion increased to 72% if the covariate assessment period was extended to 9 months before treatment, and to 74% during a 12-month period. For patients with diabetes or RA the proportions were higher (80% during 6 months) but showed equally small increases if the covariate assessment period was extended (83% and 84%). For LDL and HbA1c tests, the proportion of patients with a recorded charge for at least one test during the 6 months before initiation of lipid-lowering therapy was about 60% and 17%, respectively. For patients with diagnosed diabetes, 68% had a charge for an HbA1c test (Table 2).

Table 2 Number of patients with at least one lab test performed and claimed among 703,484 initiators of statins and/or ezetimibe using 6, 9, and 12-month confounder assessment periods (CAPs)

Overall and regardless of having a test performed, the proportion of patients with any outpatient lab test results available in the linked database was about 30%, which was similar in patients with diabetes or RA. Lab test results for LDL or HDL were available for about 20% of patients during the 6 months before initiation of lipid-lowering therapy.

Table 3 shows whether any of 23 outpatient lab tests, including LDL, HDL, and HbA1c, were performed within the 6 months before initiating lipid-lowering therapy, cross-tabulated by a range of patient and health system characteristics. Overall, 481,133 (68%) of study patients had claims evidence of an outpatient lab test and 42% thereof had results available in the study data (29% of all patients). The proportion with at least one lab test performed varied substantially by patient characteristics, while test result availability varied little, and only for variables such as system characteristics and state of residence (Table 3).
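The overall proportions follow directly from the reported counts, as this worked check shows (rounded values only; no figures beyond those in the text are used):

```latex
\frac{481{,}133}{703{,}484} \approx 0.684 \approx 68\%,
\qquad
0.684 \times 0.42 \approx 0.287 \approx 29\%
```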

Table 3 Patients with lab test results reported in study population of 703,484 initiators of statins or ezetimibe using a 6-month covariate assessment period*

Having been hospitalized in the 30 days before the initiation of lipid-lowering treatment was negatively associated with receiving an outpatient test, likely because the relevant lab tests were performed during the hospitalization and as such do not appear as outpatient lab tests. Some patients hospitalized for acute coronary syndrome or MI may have received lipid-lowering therapy for secondary prevention without the need for a lab test. This is supported by the fact that patients with both recent MI and ACS had a lower than average proportion with at least one test performed (24% and 40% compared with an average of 68%). A code for hypercholesterolemia is frequently accompanied by an LDL test performed (80%), likely because the test order is accompanied by such a billing code.

Patients with Medicare Supplemental coverage (43,645) had a much lower proportion of claims for LDL tests performed (18%), and of those only 2% had results available. The lab test provider may not have included the secondary payer on the claim.

The two lab test providers that provided data to the insurer do not operate in some states; for example, the availability of lab test results in one state was as low as 2% for LDL. Such low recording would not be dependent directly on patient characteristics as it affects an entire state and is driven by factors other than health status, though clinically relevant patient characteristics have varying prevalences across states.

Some patients resided in states not primarily covered by the health plan studied, and are covered only via accounts for nationally operating businesses (e.g., if the employer is based in another state, all employees may be members of a health plan in that other state, rather than the state of residence). For these patients, the availability of LDL test results is less than 10%. One state (#12) stands out as having a small proportion of patients with an outpatient LDL test performed (24%), but a much larger proportion of patients have a result available (51%). In this state, a larger proportion of providers are under HMO capitation agreements. Within these plans, under-recording of tests performed may be the result of bundled payment arrangements; however, results are still forwarded by the lab test providers, resulting in the paradox of having more lab test results available in our database than tests performed as recorded in claims data. Among patients with diabetes or RA, we found fundamentally similar results. Among elderly patients, lab test results were more likely to be available among Medicare Advantage enrollees than among patients covered through Medicare Supplemental insurance (Table 4).

Table 4 Cross-tabulation between states and Medicare Advantage and Medicare Supplement status

Based on patient and system characteristics plus exposure and outcome status it was possible to predict with high sensitivity (97%) and specificity (94%) whether outpatient lab tests were performed in the 6 months before treatment initiation. The corresponding model c-statistics of the logistic regression models were between 0.89 and 0.93 (Figure 3), indicating a very high predictive capacity. Strong independent associates of having an outpatient LDL test performed were a diagnosis of hypercholesterolemia or obesity. Associates of a low probability of having an LDL test performed were recent hospitalization, carotid revascularization, and a diagnosis of RA. Being older than 65 also decreased the chance of an LDL lab test, likely because of test underreporting due to bundled payments. Initiating high-intensity lipid-lowering treatment and dying in the study follow-up period were correlates of not having an outpatient LDL, HDL, or HbA1c test performed. Not surprisingly, the strongest predictor of having an HbA1c test performed was a diagnosis of diabetes or pre-diabetes.

Figure 3

Associates of selected outpatient lab tests performed in patients initiating lipid-lowering treatment according to claims data in 703,484 patients from a logistic regression model (darker means stronger association)

Among the patients for whom LDL, HDL, or HbA1c test levels were available, we then attempted to predict the actual lab levels based on their recorded patient and system characteristics. Using all observed factors described above, only 14% of the variation could be explained (Figure 4). Young age was the strongest correlate of higher LDL (+20 mg/dl) and HbA1c (+0.5%) levels, suggesting that at younger ages initiation of lipid-lowering therapy was driven more by lab test results, i.e. primary prevention, while at older ages past coronary events and other risk factors were the triggers for statin initiation despite lower LDL levels (−17 mg/dl).

Figure 4

Correlates of selected lab test results among patients with lab test results available (darker means stronger correlations)

Higher intensity of lipid-lowering treatment generally was correlated with a lower proportion of outpatient LDL tests performed, a lower fraction of LDL test results available in the database, and lower LDL serum levels (Table 5). For example, among high-dose simvastatin initiators (>40 mg/day), 52% had an outpatient LDL test performed before treatment start (63% for lower-dose simvastatin). Of those patients, 37% had a test result available (42% for lower-dose simvastatin), and the mean LDL serum level was 135.6 mg/dl compared to 147.3 mg/dl for patients started on low-intensity simvastatin. Mean LDL levels were generally lower in patients with diabetes who initiated lipid-lowering therapy.

Table 5 LDL tests performed and LDL test results available by lipid lowering treatment in patients initiating lipid-lowering therapy, including patient subgroups with diabetes (DM) or rheumatoid arthritis (RA)


We studied the characteristics of laboratory test information in a pharmacoepidemiologic research data source that enriches longitudinal claims data with outpatient lab test results data, which makes it possible to better adjust for biomarkers of cardiac risk in comparative effectiveness studies. In an example cohort study of 703,484 patients initiating various lipid-lowering therapies, 68% of patients had at least one of a set of 23 study lab tests performed in the 6 months before treatment, and 42% of those had test results available. LDL test results were available for 24% of statin initiators, a non-trivial level of missingness that needed to be addressed in order to preserve the validity and generalizability of findings. Missingness due to absence of lab tests being performed followed a complex pattern that is largely explained by hospitalization, clinical practice guidelines which differ for primary and secondary prevention of coronary heart disease, and by some health care system characteristics.

Several key points regarding these patterns arose and have implications for conducting comparative effectiveness research studies in such enriched data sources.

Operational aspects

A covariate assessment period of 6 months was sufficient to capture the majority of outpatient lab tests performed. Extending the period to 9 and 12 months, and thus extending the required pre-exposure enrollment period, provided few additional observed lab tests but may disproportionately reduce the cohort size if working with health plans that have high enrollee turnover rates. Patients with existing chronic conditions like RA or diabetes have more outpatient lab tests available if their healthcare providers monitor them more closely. Recent hospitalizations strongly decreased the number of outpatient lab tests. It is likely that tests were performed during the hospitalization, and if the test results were available to the patient's primary care physician, repeat testing may not have been required for some time after discharge.

Selectiveness of lab tests performed

Patients with risk factors for cardiovascular events were less likely to have lab tests performed. Many patients with these characteristics receive lipid-lowering treatment as secondary prevention, which is initiated independent of serum lipid levels more frequently than is primary prevention. Indeed, treatment guidelines in place since the late 1990s recommend that patients with a major cardiac event should be treated with lipid-lowering medications [15, 16]. In a prior study, patients who initiated high-intensity lipid-lowering treatment were less likely to have had an outpatient lab test performed [17]. Because the presence of preexisting cardiac risk factors is both a strong predictor of future events and a predictor of missing data on lipid levels, disregarding the missing information can be expected to bias findings of non-randomized comparative effectiveness research in this setting.

Selectiveness of lipid lab test results available

Among patients who had lab test results available, those who were subsequently initiated on higher-intensity lipid-lowering treatment were more likely to have lower lipid serum levels. This finding is again compatible with clinical practice and trial findings that patients with acute coronary events (who are less likely to have outpatient lipid tests available) should be treated with high-intensity statins largely independent of their lipid levels [17, 18]. System factors like state of residence and insurance plan type, particularly supplemental insurance, may substantially influence the availability of test results. However, since those factors are less likely to be systematically related to health outcomes, it is unlikely that they will act as major confounding factors in comparative effectiveness studies.

In addition to these limitations regarding lab tests, baseline clinical conditions may be under-reported through claims data in some patients, particularly when using a short ascertainment period, such as the 6-month period we used. For LDL, HDL, and HbA1c tests, it is unlikely that point-of-care testing, which the lab test provider chain would not record, would be performed. However, other tests, like INR, urine analyses, and creatinine levels, might be subject to this additional limitation.

If replicated in other patient populations and datasets, these findings have important implications for CER studies. The complexity of the nature of missingness that logically follows from clinical practice and the reality of our health care system requires the inclusion of a wide variety of patient and system characteristics in order to model the missing data structure. In our specific example the combination of primary and secondary prevention with lipid-lowering medications seems to complicate the prediction of missing values, but in the end is likely a reason why we could differentiate so well between patients who have an outpatient LDL test performed versus not (Figure 5). Even among patients whose outpatient lab test results were available, we had only limited ability to predict the exact lipid/HbA1c serum level. The resulting mismeasurement of imputed lab test results suggests that imputation of test results would provide only limited additional confounding control. However, estimation precision would be increased because the analyzable population would more than triple in our example study.

Figure 5

Ability of longitudinal claims data to predict whether a lab test was performed, a test result was available, and the actual serum level for three biomarkers of cardiovascular risk*. * c-statistics were computed from multivariate logistic regression models including patient factors measured during 6 months before lipid-lowering treatment initiation; R² values were computed only among patients who had a lab test result available, from linear regression including patient factors measured during 6 months before lipid-lowering treatment initiation, actual treatment choice, as well as cardiovascular events and death during follow-up

It is likely that the specific patterns of missingness of outpatient lab test results will vary depending on the clinical scenario, health care practice, and system constraints. It is encouraging that despite the non-random missingness we were able to predict quite well who would and would not receive a lab test result, which is a good starting point for addressing this issue. However, our difficulty in predicting actual lab values is a challenge to incorporating lab data through imputation or weighting approaches in comparative effectiveness research studies.


In a claims database linked with outpatient lab test results, we found that lab tests are performed selectively depending on patient risk factors and corresponding to current treatment guidelines. Poor ability to predict lab values and the high proportion of missingness reduce the added value of lab tests for effectiveness research in this setting.


Dr. Schneeweiss is Principal Investigator of the Brigham and Women’s Hospital DEcIDE Center on Comparative Effectiveness Research and the DEcIDE Methods Center, both funded by AHRQ, and of the Harvard-Brigham Drug Safety and Risk Management Research Center, funded by FDA. Dr. Schneeweiss is a paid consultant to WHISCON and Booz & Co, and he is principal investigator of investigator-initiated grants to the Brigham and Women’s Hospital from Pfizer, Novartis, and Boehringer-Ingelheim. Dr. Daniel was and Dr. Singer is employed by HealthCore, a subsidiary of WellPoint.


  1. Schneeweiss S, Avorn J: A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005, 58: 323-337. 10.1016/j.jclinepi.2004.10.012.

  2. Seeger JD, Walker AM, Williams PL, Saperia GM, Sacks FM: A propensity score-matched cohort study of the effect of statins, mainly fluvastatin, on the occurrence of acute myocardial infarction. Am J Cardiol. 2003, 92: 1447-1451. 10.1016/j.amjcard.2003.08.057.

  3. Mulla ZD, Seo B, Kalamegham R, Nuwayhid BS: Multiple imputation for missing laboratory data: an example from infectious disease epidemiology. Ann Epidemiol. 2009, 19: 908-914. 10.1016/j.annepidem.2009.08.002.

  4. Greenland S, Finkle WD: A critical look at methods for handling missing covariates in epidemiologic regression analyses. Am J Epidemiol. 1995, 142: 1255-1264.

  5. Vach W, Blettner M: Biased estimation of the odds ratio in case–control studies due to the use of ad hoc methods of correcting for missing values for confounding variables. Am J Epidemiol. 1991, 134: 895-907.

  6. Neri L, Rocca Rey LA, Lentine KL, et al: Joint association of hyperuricemia and reduced GFR on cardiovascular morbidity: a historical cohort study based on laboratory and claims data from a national insurance provider. Am J Kidney Dis. 2011, 58: 398-408. 10.1053/j.ajkd.2011.04.025.

  7. Laitinen DL, Manthena S: Impact of change in high-density lipoprotein cholesterol from baseline on risk for major cardiovascular events. Adv Ther. 2010, 27: 233-244. 10.1007/s12325-010-0019-4.

  8. McCullough E, Sullivan C, Banning P, Goldfield N, Hughes J: Challenges and benefits of adding laboratory data to a mortality risk adjustment method. Qual Manag Health Care. 2011, 20: 253-262.

  9. Schneeweiss S: A basic study design for expedited safety signal evaluation based on electronic healthcare data. Pharmacoepidemiol Drug Saf. 2010, 19: 858-868. 10.1002/pds.1926.

  10. Choudhry NK, Levin R, Winkelmayer WC: Statins in elderly patients with acute coronary syndrome: an analysis of dose and class effects in typical practice. Heart. 2007, 93: 945-951. 10.1136/hrt.2006.110197.

  11. Schneeweiss S, Seeger JD, Maclure M, Wang PS, Avorn J, Glynn RJ: Performance of comorbidity scores to control for confounding in epidemiologic studies using claims data. Am J Epidemiol. 2001, 154: 854-864. 10.1093/aje/154.9.854.

  12. Moons KG, Donders RA, Stijnen T, Harrell FE: Using the outcome for imputation of missing predictor values was preferred. J Clin Epidemiol. 2006, 59: 1092-1101. 10.1016/j.jclinepi.2006.01.009.

  13. Kiyota Y, Schneeweiss S, Glynn RJ, Cannuscio CC, Avorn J, Solomon DH: Accuracy of Medicare claims-based diagnosis of acute myocardial infarction: estimating positive predictive value on the basis of review of hospital records. Am Heart J. 2004, 148: 99-104. 10.1016/j.ahj.2004.02.013.

  14. Kannel WB, Wilson PW: Efficacy of lipid profiles in prediction of coronary disease. Am Heart J. 1992, 124: 768-774. 10.1016/0002-8703(92)90288-7.

  15. Braunwald E, Antman EM, Beasley JW, et al: ACC/AHA guidelines for the management of patients with unstable angina and non-ST-segment elevation myocardial infarction: executive summary and recommendations. A report of the American College of Cardiology/American Heart Association task force on practice guidelines (committee on the management of patients with unstable angina). Circulation. 2000, 102: 1193-1209. 10.1161/01.CIR.102.10.1193.

  16. Kushner FG, Hand M, Smith SC, et al: 2009 Focused Updates: ACC/AHA Guidelines for the Management of Patients With ST-Elevation Myocardial Infarction (updating the 2004 Guideline and 2007 Focused Update) and ACC/AHA/SCAI Guidelines on Percutaneous Coronary Intervention (updating the 2005 Guideline and 2007 Focused Update): a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines. Circulation. 2009, 120: 2271-2306. 10.1161/CIRCULATIONAHA.109.192663.

  17. Cannon CP, Braunwald E, McCabe CH, et al: Intensive versus moderate lipid lowering with statins after acute coronary syndromes. N Engl J Med. 2004, 350: 1495-1504. 10.1056/NEJMoa040583.

  18. de Lemos JA, Blazing MA, Wiviott SD, et al: Early intensive vs a delayed conservative simvastatin strategy in patients with acute coronary syndromes: phase Z of the A to Z trial. JAMA. 2004, 292: 1307-1316. 10.1001/jama.292.11.1307.



Funded by HealthCore Inc. through the Brigham-HealthCore Methods Development Collaboration, the National Library of Medicine (RO1-LM010213), the National Center for Research Resources (RC1-RR028231), and the National Heart, Lung, and Blood Institute (RC4-HL106373). Dr. Rassen was supported by a career development award from AHRQ (K01-HS018088).

Several people contributed to our understanding of the data sources and helped interpret the data: Marcus Wilson, PharmD; Peter Wahl, MA, MSCE; and Niteesh K. Choudhry, MD, PhD.

Author information



Corresponding author

Correspondence to Sebastian Schneeweiss.

Additional information

Competing interests

Dr. Schneeweiss is Principal Investigator of the Brigham and Women’s Hospital DEcIDE Center on Comparative Effectiveness Research and the DEcIDE Methods Center, both funded by AHRQ, and of the Harvard-Brigham Drug Safety and Risk Management Research Center, funded by FDA. Dr. Schneeweiss is a paid consultant to WHISCON LLC and Booz & Co, and he is principal investigator of investigator-initiated grants to the Brigham and Women’s Hospital from Pfizer, Novartis, and Boehringer-Ingelheim that are unrelated to the topic of this study.

Authors’ contributions

SS and JAR conceived the idea through their interest in improving confounding adjustment by adding information from electronic medical records to claims databases. All authors contributed to the design, analysis, and interpretation. All authors critically reviewed the drafts and approved the final version.


Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Schneeweiss, S., Rassen, J.A., Glynn, R.J. et al. Supplementing claims data with outpatient laboratory test results to improve confounding adjustment in effectiveness studies of lipid-lowering treatments. BMC Med Res Methodol 12, 180 (2012).



  • Insurance claims data
  • Laboratory test results
  • Serum lipid levels
  • Confounding
  • Imputation
  • Pharmacoepidemiology
  • Lipid lowering therapy
  • Statin
  • Ezetimibe