  • Research article
  • Open access

Validation of a case definition to define chronic dialysis using outpatient administrative data



Administrative health care databases offer an efficient and accessible, though as-yet unvalidated, approach to studying outcomes of patients with chronic kidney disease and end-stage renal disease (ESRD). The objective of this study is to determine the validity of outpatient physician billing derived algorithms for defining chronic dialysis compared to a reference standard ESRD registry.


A cohort of incident dialysis patients (Jan. 1 - Dec. 31, 2008) and prevalent chronic dialysis patients (Jan. 1, 2008) was selected from a geographically inclusive ESRD registry and administrative database. Four administrative data definitions were considered: at least 1 outpatient claim, at least 2 outpatient claims, at least 2 outpatient claims at least 90 days apart, and continuous outpatient claims at least 90 days apart with no gap in claims greater than 21 days. Measures of agreement of the four administrative data definitions were compared to a reference standard (ESRD registry). Basic patient characteristics were compared between all 5 patient groups.


A total of 1,118,097 individuals formed the overall population and 2,227 chronic dialysis patients were included in the ESRD registry. The three definitions requiring at least 2 outpatient claims resulted in kappa statistics between 0.60 and 0.80, indicating "substantial" agreement. "At least 1 outpatient claim" resulted in "excellent" agreement with a kappa statistic of 0.81.


Of the four definitions, the simplest (at least 1 outpatient claim) performed comparably to the other definitions. One limitation of this work is that the billing codes used were developed in Canada; however, other countries use similar billing practices, and thus the codes could easily be mapped to other systems. Our reference standard ESRD registry may not capture all dialysis patients, resulting in some misclassification, but because the registry is linked to ongoing care this is likely to be minimal. The definition utilized will vary with the research objective.



The global prevalence of end-stage renal disease (ESRD) requiring treatment with dialysis or kidney transplantation continues to increase [1, 2]. Patients with ESRD experience far greater morbidity, mortality and health care costs than members of the general population, and studies evaluating health outcomes in this high-risk population are required worldwide [1, 2].

Administrative health care databases offer an efficient and accessible approach to studying outcomes in large populations [3]. Physician billing claims data are one data source for identifying cases of ESRD because they are routinely collected for physician reimbursement, often span wide geographic areas, and have the potential to capture both in-hospital and outpatient encounters within a healthcare system [4]. However, before such data sources can be widely adopted for use in research where identification of cases of ESRD is critical, the validity of the algorithms used as case definitions for ESRD requires evaluation.

Limited data demonstrate the validity of administrative data algorithms for identifying patients requiring chronic hemodialysis or peritoneal dialysis. Prior studies have assessed acute kidney injury [5–7], as well as the validity of using inpatient administrative data to identify chronic dialysis patients [8–13]. The two previous studies considering chronic dialysis in the outpatient setting have considered diagnostic codes [14, 15], not procedural codes as are considered in this study. This is of particular importance as the majority of contemporary ESRD patients receive chronic dialysis as outpatients. We therefore conducted this study to determine the validity of algorithms derived from outpatient physician billing claims for defining chronic dialysis, compared to the reference standard of an ESRD registry.


Study Population

A cohort was identified from the Alberta Kidney Disease Network (AKDN) laboratory database to form the study population. The AKDN is a prospective data collection initiative of routine laboratory tests on all patients in the province of Alberta, Canada (population approx. 3 million), resulting in a population-based, geographically inclusive database [16]. Patients identified from laboratory data are followed prospectively with linkage to administrative and other computerized sources to obtain detailed information including socio-demographic data, clinical data including comorbidities, health care encounters, health care costs, death, and kidney-related outcomes. The study cohort included patients aged 18 and older who had at least 1 outpatient serum creatinine measurement between Jan 1 2008 and Dec 31 2008. Although a general population cohort would be optimal, our selected study population introduces minimal, if any, bias, as anyone "at-risk" of ESRD or evaluated for or receiving chronic dialysis was expected to have received a serum creatinine measurement as part of their routine clinical assessment.

Data sources

Patients treated for ESRD in Alberta are cared for by the Northern Alberta (NARP) and Southern Alberta (SARP) Renal Programs [17]. These programs are responsible for providing ESRD care including chronic dialysis within their geographic area. Each program maintains a prospective patient registry of all chronic dialysis patients, and captures detailed demographic and clinical data, including date of initiation of dialysis. Patients are enrolled at the time of first dialysis for ESRD (first hemodialysis session or first flushing of peritoneal dialysis catheter), or, for patients who initiate dialysis for acute kidney injury, when the attending nephrologist deems that dialysis will be chronic. The NARP and SARP registries were used to identify prevalent and incident dialysis patients from January 1, 2008 to December 31, 2008 (considered the reference standard). Prevalent cases were first identified on Jan 1 1999, with additional incident dialysis patients identified from that date forward. Non-Alberta residents were excluded.

Physicians in Alberta submit claims for reimbursement of services to Alberta Health and Wellness, the provincial health ministry and the universal health care provider for the province of Alberta. Claims are stored in a database that contains the patient's personal health number, a unique physician identifier, up to 3 ICD-9 diagnosis codes and 1 procedure code. Procedure codes are captured using the Canadian Classification of Diagnostic, Therapeutic and Surgical Procedures (CCP), which was developed by Statistics Canada to accompany the International Classification of Diseases version 9 (ICD-9) [18]. Physician claims capture all of the outpatient physician services and the majority of the inpatient services. All chronic dialysis patients in the province of Alberta are cared for by nephrologists, who are compensated using either a fee-for-service or a salaried model. Regardless of compensation method, physicians are required to submit claims for all patient encounters.

Defining Chronic Dialysis using Administrative Data

We identified all patients with outpatient dialysis physician claims (Table 1) occurring from Jan 1 2008 to Dec 31 2008. We evaluated 4 different case definitions for chronic dialysis patients based on varying the number and timing of physician claims for dialysis: 1) at least 1 outpatient claim, 2) at least 2 outpatient claims, 3) at least 2 outpatient claims at least 90 days apart, and 4) continuous outpatient claims at least 90 days apart with no gap in claims greater than 21 days. We evaluated algorithms employing a 90-day period of claims to be congruent with other current administrative data definitions developed using inpatient data [19, 20].
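The four case definitions can be illustrated with a minimal Python sketch applied to one patient's dialysis claim dates. This is not the code used in the study (the analyses were run in SAS). The function names and the example dates are hypothetical, and the reading of definition 4 as "a run of claims spanning at least 90 days in which no two consecutive claims are more than 21 days apart" is an assumption about how that definition was operationalized.

```python
from datetime import date, timedelta


def longest_continuous_span(dates, max_gap_days):
    """Longest span (days) covered by a run of claims with no gap > max_gap_days."""
    if not dates:
        return 0
    best = 0
    run_start = dates[0]
    for prev, cur in zip(dates, dates[1:]):
        if (cur - prev).days > max_gap_days:
            run_start = cur  # gap too long: start a new run at this claim
        best = max(best, (cur - run_start).days)
    return best


def classify_claims(claim_dates, span_days=90, max_gap_days=21):
    """Return, for one patient, which of the four case definitions are met."""
    dates = sorted(claim_dates)
    n = len(dates)
    total_span = (dates[-1] - dates[0]).days if n >= 2 else 0
    return {
        "at_least_1_claim": n >= 1,
        "at_least_2_claims": n >= 2,
        # Assumption: first and last claim are at least span_days apart
        "two_claims_90_days_apart": total_span >= span_days,
        # Assumption: some gap-free run of claims covers at least span_days
        "continuous_claims": longest_continuous_span(dates, max_gap_days) >= span_days,
    }


# Hypothetical patients (illustrative dates, not study data)
weekly = [date(2008, 1, 1) + timedelta(days=7 * k) for k in range(14)]
sparse = [date(2008, 1, 1), date(2008, 4, 10)]
print(classify_claims(weekly))
print(classify_claims(sparse))
```

Under these assumptions, the patient with weekly claims meets all four definitions, whereas the patient with two claims 100 days apart meets definitions 1 to 3 but not the continuous-claims definition.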

Table 1 Administrative billing codes used to define outpatient chronic dialysis

Comorbidities and other outcomes

Demographic data were determined from the provincial administrative data files. Diabetes mellitus and hypertension were identified from hospital discharge records and physician claims based on validated algorithms [4, 21]. The Charlson comorbidities were calculated using the validated algorithms applied to physician claims and hospitalization data [22, 23]. Any comorbidity identified during the 3-year period prior to cohort entry was included. To ascertain death, patients were followed up from their start date of dialysis, defined either by the first recorded date in the registry or the date of the second of the outpatient claims when the administrative data definition was used, until March 31, 2009 to ensure a minimum of 90 days of follow-up for all patients. Patients who met the case definition and subsequently died or moved out of the province (lost to follow-up) were included in analyses.

Statistical Analysis

Basic descriptive statistics were used to describe demographic features and comorbidities for the overall cohort and the NARP/SARP dialysis cohort. Table 2 outlines the analytic framework adopted. We subsequently calculated positive agreement, sensitivity, and positive predictive value (PPV) for each case definition, using the NARP/SARP registry data as the reference standard [24]. Positive agreement is the conditional probability that, given the reference standard is positive, the administrative data definition is also positive [25]. Thus, the positive agreement will explore if there is an imbalance between the likelihood of agreeing on positive and negative cases. The kappa statistic was used to assess overall agreement between the registry and the billing data. Landis and Koch categorize kappa into five categories: less than 0.2 indicating "poor agreement", 0.21 to 0.40 indicating "fair agreement", 0.41 to 0.60 indicating "moderate agreement", 0.61 to 0.80 indicating "substantial agreement" and 0.81 to 1.00 indicating "near perfect agreement" [26]. We did not report specificity, negative agreement, or negative predictive value (NPV), as the large size of the non-diseased population (n = 1.11 million) and low incidence of ESRD in the general population make these measures insensitive to changes in the case definitions. SAS version 9.2 was used for all analyses. Ethics approval was obtained from the Conjoint Health Research Ethics Board at the University of Calgary.
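As an illustration of these measures, the following Python sketch computes them from the four cells of a 2x2 table (claims definition versus registry). It is a sketch under stated assumptions rather than the study's SAS code: the function name is hypothetical, the counts are invented for illustration, and positive agreement is computed in the proportion-of-specific-agreement form 2a / (2a + b + c), which is assumed here to be the form intended by the cited framework.

```python
def agreement_measures(tp, fp, fn, tn):
    """Agreement between a claims-based definition and a reference standard.

    tp: positive on both; fp: claims positive, registry negative;
    fn: claims negative, registry positive; tn: negative on both.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance-expected agreement from the marginal totals
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        # Assumed form: proportion of specific (positive) agreement
        "positive_agreement": 2 * tp / (2 * tp + fp + fn),
        "kappa": (observed - expected) / (1 - expected),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for a rare condition (not study data)
base = agreement_measures(tp=80, fp=20, fn=20, tn=99_880)
more_fp = agreement_measures(tp=80, fp=40, fn=20, tn=99_860)
for name in ("sensitivity", "ppv", "positive_agreement", "kappa", "specificity"):
    print(f"{name}: {base[name]:.4f} -> {more_fp[name]:.4f}")
```

In this invented example, doubling the false positives lowers the PPV from 0.80 to about 0.67, while specificity moves only from about 0.9998 to 0.9996, which shows why measures dominated by the true-negative cell say little about differences between case definitions when the condition is rare.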

Table 2 Reported measures of agreement: analytic framework


In total, 1,118,097 individuals had at least 1 outpatient serum creatinine measurement from Jan 1 2008 to Dec 31 2008. During that period 2,227 chronic dialysis patients (0.20% of the total study population) were registered in the ESRD registry. Table 3 presents the baseline characteristics of the overall population, the reference standard dialysis cohort and the cohort resulting from each of the administrative data definitions. The characteristics of the overall cohort are similar to the general Alberta population [27]. As expected, the dialysis cohort was older (64.0 vs. 52.6 y), had a higher prevalence of diabetes (54.5% vs. 12.7%) and hypertension (89.0% vs. 34.7%), and had a higher burden of comorbid disease (median number of Charlson comorbidities 3 vs. 0) compared to the total population. As the administrative data definition became more restrictive, the cohort became slightly older with a moderately higher burden of diabetes and hypertension.

Table 3 Baseline cohort characteristics

The chronic dialysis case definitions based on 1 outpatient claim and 2 outpatient claims resulted in similar prevalence estimates to the reference standard (0.21% and 0.19% respectively). The other two definitions, incorporating claims spanning 90 days, underestimated the prevalence (Table 4). The positive agreement was highest when the definition using 2 outpatient claims was considered. The four coding algorithms for dialysis resulted in sensitivities ranging from 0.58 (continuous outpatient claims) to 0.81 (at least 1 outpatient claim). The PPVs ranged from 0.77 (at least 1 outpatient claim) to 0.86 (continuous outpatient claims). The three definitions requiring at least 2 outpatient claims resulted in kappa statistics between 0.60 and 0.80, indicating "substantial" agreement [26]. "At least 1 outpatient claim" resulted in "excellent" agreement with a kappa statistic of 0.81; however, given the size of the true negative population, this must be interpreted with caution [24].
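The link between sensitivity, PPV and the prevalence estimates above can be checked with a short calculation. Because the number of claims-positive patients equals the true positives divided by the PPV, and the true positives equal sensitivity times the number of registry cases, the prevalence implied by a definition is roughly the registry prevalence multiplied by sensitivity / PPV. The Python sketch below applies this to the rounded values reported in the text. The function name is hypothetical and, because the inputs are rounded, the outputs are approximations rather than the study's exact figures.

```python
REFERENCE_CASES = 2_227   # chronic dialysis patients in the registry
POPULATION = 1_118_097    # overall study population


def implied_prevalence(sensitivity, ppv,
                       reference_cases=REFERENCE_CASES,
                       population=POPULATION):
    """Approximate prevalence produced by a claims definition."""
    true_positives = sensitivity * reference_cases
    claims_positive = true_positives / ppv  # all patients flagged by the definition
    return claims_positive / population


print(f"registry: {100 * REFERENCE_CASES / POPULATION:.2f}%")
for label, se, ppv in [("at least 1 claim", 0.81, 0.77),
                       ("continuous claims", 0.58, 0.86)]:
    print(f"{label}: {100 * implied_prevalence(se, ppv):.2f}%")
```

With these rounded inputs the sketch gives about 0.20% for the registry, about 0.21% for "at least 1 outpatient claim" and about 0.13% for continuous claims, consistent with the reported pattern that the simplest definition matches the reference prevalence while the 90-day definitions underestimate it.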

Table 4 Validity of physician billing chronic dialysis case definitions compared with reference standard registry case definition


All four physician claims-based case definitions assessed resulted in "substantial" agreement with our reference standard registry definition for chronic dialysis. One outpatient claim for dialysis was the most sensitive definition, while more complicated definitions exhibited modest increases in positive predictive value. The optimal administrative data definition may vary with the research objective. For example, when seeking to maximize identification of dialysis as an outcome an approach based on at least 1 outpatient claim may be preferable. In contrast, when establishing a cohort of patients with ESRD receiving chronic dialysis that includes the fewest non-diseased cases being captured, the use of continuous outpatient claims may be better suited.

Some of the discrepancies between our registry and physician claims algorithms for chronic dialysis likely relate to differences in the classification of patients who receive temporary dialysis or who die soon after initiating dialysis. Traditionally, administrative algorithms and national registries, such as the USRDS, have required a 90-day timeframe to define chronic dialysis [19, 20]. Although this approach avoids identification of patients who receive temporary dialysis then recover renal function within 3 months, it introduces survivor bias and does not capture chronic dialysis patients who begin dialysis but die before meeting the inclusion criteria of the definition. Our study demonstrates that approaches based on 1 or 2 outpatient dialysis claims are substantially more sensitive than definitions based on 90 days of claims, although these definitions may include some patients who would not be classified as receiving chronic dialysis in a registry (false positive cases). Utilizing a definition that does not require the patient to survive a certain amount of time eliminates this potential survivor bias and allows studies of the patient group that begins dialysis and dies soon after. However, the limitation of such a definition is that it may also include patients with acute kidney injury requiring dialysis for a short period who subsequently recover their renal function and no longer require dialysis. Furthermore, estimates of disease incidence and outcomes will not be comparable to studies based on most existing national registries.

Establishing the validity of an outpatient administrative data definition for chronic dialysis will allow researchers to utilize physician billing claims data to assess outcomes and form cohorts. This is of international relevance, even in countries where established dialysis registries are available. In the United States, not all researchers have the means to access the USRDS. Registries in other countries often provide only cross-sectional, regional data with limited outcomes. Thus, validated methods for identifying chronic dialysis patients using billing claims data would be useful in health services research.

We found that the use of physician claims data resulted in the classification of patients as receiving dialysis who were not identified as such in our registry (false positives). Most of these patients were removed from the case definition when algorithms that required claims to span 90 days were used. This is in keeping with the hypothesis that these events may be acute kidney injury cases or patients who were initiated on dialysis but subsequently recovered renal function; i.e., those not considered chronic dialysis patients and thus not captured in the registry. We also found that physician claims failed to identify some patients captured in the registry (false negatives). As Alberta Health and Wellness does not employ any formal quality assurance or correction process, this may be due to missed billings, billing errors, billings made by physicians on alternative payment plans (shadow billing) or miscoding present in administrative data sources, as the number of such patients decreased when algorithms that required less intensive physician claims were employed.

To our knowledge, this is the first study to look at using outpatient administrative data sources using procedure codes to define chronic dialysis. Others have developed algorithms for acute kidney injury and chronic kidney disease using inpatient administrative data [5–13]. Given that the majority of chronic dialysis patients are treated in the outpatient setting, administrative data algorithms limited to inpatient encounters are likely to perform poorly when compared against a reference standard. Three previous studies have included outpatient claim data [14, 15, 28]. However, Kern et al. excluded chronic dialysis patients, focusing on the validity of administrative data to define chronic kidney disease defined by eGFR <60 ml/min/1.73 m² [28]. Neither Weintraub et al. nor Wilchesky et al. included procedural codes [14, 15]. Their work was limited to ICD-9-CM diagnosis codes for chronic renal failure. Thus, our study is novel, and could facilitate further health services research in a high-risk population with ESRD who experience very high morbidity, mortality, and health care costs.

Our study does have several limitations. First, the billing codes used are from the Canadian Classification of Diagnostic, Therapeutic and Surgical Procedures (CCP); a classification system developed and applied in Canada. However, most countries have similar billing practices and billing codes that could be mapped to the CCP codes. Second, we used a provincial registry of all chronic dialysis patients as the reference standard. Although this registry is geographically inclusive, some dialysis patients may be omitted from the registry in error, thereby resulting in misclassification. However, as this registry is linked to ongoing dialysis treatment, the number of patients not registered is expected to be small. Third, our study did not distinguish between dialysis modalities (hemodialysis versus peritoneal dialysis, or in-centre versus home dialysis), and the accuracy of patient registry and physician claims in these settings may vary. However, prior research has reported limitations in the accuracy of administrative data for identifying the timing of changes between dialysis modalities suggesting that administrative data sources may be better suited to the general identification of patients receiving chronic dialysis rather than a specific modality [29].


We found that outpatient physician claims identified patients receiving chronic dialysis with "substantial" agreement to a reference standard dialysis registry definition. The use of 1 or 2 outpatient claims was most sensitive but had modestly lower positive predictive value than claims spanning 90 days or continuous claims. Given the variation in the way clinicians, researchers, and research tools define chronic dialysis, the optimal physician claims-based definition will vary with the research objective.


  1. Levey A, Atkins R, Coresh J, et al: Chronic kidney disease as a global public health problem: approaches and initiatives - a position statement from Kidney Disease Improving Global Outcomes. Kidney Int. 2007, 72: 247-59. 10.1038/

  2. National Kidney Foundation: K/DOQI clinical practice guidelines for chronic kidney disease: evaluation, classification and stratification. Am J Kidney Dis. 2002, 39: S1-266. 10.1016/S0272-6386(02)70081-4.

  3. Needham D, Scales D, Laupacis A, Pronovost P: A systematic review of the Charlson comorbidity index using Canadian administrative databases: a perspective on risk adjustment in critical care research. Journal of Critical Care. 2005, 20: 12-9. 10.1016/j.jcrc.2004.09.007.

  4. Quan H, Khan N, Hemmelgarn B, Tu K, Chen G, Campbell N, et al: Validation of a case definition to define hypertension using administrative data. Hypertension. 2009, 54: 1423-8. 10.1161/HYPERTENSIONAHA.109.139279.

  5. Juurlink D, Preyra C, Croxford R, Chong A, Austin P, Tu J, et al: Canadian Institute for Health Information Discharge Abstract Database: A validation study. Institute for Clinical Evaluative Sciences. 2006.

  6. Liangos O, Wald R, O'Bell J, Price L, Pereira B, Jaber B: Epidemiology and outcomes of acute renal failure in hospitalized patients: a national survey. Clin J Am Soc Nephrol. 2006, 1: 43-51. 10.2215/CJN.00220605.

  7. Waikar S, Wald R, Chertow G, Curhan G, Winkelmayer W, Liangos O, et al: Validity of International Classification of Diseases, Ninth Revision, Clinical Modification codes for acute renal failure. Journal of the American Society of Nephrology. 2006, 17 (6): 1688-94. 10.1681/ASN.2006010073.

  8. Humphries K, Rankin J, Carere R, Buller C, Kiely F, Spinelli J: Comorbidity data in outcomes research: are clinical data derived from administrative databases a reliable alternative to chart review? Journal of Clinical Epidemiology. 2000, 53 (4): 343-9.

  9. Lee D, Donovan L, Austin P, Gong Y, Liu P, Rouleau J, et al: Comparison of coding of heart failure and comorbidities in administrative and clinical data for use in outcomes research. Med Care. 2005, 43 (2): 182-8. 10.1097/00005650-200502000-00012.

  10. Parker J, Li Z, Damberg C, Danielson B, Carlisle D: Administrative versus clinical data for coronary artery bypass graft surgery report cards: the view from California. Med Care. 2006, 44: 687-95. 10.1097/01.mlr.0000215815.70506.b6.

  11. Quan H, Parsons G, Ghali W: Validity of procedure codes in International Classification of Diseases, 9th Revision, Clinical Modification administrative data. Med Care. 2004, 42 (8): 801-9. 10.1097/01.mlr.0000132391.59713.0d.

  12. Romano P, Remy L, Luft H: Second Report of the California Hospital Outcomes Project: Acute Myocardial Infarction Volume 2: Technical Appendix. 1996, Centre for healthcare Policy and Research California Office of Statewide Health Planning and Development, Chapter 13-15.

  13. Winkelmayer W, Schneeweiss S, Mogun H, Patrick A, Avorn J, Solomon D: Identification of individuals with CKD from Medicare claims data: a validation study. American Journal of Kidney Diseases. 2005, 46 (2): 225-32.

  14. Weintraub W, Deaton C, Shaw L, Mahoney E, Morris D, Saunders C, et al: Can cardiovascular clinical characteristics be identified and outcome models be developed from an in-patient claims database? Am J Cardiol. 1999, 84: 166-9. 10.1016/S0002-9149(99)00228-3.

  15. Wilchesky M, Tamblyn R, Huang A: Validation of diagnostic codes within medical services claims. Journal of Clinical Epidemiology. 2004, 57 (2): 131-41. 10.1016/S0895-4356(03)00246-4.

  16. Hemmelgarn B, Clement F, Manns B, Klarenbach S, James M, Ravani P, et al: Overview of the Alberta Kidney Disease Network. BMC Nephrology. 2009, 10: 30.

  17. Manns B, Mortis G, Taub K, McLaughlin K, Donaldson C, Ghali W: The Southern Alberta Renal Program database: a prototype for patient management and research initiatives. Clin Invest Med. 2001, 24 (4): 164-70.

  18. Statistics Canada: Canadian Classification of Diagnostic, Therapeutic and Surgical Procedures. 1986, Ottawa, Canada: Statistics Canada.

  19. Lok C, Oliver M, Rothwell D, Hux J: The growing volume of diabetes-related dialysis: a population-based study. Nephrol Dial Transplant. 2004, 19: 3098-103. 10.1093/ndt/gfh540.

  20. National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, Division of Kidney, Urologic and Hematologic Diseases: United States Renal Data System: Researcher's Guide to using the USRDS Database. 2009, Bethesda, MD.

  21. Hux J, Ivis F, Flintoft V, Bica A: Diabetes in Ontario: Determination of prevalence and incidence using a validated administrative data algorithm. Diabetes Care. 2002, 25: 512-6. 10.2337/diacare.25.3.512.

  22. Quan H, Li B, Saunders D, Parsons G, Nilsson C, Alibhai A, et al: Assessing validity of ICD-9-CM and ICD-10 administrative data in recording clinical conditions in a unique dually coded database. Health Services Research. 2008, 43 (4): 1424-41. 10.1111/j.1475-6773.2007.00822.x.

  23. Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi J, et al: Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005, 43 (11): 1130-9. 10.1097/01.mlr.0000182534.19832.83.

  24. Chen G, Faris P, Hemmelgarn B, Walker R, Quan H: Measuring agreement of administrative data with chart data using prevalence unadjusted and adjusted kappa. BMC Medical Research Methodology. 2009, 9: 5.

  25. Kundel H, Polansky M: Measurement of observer agreement. Radiology. 2003, 228: 303-8. 10.1148/radiol.2282011860.

  26. Landis R, Koch G: The measurement of observer agreement for categorical data. Biometrics. 1977, 33: 159-74. 10.2307/2529310.

  27. Alberta Health and Wellness: Report on the health of Albertans. 2006, Edmonton: Alberta Health and Wellness.

  28. Kern E, Maney M, Miller D, Tseng C, Tiwari A, Rajan M, et al: Failure of ICD-9-CM codes to identify patients with comorbid kidney disease in diabetes. Health Services Research. 2006, 41 (2): 564-80. 10.1111/j.1475-6773.2005.00482.x.

  29. Quinn RR, Laupacis A, Austin PC, Hux JE, Garg AX, Hemmelgarn BR, Oliver MJ: Using administrative datasets to study outcomes in dialysis patients: a validation study. Med Care. 2010, 48 (8): 745-50. 10.1097/MLR.0b013e3181e419fd.



Drs. Tonelli, Hemmelgarn, and Manns were supported by New Investigator awards from the Canadian Institutes of Health Research. Dr. James was supported by a KRESCENT (Kidney Foundation of Canada) and Alberta Heritage Foundation for Medical Research (AHFMR) Fellowship. Drs. Tonelli and Manns are supported by Alberta Innovates - Health Solutions Health Scholar Awards. Drs. Klarenbach and Hemmelgarn were supported by Population Health Investigator awards from Alberta Innovates - Health Solutions, and Dr. Klarenbach was supported by a Scholarship Award from the Kidney Foundation of Canada. Drs. Tonelli, Klarenbach, Hemmelgarn, Quinn, James and Manns were supported by an alternative funding plan from the Government of Alberta and the Universities of Alberta and Calgary.

Author information

Corresponding author

Correspondence to Brenda R Hemmelgarn.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

All the authors contributed to the conception and design. FC, MJ, RC and BR contributed to the data analysis and drafted the report. All of the authors contributed to the interpretation of data, critically revised the manuscript for important intellectual content and approved the final version submitted for publication.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Clement, F.M., James, M.T., Chin, R. et al. Validation of a case definition to define chronic dialysis using outpatient administrative data. BMC Med Res Methodol 11, 25 (2011).
