  • Research article
  • Open access

Development and validation of algorithms to classify type 1 and 2 diabetes according to age at diagnosis using electronic health records

Abstract

Background

Validated algorithms to classify type 1 and 2 diabetes (T1D, T2D) are mostly limited to white pediatric populations. We conducted a large study in Hong Kong among children and adults with diabetes to develop and validate algorithms using electronic health records (EHRs) to classify diabetes type against clinical assessment as the reference standard, and to evaluate performance by age at diagnosis.

Methods

We included all people with diabetes (age at diagnosis 1.5–100 years during 2002–15) in the Hong Kong Diabetes Register and randomized them to derivation and validation cohorts. We developed candidate algorithms to identify diabetes types using encounter codes, prescriptions, and combinations of these criteria (“combination algorithms”). We identified 3 algorithms with the highest sensitivity, positive predictive value (PPV), and kappa coefficient, and evaluated performance by age at diagnosis in the validation cohort.

Results

There were 10,196 (T1D n = 60, T2D n = 10,136) and 5101 (T1D n = 43, T2D n = 5058) people in the derivation and validation cohorts (mean age at diagnosis 22.7, 55.9 years; 53.3, 43.9% female; for T1D and T2D respectively). Algorithms using codes or prescriptions classified T1D well for age at diagnosis < 20 years, but sensitivity and PPV dropped for older ages at diagnosis. Combination algorithms maximized sensitivity or PPV, but not both. The “high sensitivity for type 1” algorithm (ratio of type 1 to type 2 codes ≥ 4, or at least 1 insulin prescription within 90 days) had a sensitivity of 95.3% (95% confidence interval 84.2–99.4%; PPV 12.8%, 9.3–16.9%), while the “high PPV for type 1” algorithm (ratio of type 1 to type 2 codes ≥ 4, and multiple daily injections with no other glucose-lowering medication prescription) had a PPV of 100.0% (79.4–100.0%; sensitivity 37.2%, 23.0–53.3%), and the “optimized” algorithm (ratio of type 1 to type 2 codes ≥ 4, and at least 1 insulin prescription within 90 days) had a sensitivity of 65.1% (49.1–79.0%) and PPV of 75.7% (58.8–88.2%) across all ages. Accuracy of T2D classification was high for all algorithms.

Conclusions

Our validated set of algorithms accurately classifies T1D and T2D using EHRs for Hong Kong residents enrolled in a diabetes register. The choice of algorithm should be tailored to the unique requirements of each study question.


Background

Administrative health databases are an important resource for population-based diabetes research [1]. Using routinely-collected data such as billing codes and hospitalization records, various algorithms have been developed to identify diabetes [2, 3]. While these algorithms capture diabetes diagnoses, they cannot accurately identify diabetes type [2,3,4,5]. Type 1 diabetes (T1D) is an autoimmune disease that classically occurs in children, but may rarely occur in older adults [6]. In T1D, autoantibodies destroy the insulin-producing pancreatic beta cells, causing insulin deficiency and hyperglycemia. Type 2 diabetes (T2D), which typically occurs in adulthood, is caused by genetic and other risk factors such as obesity that lead to insulin resistance and hyperglycemia, although lean individuals may also develop T2D due to insulin deficiency [6]. While T1D must be treated with insulin, T2D may be treated with lifestyle modification, insulin, or other glucose-lowering medications [6].

Many epidemiological studies apply the untested assumption that findings in adults with diabetes are representative of T2D [7, 8]. However, the prognoses of T1D and T2D are markedly different [9]—especially among adults aged < 40 years, where both types commonly occur and may be difficult to distinguish clinically [1, 9]. In this age group, it has been shown that T2D is associated with a 15-fold elevation in the risk of cardiovascular complications versus T1D [9]. Yet, diabetes types are poorly documented in administrative databases, which were not originally designed for research purposes. Specific diagnostic codes for T1D and T2D may be erroneously entered [10] or unavailable in some billing systems [2]. Furthermore, classification of diabetes type is particularly important in Asia because disaggregated population-level T1D and T2D incidence and prevalence have never been measured [11].

Considering the lifelong and immediate need for insulin treatment in T1D, novel algorithms have been developed to identify T1D using prescriptions and laboratory data from electronic health records (EHRs) [12]. However, previous validation studies had small sample sizes and were mostly limited to children in white populations [13,14,15,16]. One study developed and validated a complex algorithm to detect T1D in a US population with 65% (36–100%) sensitivity and 88% (78–98%) positive predictive value (PPV) using EHRs [12]. Moreover, algorithms developed for white populations may have a poorer PPV when applied to Asian populations, as the prevalence of T1D in Asians appears to be much lower than in white people [17]. The proportion of diabetes cases classified as T1D and T2D also varies enormously by age at diagnosis; yet, the effect of age at diagnosis on the performance of classification algorithms has never been specifically studied. To address these gaps, we conducted a large study among Hong Kong residents with diabetes to develop and validate algorithms using EHRs to classify T1D and T2D against clinical assessment as the reference standard, and to evaluate performance by age at diagnosis.
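The link between prevalence and PPV mentioned above follows from Bayes' rule. The sketch below is illustrative only: the function name `ppv_from_prevalence` and all numeric inputs are hypothetical and are not taken from this study or from reference [17].

```python
def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule: P(condition | flagged)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical algorithm characteristics, held fixed across two settings.
SENSITIVITY, SPECIFICITY = 0.90, 0.99

# Hypothetical share of type 1 among all diabetes cases in each setting.
ppv_higher_prevalence = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prevalence=0.05)
ppv_lower_prevalence = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prevalence=0.005)

print(f"PPV at 5% prevalence:   {ppv_higher_prevalence:.3f}")
print(f"PPV at 0.5% prevalence: {ppv_lower_prevalence:.3f}")
```

With sensitivity and specificity unchanged, the same algorithm yields a lower PPV where the target condition is rarer, which is the concern raised for applying algorithms across populations.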

Methods

Setting and data sources

Hong Kong is a special administrative region of China with a population of 7.3 million and an estimated diabetes prevalence of 10.3% (2014) [18]. All residents are entitled to universal inpatient and outpatient health services operated by the governmental Hong Kong Hospital Authority (HA), which is modeled after the National Health Service of Britain. Given the wide public-private healthcare cost differential, HA hospitals account for about 95% of all bed-days [19].

The Hong Kong Diabetes Surveillance Database (HKDSD) includes all Hong Kong residents with diabetes as identified using the HA’s territory-wide EHR, which includes routinely-collected data on laboratory tests, prescriptions, and hospital visits for the entire population. We defined diabetes onset as the first occurrence of glycated haemoglobin A1c ≥ 6.5% [20], fasting plasma glucose ≥ 7 mmol/L [21], glucose-lowering medication prescription [3, 4] excluding insulin, or long-term insulin prescription (≥ 28 days). To avoid detecting gestational diabetes [22], we excluded events occurring within 9 months prior to or 6 months after delivery (International Statistical Classification of Diseases and Related Health Problems version 9 (ICD-9) codes 72–75), or within 9 months of any pregnancy-related encounter (ICD-9 codes 630–676) outside these periods (in case of aborted pregnancies or delivery in a non-HA hospital). We also excluded in-patient glucose measurements to avoid misidentifying acute stress hyperglycemia as diabetes.
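The onset definition above can be sketched as a filter over dated events. This is a minimal illustration, not the authors' implementation: the `Event` structure, the function names, the 270-day and 180-day approximations of 9 and 6 months, and the choice to apply the in-patient exclusion to plasma glucose only are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass
class Event:
    when: date
    kind: str              # "a1c", "fpg", "non_insulin_rx", or "insulin_rx"
    value: float = 0.0     # A1c in %, glucose in mmol/L, or insulin supply in days
    inpatient: bool = False


NINE_MONTHS = timedelta(days=270)  # assumed day-count approximation
SIX_MONTHS = timedelta(days=180)   # assumed day-count approximation


def qualifies(event: Event) -> bool:
    """Apply the laboratory and prescription thresholds from the text."""
    if event.kind == "a1c":
        return event.value >= 6.5
    if event.kind == "fpg":
        return event.value >= 7.0 and not event.inpatient
    if event.kind == "non_insulin_rx":
        return True
    if event.kind == "insulin_rx":
        return event.value >= 28  # long-term insulin only
    return False


def in_pregnancy_window(when: date, deliveries: Iterable[date],
                        pregnancy_encounters: Iterable[date]) -> bool:
    """True if the date falls in a window excluded to avoid gestational diabetes."""
    near_delivery = any(d - NINE_MONTHS <= when <= d + SIX_MONTHS for d in deliveries)
    near_encounter = any(abs(when - p) <= NINE_MONTHS for p in pregnancy_encounters)
    return near_delivery or near_encounter


def diabetes_onset(events: Iterable[Event], deliveries: Iterable[date] = (),
                   pregnancy_encounters: Iterable[date] = ()) -> Optional[date]:
    """Return the first qualifying event date, or None if nothing qualifies."""
    deliveries = list(deliveries)
    pregnancy_encounters = list(pregnancy_encounters)
    candidates = [
        e.when for e in events
        if qualifies(e) and not in_pregnancy_window(e.when, deliveries, pregnancy_encounters)
    ]
    return min(candidates, default=None)


# Hypothetical record: an in-patient glucose, a sub-threshold A1c, then a qualifying A1c.
example_events = [
    Event(date(2005, 3, 1), "fpg", 7.8, inpatient=True),
    Event(date(2005, 6, 1), "a1c", 6.1),
    Event(date(2006, 1, 15), "a1c", 6.7),
    Event(date(2006, 2, 1), "non_insulin_rx"),
]
print(diabetes_onset(example_events))
```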

A subset of those in the HKDSD is additionally enrolled in the multicentre Hong Kong Diabetes Register (HKDR, Supplementary Table 1, Additional File). This register was established in 1995 at the Diabetes and Endocrine Centre at the Prince of Wales Hospital, a tertiary care public hospital in the New Territories East region with a catchment of 1.3 million residents, and was later expanded to 2 additional hospitals [23, 24]. Anyone with diabetes is eligible for enrolment in the HKDR. Referrals are self-initiated or from physicians located typically in community- or hospital-based clinics. All enrolled individuals undergo a comprehensive assessment including a detailed clinical history, fundoscopy and foot exams, and serum and urinary laboratory testing. This assessment yields detailed data including diabetes type, which is otherwise unavailable in the HKDSD. The research was approved by the Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee.

Study population

Because the reference standard (clinical assessment) was only established for the subset of those enrolled in the HKDR, we restricted the study to this sub-population. To ensure at least 1 year of follow-up data, we included all people with diabetes diagnosed at ages 1.5 (to exclude neonatal diabetes) to 100 years from 1 January 2002 through 31 December 2015, defined using the HKDSD criteria. The maximum follow-up date was 31 December 2016. We excluded individuals with monogenic or secondary diabetes and those with missing diabetes type in the HKDR (Fig. 1). We randomized the remaining individuals into the derivation (two thirds) and validation (one third) cohorts.
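A two-thirds to one-third random allocation of this kind might look like the sketch below. The text does not describe the randomization procedure, so the shuffle-and-cut approach, the function name `split_cohorts`, and the seed are assumptions; a fixed cut like this will not reproduce the reported cohort sizes exactly.

```python
import random
from typing import Iterable, List, Tuple


def split_cohorts(person_ids: Iterable[int], derivation_fraction: float = 2 / 3,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle identifiers reproducibly and cut them into two disjoint cohorts."""
    ids = list(person_ids)
    rng = random.Random(seed)  # local generator, so global random state is untouched
    rng.shuffle(ids)
    cut = round(len(ids) * derivation_fraction)
    return ids[:cut], ids[cut:]


# 15,297 is the total of the two final cohorts reported in the Results.
derivation, validation = split_cohorts(range(15297))
print(len(derivation), len(validation))
```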

Fig. 1

Flow diagram depicting creation of the study cohorts using the sub-population of people in the Hong Kong Diabetes Surveillance Database who were also enrolled in the Hong Kong Diabetes Register (HKDR). Diabetes type classification consisted of 2 steps: (1) comprehensive assessment, and (2) chart review of cases initially flagged as type 1 diabetes

Reference standard

C-peptide and autoantibody testing are not routinely available to confirm T1D diagnosis in the public setting, and self-funded tests are rarely performed. Therefore, we applied the standard clinical definition of T1D adopted by the HKDR [25], which strictly defines T1D as presentation with diabetic ketoacidosis, unprovoked heavy ketones in urine, or a requirement for insulin within the first year of diagnosis. An endocrinologist reviewed all charts initially marked as T1D in the HKDR to ensure accuracy.

Algorithm development and validation

We applied clinical knowledge (based on the experience of endocrinologists with expertise in diabetes management: CK, BRS, AL, JCNC) and reviewed previous validation studies [12,13,14,15,16, 26, 27] to develop candidate algorithms to identify T1D using either ICD-9 encounter codes (“code algorithms”; type 1 codes: 250.x1, 250.x3; type 2 codes: 250.x0, 250.x2) or prescriptions (“prescription algorithms”; Supplementary Tables 2–3, Additional File). We varied the number, ratio, and types of codes required, as well as the duration of time allowed between the diagnosis date and the initial insulin prescription. Positive cases were automatically classified as T1D and negative as T2D. Using the derivation cohort, we selected algorithms based on the sensitivity and PPV of identifying T1D, as these are the most important characteristics for public health [28]. Since the most sensitive algorithms had poor PPV and vice versa, we chose the best algorithms with the highest sensitivity and PPV separately, among both code and prescription algorithms (total: 4 algorithms, labelled A–D). We resolved ties by selecting the algorithm with the greatest sum of sensitivity and PPV. Then, we paired the 2 best code algorithms with the 2 best prescription algorithms using 2 methods in an effort to further improve accuracy [29, 30]. These methods were: combining using “or” (for example, “A or B”) to improve sensitivity, and combining using “and” (for example, “A and B”) to improve PPV. We then tested all 8 “combination algorithms” in the derivation cohort. Of the 12 code, prescription, and combination algorithms, we identified the 3 algorithms with the highest sensitivity, highest PPV, and highest kappa coefficient (“optimized” algorithm) across all ages. Using the validation cohort, we evaluated the performance of these 3 algorithms in classifying T1D and T2D by age at diagnosis.
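The single and combination algorithms described above can be expressed as boolean rules over a per-person summary. The sketch below is an illustration under stated assumptions: the `Person` fields and function names are hypothetical, thresholds for A–D are those reported later in the Results, and the handling of a person with type 1 codes but no type 2 codes (ratio treated as unbounded) is an assumption that the text does not specify.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Optional


@dataclass
class Person:
    type1_codes: int = 0                         # ICD-9 250.x1 / 250.x3 encounters
    type2_codes: int = 0                         # ICD-9 250.x0 / 250.x2 encounters
    days_to_first_insulin: Optional[int] = None  # None means insulin never prescribed
    multiple_daily_injections: bool = False
    other_glucose_lowering_rx: bool = False      # any non-insulin glucose-lowering drug


Rule = Callable[[Person], bool]


def code_ratio_at_least(threshold: float) -> Rule:
    """Build a code algorithm on the ratio of type 1 to type 2 codes."""
    def rule(p: Person) -> bool:
        if p.type1_codes == 0:
            return False
        if p.type2_codes == 0:
            return True  # assumption: ratio treated as unbounded
        return p.type1_codes / p.type2_codes >= threshold
    return rule


def insulin_within_90_days(p: Person) -> bool:
    return p.days_to_first_insulin is not None and p.days_to_first_insulin <= 90


def mdi_without_other_rx(p: Person) -> bool:
    return p.multiple_daily_injections and not p.other_glucose_lowering_rx


SINGLE: Dict[str, Rule] = {
    "A": code_ratio_at_least(0.5),
    "B": code_ratio_at_least(4),
    "C": insulin_within_90_days,
    "D": mdi_without_other_rx,
}


def combine(left: Rule, right: Rule, op: str) -> Rule:
    """Pair two rules with 'and' (favours PPV) or 'or' (favours sensitivity)."""
    if op == "and":
        return lambda p: left(p) and right(p)
    if op == "or":
        return lambda p: left(p) or right(p)
    raise ValueError(f"unknown operator: {op}")


# 2 code algorithms x 2 prescription algorithms x 2 operators = 8 combinations.
COMBINATIONS: Dict[str, Rule] = {
    f"{c} {op} {r}": combine(SINGLE[c], SINGLE[r], op)
    for c, r, op in product("AB", "CD", ("and", "or"))
}
ALL_RULES: Dict[str, Rule] = {**SINGLE, **COMBINATIONS}


def classify(p: Person, rule_name: str) -> str:
    """Positive cases are labelled T1D and negative cases T2D."""
    return "T1D" if ALL_RULES[rule_name](p) else "T2D"


# Hypothetical person: code ratio of 4, early insulin, but also another agent.
example = Person(type1_codes=8, type2_codes=2, days_to_first_insulin=10,
                 multiple_daily_injections=True, other_glucose_lowering_rx=True)
for name in ("B or C", "B and C", "B and D"):
    print(name, classify(example, name))
```

Building each rule as a plain function keeps the "and" and "or" pairings mechanical, so all 12 candidates can be scored with the same evaluation loop.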

We repeated the entire procedure using additional laboratory data (estimated glomerular filtration rate) to determine whether requiring normal renal function with insulin prescriptions would improve the performance of prescription algorithms.

Statistical analysis

We calculated the sensitivity, specificity, PPV, and negative predictive value (NPV) with 95% exact confidence intervals of each selected algorithm for classifying T1D and T2D in the derivation and validation cohorts. We also calculated Cohen’s kappa coefficient, which represents agreement after agreement due to chance is removed [31]. A perfect algorithm would have sensitivity, specificity, PPV, and NPV values of 100%, and a kappa value of 1.0. Missing data were minimal (missing diabetes type: n = 357, 2.3%) and handled by complete case analysis. All analyses were performed using the “FREQ” procedure in SAS version 9.4 (Cary, NC).
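For readers without SAS, the same quantities can be sketched in plain Python. This is not the authors' code: the function names are hypothetical, the counts in the usage example are invented, and the exact interval is assumed here to be the Clopper-Pearson interval, solved by bisection on the binomial tail.

```python
from math import exp, lgamma, log
from typing import Callable, Dict, Tuple


def binom_pmf(i: int, n: int, p: float) -> float:
    """Binomial probability computed in log space to avoid overflow for large n."""
    if p <= 0.0:
        return 1.0 if i == 0 else 0.0
    if p >= 1.0:
        return 1.0 if i == n else 0.0
    log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return exp(log_coeff + i * log(p) + (n - i) * log(1.0 - p))


def upper_tail(x: int, n: int, p: float) -> float:
    """P(X >= x) for X ~ Binomial(n, p); this is increasing in p."""
    return 1.0 - sum(binom_pmf(i, n, p) for i in range(x))


def _solve_increasing(g: Callable[[float], float], target: float) -> float:
    """Bisection for g(p) = target on [0, 1], where g is increasing."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Clopper-Pearson interval for x successes out of n."""
    lower = 0.0 if x == 0 else _solve_increasing(lambda p: upper_tail(x, n, p), alpha / 2)
    upper = 1.0 if x == n else _solve_increasing(lambda p: upper_tail(x + 1, n, p), 1 - alpha / 2)
    return lower, upper


def metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Test characteristics and Cohen's kappa from a 2x2 table."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts, for illustration only.
result = metrics(tp=40, fp=10, fn=20, tn=930)
low, high = exact_ci(40, 60)  # interval for the sensitivity of 40/60
print({k: round(v, 3) for k, v in result.items()}, round(low, 3), round(high, 3))
```

As a sanity check, `exact_ci(16, 16)` has a lower bound of about 0.794. That agrees with the 79.4–100.0% interval reported for the PPV of 100.0% if that PPV corresponds to 16 of 16 flagged cases, which is an inference from 37.2% of 43 T1D cases rather than a count stated in the text.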

Results

There were 15,300 individuals with complete data and diabetes diagnosed during 2002–15 (Fig. 1). Of these cases, 121 were initially classified as T1D. After chart review, 3 were excluded as monogenic or secondary diabetes and 15 were re-classified as T2D, leaving 103 T1D cases. The final cohorts consisted of 10,196 (derivation) and 5101 (validation) individuals. Tables 1 and 2 show the baseline demographic characteristics of the study cohorts. The distribution of baseline characteristics was highly similar across the derivation and validation cohorts and across the HKDR and HKDSD, although the HKDR population had more prescriptions for insulin and other glucose-lowering medications. The average age at diagnosis was 22.7 years for T1D and 55.9 years for T2D (Table 2; see Supplementary Figure 1, Additional File). More men (56.1%) had T2D, but for T1D the sex ratio was more balanced.

People with T1D had a median of 3.0 type 1 codes, including 2.0 from the primary diagnosis on the hospital discharge abstract. People with T2D had a median of 1 type 2 code. Although most people with T1D had at least 1 type 1 code (83.3% sensitivity), the PPV for this algorithm was only 26.0%. Most people with T1D also had at least 1 type 2 code (70.0%). Code algorithms using a ratio of type 1 to type 2 codes had a higher PPV and similar sensitivity compared to those using the number of type 1 or type 2 codes. Two algorithms had the highest sensitivity (83.3%), but “ratio of type 1 to type 2 codes ≥ 0.5” (algorithm A) was chosen because it had a higher PPV (34.0%) than “at least 1 type 1 code.” “Ratio of type 1 to type 2 codes ≥ 4” (algorithm B) was chosen for having the highest PPV (57.3%, sensitivity 71.7%).

Table 1 Baseline characteristics of people in the Hong Kong Diabetes Register (HKDR, randomized 2:1 into derivation and validation cohorts) and the Hong Kong Diabetes Surveillance Database (HKDSD). Laboratory and prescription data are from the first year after diagnosis. Values are counts (n) and percentages unless otherwise indicated
Table 2 Baseline characteristics and performance of candidate algorithms among people in the derivation cohort, stratified by diabetes type. Candidate algorithms developed using encounter codes (“code algorithms”) or prescriptions (“prescription algorithms”) are also shown. For each algorithm, values in the Type 1 and 2 columns indicate the number and percentage of individuals satisfying the algorithm (sensitivity). Positive predictive values for classifying type 1 diabetes are shown in the right column. The best 4 algorithms are indicated by the letters in parentheses (A–D; see text for selection criteria)

Among the prescription algorithms, those specifying “at least 1 insulin prescription” were the most sensitive but lacked PPV for classifying T1D. Nearly everyone with T1D received an insulin prescription at any time (59 of 60 people, 98.3% sensitivity), and almost all received it within 90 days of diabetes diagnosis (58 of 60 people, 96.7% sensitivity). As these 2 prescription algorithms had the highest sensitivity values and classified everyone identically except for 1 case, we applied the tiebreaker criterion to choose “insulin prescription within 90 days” (algorithm C) based on its greater PPV (8.6%, versus 1.7% for “insulin prescription at any time”). Adding criteria for other types of medications improved the PPV of insulin-based prescription algorithms at the expense of sensitivity. In the T1D cohort, 36.7% received at least 1 metformin prescription (versus 88.6% in the T2D cohort), and 16.7% received a glucose-lowering medication prescription other than insulin and metformin (versus 75.7% in the T2D cohort). Of the algorithms that added a condition for no other glucose-lowering medication prescriptions in addition to an insulin prescription, the algorithm “at least 1 insulin prescription with no other glucose-lowering medication prescriptions except for metformin” had the highest PPV (31.0%; sensitivity 60.0%). Specifying the type of insulin as multiple daily injections further improved the PPV. “Multiple daily injections with no other glucose-lowering medication prescription” (algorithm D) had a 78.0% PPV (sensitivity 53.3%), which was the highest of the prescription algorithms.

Algorithms A–D classified T1D well for age at diagnosis < 20 years in the derivation cohort, but as the proportion of diabetes cases classified as T1D dropped with age, the estimates of sensitivity and PPV, and their precision, also dropped (Fig. 2). For age at diagnosis < 20 years, algorithm B had the highest kappa coefficient (sensitivity: 91.3%, 95% confidence interval 72.0–98.9%; PPV: 80.8%, 60.6–93.4%; Table 3). For age at diagnosis ≥ 20 years, algorithm C was the most sensitive but lacked PPV, while algorithm D had the highest PPV and kappa coefficient, despite a low sensitivity (age at diagnosis 20–39 years: sensitivity 50.0%, 29.9–70.1%, PPV 81.3%, 54.4–96.0%; ≥ 40 years: sensitivity 27.3%, 6.0–61.0%, PPV 50.0%, 11.8–88.2%).

Fig. 2

Sensitivity and positive predictive value of the 4 best single algorithms for classifying type 1 diabetes in the derivation cohort by age at diagnosis*, displayed with the proportion of all diabetes cases classified as type 1 using the reference standard (dashed line). Algorithms: (a) ratio of type 1 to type 2 codes ≥ 0.5; (b) ratio of type 1 to type 2 codes ≥ 4; (c) at least 1 insulin prescription within 90 days of diagnosis; (d) multiple daily injections with no other glucose-lowering medication prescriptions

*smoothed using 15-year moving averages
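One way to obtain such a smoothed curve is to pool counts within a window of ages before taking the ratio. The sketch below assumes a centred 15-year window and pooled counts; the figure note does not say whether the window was centred or trailing, or whether counts or per-age ratios were averaged, and the function name and data are hypothetical.

```python
from typing import Dict, Optional, Tuple


def smoothed_sensitivity(counts_by_age: Dict[int, Tuple[int, int]], centre_age: int,
                         window_years: int = 15) -> Optional[float]:
    """Sensitivity pooled over a centred window of ages at diagnosis.

    counts_by_age maps age -> (true positives, false negatives).
    Returns None when the window contains no reference-standard cases.
    """
    half = window_years // 2
    tp = fn = 0
    for age in range(centre_age - half, centre_age + half + 1):
        age_tp, age_fn = counts_by_age.get(age, (0, 0))
        tp += age_tp
        fn += age_fn
    return tp / (tp + fn) if (tp + fn) else None


# Hypothetical per-age counts, for illustration only.
example_counts = {10: (3, 0), 12: (2, 0), 17: (1, 1), 30: (1, 2)}
print(smoothed_sensitivity(example_counts, centre_age=12))
print(smoothed_sensitivity(example_counts, centre_age=30))
```

Pooling counts rather than averaging single-year ratios avoids dividing by zero at ages with no T1D cases, which matters when cases become sparse at older ages.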

Table 3 Test characteristics of single (A–D) and combination algorithms for classifying type 1 diabetes compared to the reference standard in the derivation cohort, stratified by age at diagnosis. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) are percentages with 95% confidence intervals. Cohen’s kappa coefficient represents agreement after agreement due to chance is removed (1.0 indicates perfect agreement) [31]. The “Type 1 Proportion” columns refer to the percentage of people in the cohort with diabetes classified as having type 1 using each algorithm (“Calculated”) and the reference standard (“True”). The best overall algorithms are marked (* = highest sensitivity, = highest PPV, = highest kappa coefficient)

As with algorithms A–D, performance of the combination algorithms also generally dropped at older ages at diagnosis (Fig. 3). For ages at diagnosis < 20 years, 4 combinations had 100.0% (85.2–100.0%; Table 3) sensitivity; among these algorithms, combination “A and C” had the highest PPV (74.2%, 55.4–88.1%). Among adults aged ≥ 20 years, sensitivity and PPV differed depending on the type of combination. “And” combinations had the highest PPV. “A and D” had the highest PPV among adults (age at diagnosis 20–39 years: 90.9%, 58.7–99.8%; ≥ 40 years: 50.0%, 11.8–88.2%), but the sensitivity was low (age at diagnosis 20–39 years: 38.5%, 20.2–59.4%; ≥ 40 years: 27.3%, 6.0–61.0%). Among the “or” combinations, “A or C” and “B or C” had identically the highest sensitivity for classifying T1D (age at diagnosis 20–39 years: 100.0%, 86.8–100.0%; ≥ 40 years: 90.9%, 58.7–99.8%). However, these algorithms had low PPV (age at diagnosis 20–39 years: 19.1–22.8%; ≥ 40 years: 1.8–1.9%), with “B or C” relatively higher (age at diagnosis 20–39 years: 22.8%, 15.5–31.6%; ≥ 40 years: 1.9%, 0.9–3.4%).

Fig. 3

Sensitivity and positive predictive value of the 8 combination algorithms for classifying type 1 diabetes in the derivation cohort by age at diagnosis.* We paired single algorithms using “and” to maximize positive predictive value (panels a–d) and “or” to maximize sensitivity (panels e–h). See Fig. 2 for algorithm descriptions

*smoothed using 15-year moving averages

Among the 12 algorithms we tested, “B or C,” “B and D,” and “B and C” had the best sensitivity (“high sensitivity for type 1” algorithm), PPV (“high PPV for type 1” algorithm), and kappa coefficient (“optimized” algorithm) respectively across all ages in the derivation cohort. Table 4 displays the performance characteristics of these algorithms in the validation cohort. The “high sensitivity for type 1” algorithm had a sensitivity of 95.3% (84.2–99.4%; PPV 12.8%, 9.3–16.9%), while the “high PPV for type 1” algorithm had a PPV of 100.0% (79.4–100.0%; sensitivity 37.2%, 23.0–53.3%) across all ages. The optimized algorithm had a sensitivity of 65.1% (49.1–79.0%) and PPV of 75.7% (58.8–88.2%) across all ages. These algorithms produced distinctive estimates of the proportion of cases classified as T1D among all diabetes cases according to age at diagnosis (Fig. 4). The “high PPV for type 1” algorithm yielded conservative estimates, while the “high sensitivity for type 1” algorithm inflated estimates. Estimates from the “optimized” algorithm closely matched the reference standard across age at diagnosis.
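The direction of these estimates can be checked from the reported all-ages figures alone. The number flagged as T1D equals the true positives divided by PPV, so the calculated T1D proportion equals the true proportion multiplied by sensitivity and divided by PPV. The sketch below applies that identity to the rounded percentages quoted above; the function and variable names are hypothetical, and the outputs are approximations rather than values from Table 4.

```python
COHORT_SIZE = 5101  # validation cohort size, from the Results
TRUE_T1D = 43       # reference-standard T1D cases in the validation cohort

# (sensitivity, PPV) across all ages, as reported in the text (rounded values).
REPORTED = {
    "high sensitivity for type 1": (0.953, 0.128),
    "optimized": (0.651, 0.757),
    "high PPV for type 1": (0.372, 1.000),
}


def calculated_proportion(true_proportion: float, sensitivity: float, ppv: float) -> float:
    """Share of the cohort flagged as T1D: (sensitivity * true cases / PPV) / cohort."""
    return true_proportion * sensitivity / ppv


true_proportion = TRUE_T1D / COHORT_SIZE
estimates = {
    name: calculated_proportion(true_proportion, sensitivity, ppv)
    for name, (sensitivity, ppv) in REPORTED.items()
}

print(f"reference standard: {true_proportion:.2%}")
for name, value in estimates.items():
    print(f"{name}: {value:.2%}")
```

The ratio of sensitivity to PPV is well above 1 for the high-sensitivity algorithm, well below 1 for the high-PPV algorithm, and closest to 1 for the optimized algorithm, which is why the last tracks the reference standard most closely.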

Table 4 Test characteristics of the high sensitivity, high positive predictive value (PPV), and optimized algorithms for classifying type 1 diabetes compared to the reference standard in the validation cohort, stratified by age at diagnosis. Sensitivity, specificity, PPV and negative predictive value (NPV) are percentages with 95% confidence intervals. Cohen’s kappa coefficient represents agreement after agreement due to chance is removed (1.0 indicates perfect agreement) [31]. The “Type 1 Proportion” columns refer to the percentage of people in the cohort with diabetes classified as type 1 using each algorithm (“Calculated”) and the reference standard (“True”)
Fig. 4

Proportion of all diabetes cases classified as type 1 by age at diagnosis in the validation cohort.* This proportion is calculated as the percentage of people in the cohort with diabetes classified as type 1 using the reference standard (dashed line), as well as high sensitivity for type 1, optimized, and high positive predictive value for type 1 algorithms (see Table 4 for descriptions)

*smoothed using 15-year moving averages

Modifying algorithms with renal function criteria resulted in similar PPV with the same or lower sensitivity, and ultimately did not improve performance (Supplementary Tables 4–6, Additional File). All selected algorithms had high sensitivity and PPV in classifying T2D across all ages at diagnosis (sensitivity range 93.5–100.0%, PPV range 99.7–100.0%, Supplementary Table 7, Additional File). As all cases were classified as T1D or T2D in a binary fashion, the “high sensitivity for type 1” algorithm was equivalent to a “high PPV for type 2” algorithm, while the “high PPV for type 1” algorithm was equivalent to a “high sensitivity for type 2” algorithm (Supplementary Table 8, Additional File).
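The equivalence follows from relabelling the 2x2 table: when every case is either T1D or T2D, the PPV for T2D equals the NPV for T1D, and the sensitivity for T2D equals the specificity for T1D. A minimal sketch, with hypothetical function names and invented counts:

```python
from typing import Dict, Tuple


def characteristics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def relabel_for_type2(tp: int, fp: int, fn: int, tn: int) -> Tuple[int, int, int, int]:
    """Re-express a table built with T1D as positive so that T2D is positive.

    True negatives for T1D become true positives for T2D, missed T1D cases
    become false positives for T2D, and so on.
    """
    return tn, fn, fp, tp


# Hypothetical counts with T1D as the positive class: few misses, many false alarms.
type1_table = (40, 280, 2, 4700)
type1 = characteristics(*type1_table)
type2 = characteristics(*relabel_for_type2(*type1_table))

print(f"T1D sensitivity {type1['sensitivity']:.3f}, T2D PPV {type2['ppv']:.3f}")
print(f"T1D PPV {type1['ppv']:.3f}, T2D sensitivity {type2['sensitivity']:.3f}")
```

An algorithm that rarely misses T1D leaves very few T1D cases among those labelled T2D, so its PPV for T2D is high even when its PPV for T1D is low.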

Discussion

This is one of the largest validation studies of algorithms using EHRs to classify T1D and T2D among children and adults, and the only validation study in an Asian population. Using a systematic approach to generate a set of algorithms maximizing sensitivity and PPV, we revealed that classification performance is best at lower ages at diagnosis and drops as age at diagnosis increases—a finding that has not previously been demonstrated. We developed a “high sensitivity for type 1” algorithm (ratio of type 1 to type 2 codes ≥ 4, or at least 1 insulin prescription within 90 days) with > 90% sensitivity across age at diagnosis at the expense of lower PPV, and a “high PPV for type 1” algorithm (ratio of type 1 to type 2 codes ≥ 4, and multiple daily injections with no other glucose-lowering medication prescription) with perfect PPV across age at diagnosis at the expense of lower sensitivity. Our optimized algorithm (ratio of type 1 to type 2 codes ≥ 4, and at least 1 insulin prescription within 90 days) produced the most accurate estimates of the proportion of T1D cases across all ages at diagnosis. The complementary performance characteristics of these algorithms can inform their application to future studies, and the choice of algorithm should be tailored to the unique requirements of each study question.

Among children and adolescents, our diabetes classification algorithms performed similarly to others developed in white populations. Using Canadian administrative and prescription data, Vanderloo et al. [14] validated 4 algorithms using a combination of “Status Indian” registration, age < 10 years, and prescriptions to classify diabetes types. Although the sensitivity and PPV for classifying T1D were high (range: 96.9–99.2%), performance for identifying T2D was worse (sensitivity range: 55.4–84.2%; PPV range: 54.7–73.7%) and relied on ethnicity criteria that are not applicable in other populations. In a post-hoc analysis, we modified these algorithms by excluding inapplicable criteria and applied them to our data (Supplementary Tables 9–10, Additional File). These modified algorithms performed identically to our “high sensitivity for type 1” algorithm in classifying T1D (sensitivity 100.0%, 76.8–100.0%; PPV 70.0%, 45.7–88.1%) and T2D (sensitivity 77.8%, 57.7–91.4%; PPV 100.0%, 83.9–100.0%). In the large United States SEARCH for Diabetes in Youth Study (SEARCH), several algorithms were developed to identify diabetes type [13, 15, 16]. The “at least 1 outpatient T1D code” (sensitivity 94.8%, PPV 98.0% in SEARCH) [13] had 100.0% sensitivity (76.8–100.0%) and a better PPV (87.5%, 61.7–98.4%) than our “high sensitivity for type 1” algorithm. Other published SEARCH algorithms requiring the ratio of type 1 to total codes > 0.5 [15] and 0.6 [16] performed identically to our optimized algorithm (sensitivity 85.7–100.0%, PPV 87.5–100.0% for identifying T1D), although the latter algorithm required manual review to assess diabetes type for over a third of cases. The reasonable performance of these other algorithms confirms that T1D can be identified among children and adolescents using administrative and EHR data across different settings. 
Our results extend the literature with an expanded set of algorithms with optimal, maximally sensitive, or maximally predictive characteristics without the use of manual review, which would be unfeasible for large population-based studies.

By contrast, classification accuracy of the algorithms was lower among adults versus children. Previous validation studies including adults are limited. Klompas et al. [12] used a large EHR including primary and specialty care providers to develop and validate a complex algorithm (type 1 to type 2 codes > 0.5 and prescription for glucagon, type 1 to type 2 codes > 0.5 with no oral hypoglycemic other than metformin, C-peptide negative, autoantibodies positive, or prescription for urine acetone test strips) that reported a 65% (36–100%) sensitivity and 88% (78–98%) PPV for T1D and 100% (99–100%) sensitivity and 95% (88–100%) PPV for T2D. A modified version of this algorithm excluding urine acetone test strips was later tested separately [27]. However, these studies are limited by the lack of “and” combinations, and the use of a weighted sampling strategy that could have inflated estimates of PPV [12, 27]. Although algorithm performance in adults was not specifically reported, our post-hoc analysis showed that the algorithm proposed by Klompas et al. [12] (adapted to fit our data; see Supplementary Tables 9–10, Additional File) had decreased sensitivity (62.5%, 24.5–91.5%) and PPV (26.3%, 9.1–51.2%) among adults aged ≥ 40 years at diagnosis versus people aged < 20 years at diagnosis (sensitivity 100.0%, 76.8–100.0%, PPV 93.3%, 68.1–99.8%). The performance of another algorithm developed within a general practice EHR in the UK [26] showed a similar pattern using our data, although the overall performance was worse than our algorithms (sensitivity 39.5%, 25.0–55.6%; PPV 40.5%, 25.6–56.7% at all ages). While these results may be expected based on the rarity of T1D in adulthood, our large study adds a new approach to maximize sensitivity, PPV, or overall accuracy across all ages using different types of combinations. 
Moreover, we confirmed that renal function does not improve algorithm performance in adults, and this may reflect the growing variety of non-insulin agents available for people with diabetes and impaired renal function.

Our study yielded 3 complementary algorithms, the choice of which can be tailored to different study contexts depending on diabetes type, sensitivity, and PPV requirements. The optimized algorithm (ratio of type 1 to type 2 codes ≥ 4, and at least 1 insulin prescription within 90 days) performed highly accurately at ages at diagnosis < 20 years, but it also generated close estimates of the proportion of T1D among adults, as misclassified T1D and T2D cases were approximately balanced. Thus, the optimized algorithm could be applied to diabetes incidence and prevalence studies. Other algorithms may be better suited for cohort studies or other designs. For example, an adult-onset T1D cohort study could use the “high PPV for type 1” algorithm (ratio of type 1 to type 2 codes ≥ 4, and multiple daily injections with no other glucose-lowering medication prescription) to maximize PPV. Alternatively, a case-finding study designed to identify as many people with T1D as possible might apply the “high sensitivity for type 1” algorithm (ratio of type 1 to type 2 codes ≥ 4, or at least 1 insulin prescription within 90 days). A cohort study of T2D among adults could apply the “high PPV for type 2” (equivalent to “high sensitivity for type 1”) algorithm, although all 3 algorithms performed well considering the relatively high T2D prevalence in adults.

Our large register-based validation study is the first to specifically distinguish T1D and T2D in Asians, using routinely available encounter codes and prescriptions in a population-wide EHR within a public universal healthcare context. Unlike previous studies, we demonstrated the critical importance of age at diagnosis, defining separate derivation and validation cohorts to avoid overfitting. However, there are some limitations to note. As in other public healthcare settings, we did not have access to routine autoantibody or C-peptide testing to verify diagnoses of T1D. We could not include the entire HKDSD or externally validate because full chart access was only authorized for the HKDR. However, the HKDR represents a large geographic region of Hong Kong, which has a single publicly administered healthcare system serving its entire population. Although socioeconomic status variables were not captured in our databases, other baseline characteristics were highly similar between the HKDR and HKDSD, supporting the generalizability of our algorithms. Research platforms such as the HA’s Data Collaboration Lab should allow more comprehensive use of EHR data to improve diabetes classification using more complex methodologies and to enhance population research [32,33,34].

Conclusions

In summary, we developed and validated a set of algorithms to accurately classify diabetes type for different ages at diagnosis using population-level health data. As EHRs become increasingly available, our approach may be applied to generate similar algorithms in other settings. These algorithms can be applied to future studies to characterize incidence, prevalence, and other statistics separately for T1D and T2D—especially in China and other populations where these statistics have never been measured [11].

Availability of data and materials

The data set supporting the conclusions of this article is not publicly available. Interested researchers may apply for access through Shirley Au, for the Secretary of the Central Panel on Administrative Assessment of External Data Requests, Hospital Authority, Hong Kong Special Administrative Region (e-mail, hacpaaedr@ha.org.hk).

Abbreviations

A1C: Glycated haemoglobin A1c
DPP-4: Dipeptidyl peptidase-4
eGFR: Estimated glomerular filtration rate
EHR: Electronic health record
FN: False negative
FP: False positive
GLP-1: Glucagon-like peptide-1
HA: Hong Kong Hospital Authority
HDL-C: High-density lipoprotein cholesterol
HKDR: Hong Kong Diabetes Registry
HKDSD: Hong Kong Diabetes Surveillance Database
ICD-9: International Statistical Classification of Diseases and Related Health Problems version 9
IQR: Interquartile range
LDL-C: Low-density lipoprotein cholesterol
NPV: Negative predictive value
PPV: Positive predictive value
RAS: Renin-angiotensin system
SEARCH: SEARCH for Diabetes in Youth Study
SGLT2: Sodium-glucose transport protein 2
T1D: Type 1 diabetes
T2D: Type 2 diabetes
TN: True negative
TP: True positive

References

  1. Ke C, Sohal P, Qian H, Quan H, Khan NA. Diabetes in the young: a population-based study of south Asian, Chinese and White people. Diabet Med. 2015;32:487–96.

  2. Hux JE, Ivis F, Flintoft V, Bica A. Diabetes in Ontario. Diabetes Care. 2002;25:512–6.

  3. Lipscombe LL, Hwee J, Webster L, Shah BR, Booth GL, Tu K. Identifying diabetes cases from administrative data: a population-based validation study. BMC Health Serv Res. 2018;18:316.

  4. Guttmann A, Nakhla M, Henderson M, To T, Daneman D, Cauch-Dudek K, et al. Validation of a health administrative data algorithm for assessing the epidemiology of diabetes in Canadian children. Pediatr Diabetes. 2010;11:122–8.

  5. Green A, Sortsø C, Jensen PB, Emneus M. Validation of the Danish National Diabetes Register. Clin Epidemiol. 2014;7:5–15.

  6. Holt RIG, Cockram C, Flyvbjerg A, Goldstein BJ. Textbook of diabetes. 5th ed. Chichester, West Sussex, UK; Hoboken, NJ: Wiley-Blackwell; 2017.

  7. Ke C, Morgan S, Smolina K, Gasevic D, Qian H, Khan N. Mortality and cardiovascular risk of sulfonylureas in south Asian, Chinese and other Canadians with diabetes. Can J Diabetes. 2017;41:150–5.

  8. Ke CH, Morgan S, Smolina K, Gasevic D, Qian H, Khan NA. Is cardiovascular risk reduction therapy effective in south Asian, Chinese and other patients with diabetes? A population-based cohort study from Canada. BMJ Open. 2017;7:e013808.

  9. Luk AOY, Lau ESH, So W-Y, Ma RCW, Kong APS, Ozaki R, et al. Prospective study on the incidences of cardiovascular-renal complications in Chinese patients with young-onset type 1 and type 2 diabetes. Diabetes Care. 2014;37:149–57.

  10. Saydah S, Imperatore G. Emerging approaches in surveillance of type 1 diabetes. Curr Diab Rep. 2018;18:61.

  11. Wang L, Gao P, Zhang M, Huang Z, Zhang D, Deng Q, et al. Prevalence and ethnic pattern of diabetes and prediabetes in China in 2013. JAMA. 2017;317:2515–23.

  12. Klompas M, Eggleston E, McVetta J, Lazarus R, Li L, Platt R. Automated detection and classification of type 1 versus type 2 diabetes using electronic health record data. Diabetes Care. 2013;36:914–21.

  13. Lawrence JM, Black MH, Zhang JL, Slezak JM, Takhar HS, Koebnick C, et al. Validation of pediatric diabetes case identification approaches for diagnosed cases by using information in the electronic health records of a large integrated managed health care organization. Am J Epidemiol. 2014;179:27–38.

  14. Vanderloo SE, Johnson JA, Reimer K, McCrea P, Nuernberger K, Krueger H, et al. Validation of classification algorithms for childhood diabetes identified from administrative data. Pediatr Diabetes. 2012;13:229–34.

  15. Zhong VW, Pfaff ER, Beavers DP, Thomas J, Jaacks LM, Bowlby DA, et al. Use of administrative and electronic health record data for development of automated algorithms for childhood diabetes case ascertainment and type classification: the SEARCH for diabetes in youth study. Pediatr Diabetes. 2014;15:573–84.

  16. Zhong VW, Obeid JS, Craig JB, Pfaff ER, Thomas J, Jaacks LM, et al. An efficient approach for surveillance of childhood diabetes by type derived from electronic health record data: the SEARCH for diabetes in youth study. J Am Med Inform Assoc. 2016;23:1060–7.

  17. Park Y. Why is type 1 diabetes uncommon in Asia? Ann N Y Acad Sci. 2006;1079:31–40.

  18. Quan J, Li TK, Pang H, Choi CH, Siu SC, Tang SY, et al. Diabetes incidence and prevalence in Hong Kong, China during 2006–2014. Diabet Med. 2017;34:902–8.

  19. Leung GM, Tin KY, Chan W-S. Hong Kong’s health spending projections through 2033. Health Policy. 2007;81:93–101.

  20. World Health Organization. Use of glycated haemoglobin (HbA1c) in the diagnosis of diabetes mellitus. Diabetes Res Clin Pract. 2011;93:299–309.

  21. World Health Organization, International Diabetes Federation. Definition and diagnosis of diabetes mellitus and intermediate hyperglycaemia: report of a WHO/IDF consultation. 2006. http://www.who.int/diabetes/publications/diagnosis_diabetes2006/en/. Accessed 13 Dec 2019.

  22. International Association of Diabetes and Pregnancy Study Groups Consensus Panel. International Association of Diabetes and Pregnancy Study Groups recommendations on the diagnosis and classification of hyperglycemia in pregnancy. Diabetes Care. 2010;33:676–82.

  23. Piwernetz K, Home PD, Snorgaard O, Antsiferov M, Staehr-Johansen K, Krans M, et al. Monitoring the targets of the St Vincent declaration and the implementation of quality management in diabetes care: the DiabCare initiative. Diabet Med. 1993;10:371–7.

  24. Chan JCN, So W, Ma RCW, Tong PCY, Wong R, Yang X. The complexity of vascular and non-vascular complications of diabetes: the Hong Kong diabetes registry. Curr Cardiovasc Risk Rep. 2011;5:230–9.

  25. Jones AG, Hattersley AT. The clinical utility of C-peptide measurement in the care of patients with diabetes. Diabet Med. 2013;30:803–17.

  26. Sharma M, Petersen I, Nazareth I, Coton SJ. An algorithm for identification and classification of individuals with type 1 and type 2 diabetes mellitus in a large primary care database. Clin Epidemiol. 2016;8:373–80.

  27. Schroeder EB, Donahoo WT, Goodrich GK, Raebel MA. Validation of an algorithm for identifying type 1 diabetes in adults based on electronic health record data. Pharmacoepidemiol Drug Saf. 2018;27:1053–9.

  28. German RR, Lee LM, Horan JM, Milstein RL, Pertowski CA, Waller MN, et al. Updated guidelines for evaluating public health surveillance systems: recommendations from the Guidelines Working Group. Morb Mortal Wkly Rep. 2001;50(RR-13):1–35.

  29. Marshall RJ. The predictive value of simple rules for combining two diagnostic tests. Biometrics. 1989;45:1213–22.

  30. Cebul RD, Hershey JC, Williams SV. Using multiple tests: series and parallel approaches. Clin Lab Med. 1982;2:871–90.

  31. Feuerman M, Miller AR. The kappa statistic as a function of sensitivity and specificity. Int J Math Educ Sci Technol. 2005;36:517–27.

  32. Kharrazi H, Weiner JP. IT-enabled community health interventions: challenges, opportunities, and future directions. eGEMs. 2014;2. https://doi.org/10.13063/2327-9214.1117.

  33. Kharrazi H, Gonzalez CP, Lowe KB, Huerta TR, Ford EW. Forecasting the maturation of electronic health record functions among US hospitals: retrospective analysis and predictive model. J Med Internet Res. 2018;20. https://doi.org/10.2196/10458.

  34. Kharrazi H, Chi W, Chang H-Y, Richards T, Gallagher J, Knudson S, et al. Comparing population-based risk-stratification model performance using demographic, diagnosis and medication data extracted from outpatient electronic health records versus administrative claims. Med Care. 2017;55:789–96.

Acknowledgements

The authors would like to thank the Hong Kong Hospital Authority for providing data for this study.

Funding

CK is supported by the Canadian Institutes of Health Research Canada Graduate Scholarship and Michael Smith Foreign Study Supplements, the University of Toronto Clinician Investigator Program, the Canadian Society of Endocrinology and Metabolism Dr. Fernand Labrie Research Fellowship Grant, and the Royal College of Physicians and Surgeons of Canada Detweiler Traveling Fellowship. The funding bodies had no role in the design of the study, data collection, analysis, interpretation of data, or writing of the manuscript.

Author information

Contributions

CK, AL, TAS, BRS, and JCNC conceived and designed the study. CK, AL, TAS, BRS, JCNC, and PJ analyzed and interpreted the data. CK drafted the article. CK, TAS, AL, BRS, PJ, EL, RCWM, WYS, APK, EC, and JCNC revised the article for important intellectual content and approved the final article.

Corresponding author

Correspondence to Calvin Ke.

Ethics declarations

Ethics approval and consent to participate

The research was approved by the Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee.

Consent for publication

Not applicable.

Competing interests

JCNC is the Chief Executive Officer, on a pro bono basis, of the Asia Diabetes Foundation (ADF), a non-profit research organization that designed and implemented the Joint Asia Diabetes Evaluation (JADE) Technology as an extension to the HKDR, under the governance of the CUHK Foundation.

The HKDR was established as a research-driven quality improvement program initiated by the Chinese University of Hong Kong (CUHK)-Prince of Wales Hospital Diabetes Care and Research Team, supported by the Hong Kong Foundation for Research and Development in Diabetes established at CUHK. In 2007, this was merged with the web-based JADE Technology, complete with care protocols, risk stratification, personalized reporting and decision support. The JADE Technology was designed and implemented by the ADF to enable other clinics and hospitals to establish diabetes registers and contribute anonymized data for research purposes. The ADF was set up as a charitable research organization governed by the CUHK Foundation.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

Demographic characteristics of the New Territories East Cluster (NTEC) population and the overall Hong Kong population [2, 3]

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Ke, C., Stukel, T.A., Luk, A. et al. Development and validation of algorithms to classify type 1 and 2 diabetes according to age at diagnosis using electronic health records. BMC Med Res Methodol 20, 35 (2020). https://doi.org/10.1186/s12874-020-00921-3

  • DOI: https://doi.org/10.1186/s12874-020-00921-3

Keywords