
Using quantile regression to investigate racial disparities in medication non-adherence



Many studies have investigated racial/ethnic disparities in medication non-adherence in patients with type 2 diabetes using common measures such as medication possession ratio (MPR) or gaps between refills. All these measures including MPR are quasi-continuous and bounded and their distribution is usually skewed. Analysis of such measures using traditional regression methods that model mean changes in the dependent variable may fail to provide a full picture about differential patterns in non-adherence between groups.


A retrospective cohort of 11,272 veterans with type 2 diabetes was assembled from Veterans Administration datasets from April 1996 to May 2006. The main outcome measure was MPR with quantile cutoffs Q1-Q4 taking values of 0.4, 0.6, 0.8 and 0.9. Quantile-regression (QReg) was used to model the association between MPR and race/ethnicity after adjusting for covariates. Comparison was made with commonly used ordinary-least-squares (OLS) and generalized linear mixed models (GLMM).


Quantile-regression showed that Non-Hispanic-Black (NHB) had lower MPR compared to Non-Hispanic-White (NHW), holding all other variables constant, across all quantiles, with estimates and p-values given as -3.4% (p = 0.11), -5.4% (p = 0.01), -3.1% (p = 0.001), and -2.00% (p = 0.001) for Q1 to Q4, respectively; the difference did not reach statistical significance in Q1. Other racial/ethnic groups had lower adherence than NHW only in the lowest quantile (Q1) of about -6.3% (p = 0.003). In contrast, OLS and GLMM only showed differences in mean MPR between NHB and NHW while the mean MPR difference between other racial groups and NHW was not significant.


Quantile regression is recommended for analysis of data that are heterogeneous such that the tails and the central location of the conditional distributions vary differently with the covariates. QReg provides a comprehensive view of the relationships between independent and dependent variables (i.e. not just centrally but also in the tails of the conditional distribution of the dependent variable). Indeed, without performing QReg at different quantiles, an investigator would have no way of assessing whether a difference in these relationships might exist.



Diabetes is a chronic debilitating illness that affects approximately 24 million people in the United States [1]. Medication adherence is an important component of good diabetes care and medication non-adherence is associated with poor glycemic control [2, 3], increased health utilization [4, 5], increased health care costs [6, 7], and increased risk of death [5]. African Americans and other ethnic minority groups have higher prevalence of diabetes and are at increased risk for poor outcomes from diabetes [1]. Multiple recent studies have shown that ethnic minority groups with diabetes have poorer glycemic, lipid, and blood pressure control compared to Whites [8]. There are also data that suggest a correlation between ethnic differences in diabetes outcomes (e.g., glycemic, lipid, and blood pressure control) and ethnic differences in medication adherence [9]. Therefore, medication non-adherence is an important risk factor for poor diabetes outcomes, especially in ethnic minority groups.

Several methods exist to assess medication adherence including patient self-report, pill counts, physician/nurse report, pharmacy refill data, electronic monitoring, and biological assays [10]. The most commonly used methods use pharmacy refill data and provide reliable estimates of medication adherence [10]. Common methods for assessing medication non-adherence with pharmacy refill data include continuous measure of medication acquisition (CMA), continuous multiple intervals of oversupply (CMOS), medication possession ratio (MPR), and medication refill adherence (MRA), which have all been shown to be identical in terms of measuring adherence to prescription refills over a study period [11].

While the literature on ethnic/racial disparities in medication adherence is scant, some studies using pharmacy refill data from administrative databases have documented ethnic differences in medication adherence among individuals with diabetes [12-14]. However, the magnitude of these racial/ethnic differences is unclear, especially across ranges of medication adherence (e.g. 40% vs. 60% vs. 80%). In addition, it is not clear if the findings of prior studies are reliable given some methodological weaknesses. For example, most prior studies used traditional regression methods that may not be valid if certain assumptions are not satisfied. Some studies used linear regression, which requires the residuals to be normally distributed and homoscedastic [5, 9]. Others have used logistic regression after categorization of the outcome [4, 12, 14], which could lead to arbitrary choice of categories such that results could be sensitive to choice of cutoff values. These methods also may not capture the effect of covariates on the entire distribution of the response variable.

While both linear and logistic regression focus on differences in means associated with covariates, quantile regression allows for studying different directions of the effects of a covariate on different parts of the distribution (lower and upper tails, middle part). Furthermore, quantile regression makes use of the full information of data in contrast to logistic regression, which is usually associated with a loss of information due to transformation of the response MPR into a categorical variable (e.g., binary variable with cutoff at 80%). More importantly, MPR is a quasi-continuous variable that takes on values that are bounded (i.e., have lower and/or upper bounds) and hence traditional methods that use mean changes of the dependent variable with changes in the independent variables may fail to discern differential patterns in non-adherence across racial/ethnic groups. Therefore, the aims of this study were twofold. The first was to examine racial differences in medication non-adherence using quantile regression. The second was to demonstrate through empirical evidence how the choice of a regression method (e.g., QReg, OLS or GLMM) could result in different conclusions for response variables like MPR, which usually have skewed distributions and take on bounded values. We hypothesized that QReg provides estimates of the effect of covariates on the conditional quantiles of MPR, leading to a more complete picture of the differences between race/ethnicity groups over the entire distribution of MPR including the tails and center of the conditional distribution.


We created a cohort of veterans with type 2 diabetes from a Veterans Administration (VA) facility in the Southeastern United States using multiple patient and administrative files from the Veterans Health Administration (VHA) Decision Support System (DSS) files linked by Social Security Number (SSN). The study period was from April 1996 to May 2006 with an average follow up period of 5.4 years. The datasets were merged, cleaned and then used as the final dataset for analysis. Veterans with type 2 diabetes were identified based on having at least two ICD-9 codes for diabetes (250.xx) in either outpatient or inpatient files and having two or more visits each year since diagnosis based on a previously validated algorithm [15]. The datasets were merged to create a subset that only included individuals with complete adherence data, resulting in a cohort of 11,272 veterans with type 2 diabetes, of which 5,307 were non-Hispanic White (NHW), 3,061 were non-Hispanic Black (NHB), 51 were Hispanic and 1,879 were identified as Other ethnic/racial group. There were also 974 (8.6%) with missing or unknown race/ethnicity information. The study was approved by our institutional review board (IRB) and local VA Research and Development committee.

Outcome Measures

The primary outcome was the mean medication possession ratio (MPR). MPR informs patient medication adherence by providing the ratio of the number of days of medication supplied within a refill interval to the number of days in a specified refill interval [16, 17]. We calculated the number of eligible days per medication within each 90-day refill period per patient. We considered supply of insulin and oral hypoglycemic agents (VA classes HS501 and HS502, respectively). The sum of eligible days served as the denominator for the MPR calculation [18]. The average MPR was calculated over the follow up period from 1996-2006. Prescriptions that became inactive during that time period did not contribute to the MPR calculation. We chose 90-day intervals because veterans typically have a 90-day supply of medications mailed to their homes. If the MPR exceeded 100%, it was set to 100%.
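
The calculation described above can be sketched in Python. This is only an illustration under stated assumptions, not the study's code: the function names, the input layout (one pair of days supplied and eligible days per 90-day interval), the choice to cap each interval before averaging, and the example numbers are all hypothetical.

```python
def interval_mpr(days_supplied, eligible_days):
    """MPR for one refill interval, capped at 1.0 (i.e. 100%)."""
    if eligible_days <= 0:
        raise ValueError("eligible_days must be positive")
    return min(days_supplied / eligible_days, 1.0)


def mean_mpr(intervals):
    """Average the capped interval MPRs over a patient's follow-up."""
    ratios = [interval_mpr(supplied, eligible) for supplied, eligible in intervals]
    return sum(ratios) / len(ratios)


# Hypothetical patient: (days supplied, eligible days) for four 90-day intervals.
patient = [(90, 90), (60, 90), (100, 90), (45, 90)]
patient_mpr = mean_mpr(patient)
print(f"mean MPR = {patient_mpr:.3f}")
```

The third interval shows the cap at work: 100 days supplied over 90 eligible days would give a ratio above 1 and is set to 1.0 before averaging.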

Primary Covariate

The primary covariate of interest was race/ethnicity classified as NHW, NHB, and Other (including unknown and missing).

Demographic Variables

We controlled for three demographic variables in addition to the primary covariate. Age at baseline was treated as a continuous variable and centered at its mean value. Marital status was classified as never married, married (reference category), or separated/widowed/divorced. Employment was classified as employed, not employed (reference category), or retired.

Medical Comorbidity

Cancer, congestive heart failure (CHF), coronary heart disease (CHD), hypertension, and stroke were defined based on enhanced ICD-9 codes using validated algorithms [19] and coded as 0 or 1 based on presence or absence of history of the disease at baseline.

Psychiatric Comorbidity

Six psychiatric comorbidities including bipolar disorder, generalized anxiety disorder, major depressive disorder, post-traumatic stress disorder, psychotic disorders, and substance use disorder were defined as present (1) or absent (0) at baseline based on enhanced ICD-9 codes using validated algorithms [19].

Statistical analysis

First, we examined the characteristics of the sample through univariate analysis. This step was followed by pre-model building analysis, which included testing whether each covariate was individually associated with the outcome. To assess whether the relationship between age and MPR was non-linear, we examined the significance of a quadratic term for age. Next, a final model investigating the association between MPR and race/ethnicity was developed adjusting for all covariates such as demographics, medical comorbidities, and psychiatric comorbidities.

For quantile regression analysis, the response variable, MPR, was defined as the quantile of the mean medication possession ratio for each individual averaged over the study period. The specifications of the unconditional quantiles were made in two different ways: Scenario 1) the quantiles were specified based on clinically meaningful specific MPR cutoff values: Q1 = 0.40, Q2 = 0.60, Q3 = 0.80, Q4 = 0.90 where the values corresponded to the 2nd, 4th, 15th and 27th percentiles of the distribution of MPR and Scenario 2) the quantiles were based on the distribution of MPR values where the 5th, 10th, 15th, 25th and 50th percentiles were considered. These unconditional percentiles corresponded to MPR cutoff values of Q1 = 0.66, Q2 = 0.75, Q3 = 0.80, and Q4 = 0.88 and Q5 = 0.97, respectively.
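
The two scenarios move in opposite directions between MPR cutoffs and percentiles: Scenario 1 starts from a cutoff and asks which percentile it corresponds to, while Scenario 2 starts from a percentile and asks for the cutoff. A minimal Python sketch of both mappings follows; the helper names and the ten MPR values are hypothetical and are not the study data.

```python
import math


def percentile_rank(values, cutoff):
    """Fraction of observations at or below the cutoff (empirical CDF)."""
    return sum(v <= cutoff for v in values) / len(values)


def empirical_quantile(values, tau):
    """Smallest observed y whose empirical CDF is at least tau."""
    ordered = sorted(values)
    # The small offset guards against floating-point round-off in tau * n.
    k = max(math.ceil(tau * len(ordered) - 1e-9), 1)
    return ordered[k - 1]


# Hypothetical left-skewed MPR values, bounded above at 1.0.
mpr = [0.35, 0.55, 0.70, 0.82, 0.88, 0.91, 0.95, 0.97, 1.00, 1.00]

# Scenario 1 direction: fixed cutoff -> percentile of the distribution.
rank_at_080 = percentile_rank(mpr, 0.80)

# Scenario 2 direction: fixed percentile -> MPR cutoff.
cutoff_at_25 = empirical_quantile(mpr, 0.25)
cutoff_at_50 = empirical_quantile(mpr, 0.50)
print(rank_at_080, cutoff_at_25, cutoff_at_50)
```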

Quantile regression is used to model the effects of covariates on the conditional quantiles of a response variable [20]. This approach is a robust method that makes no distributional assumption about the error term in a model. It is also robust to extreme points in the response space (outliers) but not to extreme points in the covariate space (leverage points). Confidence intervals for the estimated parameters in QReg are based on inversion of a rank test [21, 22].
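
The robustness to extreme responses can be seen in the simplest case, comparing the mean with the median (the 0.5 quantile) of a small sample. The values below are hypothetical and serve only to illustrate the point; they say nothing about leverage points in the covariate space.

```python
import statistics

# Hypothetical MPR values for five patients.
clean = [0.82, 0.88, 0.91, 0.95, 0.97]

# Replace the lowest value with an extreme one (a very low adherence).
contaminated = [0.05, 0.88, 0.91, 0.95, 0.97]

mean_shift = statistics.mean(contaminated) - statistics.mean(clean)
median_shift = statistics.median(contaminated) - statistics.median(clean)

print(f"change in mean:   {mean_shift:+.3f}")
print(f"change in median: {median_shift:+.3f}")
```

The mean drops noticeably while the median does not move, because the median depends only on the ordering of the observations around it and not on how far the extreme value lies from the rest.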

Quantile Regression Model

For a random response variable Y with cumulative distribution function F(y) = Prob(Y ≤ y), the τth quantile of Y is defined as the inverse function Q(τ) = inf {y : F(y) ≥ τ}, where 0 < τ < 1. Let X = (x_1, ..., x_n) denote the matrix consisting of the n observed covariate vectors, and let Y = (y_1, ..., y_n) denote the n observed responses. The model for linear quantile regression is given by y_i = x_i'β_τ + ε_i, where β_τ = (β_{1τ}, ..., β_{pτ}) is the unknown p-dimensional vector of parameters and ε = (ε_1, ..., ε_n) is the n-dimensional vector of unknown errors (assumption: the τth quantile of ε_i is zero). The estimate of β_τ is a solution of the standard quantile regression minimization problem,

$$\min_{\beta \in \mathbb{R}^{p}} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i'\beta\right), \qquad \rho_\tau(u) = u\,\bigl(\tau - I(u < 0)\bigr),$$

where ρ_τ is the check loss and I(·) is the indicator function.

The special case τ = 0.5 is equivalent to median regression. We used the finite smoothing algorithm [23, 24] to compute the solution of this equation so that the Newton-Raphson algorithm could be used iteratively to obtain the solution after a finite number of loops. The regression coefficient at a given quantile (β τ ) indicates the effect on Y of a unit change in X, assuming that the other factors are fixed.
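
To make the link between the quantile definition and the estimation problem concrete, the sketch below uses the standard check (pinball) loss ρ_τ(u) = u(τ − I(u < 0)) in the intercept-only case, where minimizing the summed loss recovers a sample quantile. It uses a brute-force search over the observed values, which is an assumption made for clarity and is not the finite smoothing algorithm used in the study; the function names and data are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss: u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(values, q, tau):
    """Summed check loss of the residuals y - q."""
    return sum(check_loss(y - q, tau) for y in values)


def quantile_by_check_loss(values, tau):
    """Minimise the summed check loss over the observed values.

    A minimiser always exists among the data points, so searching over
    them is enough for this intercept-only illustration.
    """
    return min(sorted(set(values)), key=lambda q: total_loss(values, q, tau))


# Hypothetical MPR values (nine observations, so the median is unique).
y = [0.35, 0.55, 0.70, 0.82, 0.88, 0.91, 0.95, 0.97, 1.00]

median_fit = quantile_by_check_loss(y, 0.5)
lower_fit = quantile_by_check_loss(y, 0.25)
print(median_fit, lower_fit)
```

With τ = 0.5 the loss is proportional to the absolute error, which is why this case reduces to median regression. With covariates, the same loss is minimized over the coefficient vector rather than over a single constant.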

Both unadjusted and covariate adjusted models were fitted with MPR as the response variable and race/ethnicity as primary variable of interest. Since our sample size is sufficiently large, the final model was adjusted for all covariates including demographic variables such as age, gender, marital status, employment status and medical and psychiatric comorbidities [25]. All models were assessed for goodness-of-fit using residual analysis. In addition, QReg was assessed using robust multivariate location and scale estimates for leverage point detection [26].

PROC QUANTREG in SAS 9.2 (SAS Institute Inc., Cary NC) was used to compute the regression models and to conduct statistical inferences on the estimated parameters. Verification for all QReg models was performed using the R [27] quantreg package.

Ordinary Least Squares (OLS)

SAS Proc GLM was used to estimate the parameters of a multiple regression model where the errors for different observations were assumed to be uncorrelated with identical variances (homoscedastic). Under these assumptions, OLS provides estimates of the linear parameters that are unbiased and have minimum variance among linear estimators. Residual plots were used to assess these assumptions but they did not hold true for our data.

Generalized linear Mixed Model (GLMM)

This model extends the above model by allowing a more flexible specification of the covariance matrix of the error terms. In other words, it allows for both correlation and heterogeneous variances, although requires normality assumption [28] which did not hold true for our data. SAS Proc GLIMMIX was used to estimate the parameters of a linear mixed model with a random intercept. This specification allowed different subjects to have different baseline MPR values. The same sets of covariates were used in OLS, GLMM and QReg.

Comparison of statistical methods (QReg, OLS, GLMM)

The second aim was addressed using empirical studies based on re-sampling of the data with replacement. Traditionally, Monte-Carlo simulation studies based on data generated from statistical models have been used for this kind of comparative study. Resampling has the advantage that the data in resampled datasets are based on observations from real patients [29] and thus reflect the appropriate level of diversity and variability found in realistic populations [30, 31]. Sampling with replacement was used since our dataset can be considered large to permit numerous samples of reasonable size to obtain stable conclusions within the smaller samples. Each dataset in the resampling study consisted of 5,000 patients, which represents many of the typical studies that use regional VA data. In order to robustly and accurately estimate the parameters, a total of 10,000 bootstrap replications were performed. The final estimates of the parameters and their standard errors were obtained using means and standard deviations of the 10,000 parameter estimates. Additionally, we computed exact percentiles (e.g., 97.5%; 2.5%) for constructing empirical confidence intervals.
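
The resampling scheme can be sketched as follows. This is a scaled-down illustration, not the study's procedure: the data are simulated from a beta distribution as a stand-in for skewed, bounded MPR values, the estimator is a plain sample median rather than a regression coefficient, the replication count and sample sizes are much smaller than the 10,000 replications of 5,000 patients described above, and the percentile indexing is one simple convention among several.

```python
import random
import statistics


def bootstrap_summary(data, estimator, n_boot=1000, sample_size=None, seed=0):
    """Resample with replacement and summarise the estimator's distribution."""
    rng = random.Random(seed)
    size = sample_size or len(data)
    estimates = sorted(
        estimator(rng.choices(data, k=size)) for _ in range(n_boot)
    )
    # Empirical 2.5% and 97.5% points of the bootstrap distribution.
    lower = estimates[round(0.025 * n_boot)]
    upper = estimates[round(0.975 * n_boot) - 1]
    return {
        "estimate": statistics.mean(estimates),  # mean of replicate estimates
        "se": statistics.stdev(estimates),       # bootstrap standard error
        "ci": (lower, upper),                    # percentile interval
    }


# Simulated stand-in for skewed, bounded MPR values (hypothetical data).
data_rng = random.Random(1)
mpr_values = [data_rng.betavariate(8, 1.5) for _ in range(300)]

summary = bootstrap_summary(mpr_values, statistics.median, sample_size=200)
print(summary)
```

Drawing each replicate smaller than the full dataset mirrors the design choice described above, where resampled datasets were sized to resemble typical regional studies.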


Table 1 shows the socio-demographic characteristics for the 11,272 veterans with type 2 diabetes included in this sample. Approximately 97% were male with 47% being NHW and 27% NHB. The mean age was 66 years. The most prevalent medical comorbidities were hypertension (26%), CHD (14%) and CHF (8%). The most prevalent psychiatric comorbidities were substance use disorder (14%) and MDD (8%). During the study period the overall mortality was 16%. The mean HbA1c value was 7.0% (sd = 0.9%). Most Veterans (88.4%) had HbA1c values ≤ 8.0%. The mean (sd) MPR values for NHW, NHB and Others were 91.2% (0.2), 88.7% (0.3) and 90.7% (0.3), respectively. Figure 1, a density plot of MPR by race, shows the highly skewed nature of the distribution of MPR by race/ethnicity.

Table 1 Sample Characteristics by Race and Ethnicity (n = 11,272)
Figure 1

Distribution of Medication Possession Ratio (MPR) by Race/Ethnicity (Non-Hispanic White, Non-Hispanic Black, Other groups). dotted line = Other, dashed line = Non-Hispanic Black, solid line = Non-Hispanic White

We focus the description of quantile regression results on Scenario 1 since the results on Scenario 2 were qualitatively similar and also because most clinicians are interested in this scenario. In Figure 2, results comparing quantile regression with ordinary least square (OLS) regression are shown. While the curves across age for OLS are similar for all three race groups showing smaller racial/ethnic differences in mean MPR that decreased with age, the curves for QReg clearly indicate differences in MPR across race groups particularly in the lower quantiles of the MPR distribution. The differences are more pronounced in the three lower quantiles. The difference in MPR disappears with higher age in almost all the quantiles of medication adherence.

Figure 2

Distribution of Predicted Mean Medication Possession Ratio (MPR) by age for each type of model (Quantile Regression versus OLS). OLS = ordinary least squares. Qi = ith quantile (i = 1, ..., 4): Quantiles are based on unconditional MPR cutoff values: 0.4, 0.6, 0.8 and 0.9.

In Table 2, the intercept in the first panel is interpreted as the estimated conditional quantile function of the MPR distribution of a type 2 diabetes patient who was female, NHW, married, unemployed, with no history of medical or psychiatric comorbidity and was at the average age of the study population (age = 66 years, since age was centered at 66). In this adjusted QReg model, NHW had consistently higher MPR over all quantiles compared to NHB, and over quantiles 1 and 2 compared to Other (other racial groups). Compared to NHB, NHW had 3.4% (p = 0.11) higher MPR in the first quantile (Q1), 5.4% (p < 0.01) in the second quantile (Q2), 3.1% (p < 0.001) in the third quantile (Q3) and 2.0% (p < 0.001) in the fourth quantile (Q4). Similarly, compared to Other race groups, NHW had 6.3% (p < 0.001) higher MPR in the first quantile (Q1) and 3.8% (p = 0.09) in the second quantile (Q2). The mean MPR values were also higher for NHW compared to NHB (1.4%, p < 0.001) as shown in the results for OLS and GLMM. However, the mean MPR difference between NHW and Other races was not significant (0.10%, p = 0.74).

Table 2 Adjusted parameter estimates (β) and p-values for quantile regression, ordinary least-squares regression, and the generalized linear mixed model

On the other hand, in the unadjusted model (see additional file 1, table S3), compared to NHB, NHW had 16.67% (p < 0.001) higher MPR in the first quantile (Q1), 9.47% (p < 0.001) in the second quantile (Q2), 4.76% (p < 0.001) in the third quantile (Q3) and 2.63% (p < 0.001) in the fourth quantile (Q4). Similarly, compared to Other race groups, NHW had 16.67% (p < 0.001) higher MPR in the first quantile (Q1) and 8.087% (p = 0.004) in the second quantile (Q2). The mean MPR values were also higher for NHW compared to NHB (1.99%, p < 0.001) as shown in the results for OLS and GLMM. However, the mean MPR difference between NHW and Other races was not significant (0.103%, p = 0.769). Age showed a statistically significant quadratic relationship with MPR across all quantiles in the QReg as well as in the OLS and GLMM models. Divorced veterans had statistically significantly lower MPRs in quantiles 2, 3 and 4 while single veterans had lower MPRs in quantiles 2 and 4. Veterans who were employed had higher MPR compared to unemployed veterans (quantiles 1 and 3), while retired veterans had higher MPRs in quantiles 3 and 4 compared to their unemployed counterparts. Veterans with a diagnosis of cancer had lower MPRs in the first two quantiles while veterans diagnosed with CHD had higher MPRs in these two quantiles and veterans with hypertension had lower MPRs in the highest quantile only compared to their counterparts without these comorbidities. Poor HbA1c control was positively associated with MPR in the first two quantiles (i.e., veterans in poor control had higher MPR in quantiles 1 and 2) but negatively associated with MPR in quantile 4 (i.e., veterans with poor control had lower MPR). Substance use disorder showed a statistically significant relationship with MPR in the lowest and the two highest quantiles but not in the second.
In contrast, neither OLS nor GLMM showed significant differences by gender, cancer or CHD, missing the significant differences in the lower tail of the distribution of MPR (Q1 or Q2). Table 3 shows the adjusted model from the bootstrap studies. The interpretation of the regression coefficients is similar to that in Table 2 except that these are values averaged over 10,000 bootstrapped datasets. These are computed to address concerns about possible underestimation of the asymptotic standard errors (ASE) from QReg and to facilitate comparison among the different approaches. As expected, the bootstrap standard errors were larger than the ASEs but the conclusions were qualitatively similar (see Table 1). NHB had statistically significantly lower MPR than NHW in the 3rd and 4th quantiles, holding all other variables constant, but not in the two lowest quantiles. For example, in Q2 NHB had lower MPR compared to NHW with a difference of -4.5% (95% CI: -10.9%, 1.7%), an interval that includes zero. The differences were -3.0% (-2.9%, -5.6%) and -1.9% (-3.3%, -0.54%) in Q3 and Q4, respectively.

Table 3 Mean parameter estimates (β) with corresponding 2.5% and 97.5% quantiles from a bootstrap study of 10,000 replications with sample size n = 5000

An additional set of analyses was performed using the second set of quantiles determined from the distribution of MPR, or Scenario 2 (see Figure 3 and additional file 1, additional tables S1, S3a, and S4a). Overall, the results were qualitatively similar. Additional tables with bootstrap-based parameter estimates and corresponding 95% CIs are reported (see additional file 1, tables S2, S3b, and S4b).

Figure 3

Distribution of Predicted Mean Medication Possession Ratio (MPR) by age for each type of model (Quantile Regression versus OLS). OLS = ordinary least squares. Qi = ith quantile (i = 1, ..., 5): Quantiles are based on unconditional MPR cutoff values: 0.33, 0.48, 0.61, 0.72 and 0.94.


The findings of this study show that the choice of regression method in the study of non-normal, semi-continuous and bounded responses can influence whether disparities between different racial groups are uncovered. In this large cohort of Veterans with diabetes, differences in the lower tails of the distribution of MPR by race and by comorbidities such as CHD might not have been discovered using OLS or GLMM methods, but were identified using quantile regression. While the regression coefficients of race in both OLS and GLMM indicate only the differences in mean MPR (i.e. the covariate effect in the central portion of the MPR distribution), the most clinically relevant differences, which were found in the tails of the distribution of MPR (those that are low or high in adherence), were detected only through testing of the significance of the regression coefficients in the lower and upper quantiles of the QReg model.
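
A toy example can make this contrast concrete. With a single binary group indicator and no other covariates, the OLS slope equals the difference in group means, and a quantile regression slope equals a difference in group quantiles (up to the usual non-uniqueness of sample quantiles). The sketch below compares those differences directly; the two groups, their values, and the helper name are hypothetical and are not drawn from the study data.

```python
import math
import statistics


def empirical_quantile(values, tau):
    """Smallest observed y whose empirical CDF is at least tau."""
    ordered = sorted(values)
    # The small offset guards against floating-point round-off in tau * n.
    k = max(math.ceil(tau * len(ordered) - 1e-9), 1)
    return ordered[k - 1]


# Hypothetical MPR values for a reference and a comparison group.
reference = [0.80, 0.85, 0.88, 0.90, 0.92, 0.94, 0.95, 0.96, 0.98, 1.00]
comparison = [0.50, 0.60, 0.90, 0.95, 0.98, 1.00, 1.00, 1.00, 1.00, 1.00]

# What a mean-based model with one group indicator would report.
mean_diff = statistics.mean(comparison) - statistics.mean(reference)

# What quantile-specific comparisons would report.
q10_diff = empirical_quantile(comparison, 0.10) - empirical_quantile(reference, 0.10)
q50_diff = empirical_quantile(comparison, 0.50) - empirical_quantile(reference, 0.50)

print(f"mean difference:          {mean_diff:+.3f}")
print(f"10th percentile difference: {q10_diff:+.3f}")
print(f"median difference:          {q50_diff:+.3f}")
```

In this constructed example the mean difference is small, the lower-tail difference is large, and the median difference has the opposite sign, which is the kind of pattern a mean-based model cannot display.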

This study used a large cohort of veterans and appropriate statistical methodology permitting a more comprehensive assessment of differences in medication non-adherence by race/ethnicity. Ordinary least squares regression, logistic regression (after categorization) and general linear mixed models assume that covariates affect only the location of the conditional distribution of the response, and not its scale or any other aspect of its distributional shape, while quantile regression has the flexibility for modeling of data with heterogeneous conditional distributions. QReg provides a complete picture of the covariate effect when a set of percentiles is modeled, and thus offers the capability to capture important features of the data possibly missed by models that average over the conditional distribution. One other recent approach that might be able to capture the effect of covariates on the entire density of MPR is Bayesian density regression (BDR) [32, 33]. Like QReg, BDR avoids the assumption of normality and linearity. However, this approach is not as easy to understand and implement as QReg. Other approaches include Quasi-likelihood [32], Box-Cox transformation to normality [33] and robust regression [34, 35]. However, each of these methods has its own limitations [30].

Research on medication adherence patterns has consistently shown greater non-adherence to anti-hyperglycemic agents among NHB with type 2 diabetes compared to NHW [5, 9, 12-14]. Consistent with prior studies, this study found that NHB were more likely to be medication non-adherent across each of the quantiles. Potential reasons for the difference in medication adherence by race/ethnicity group have been studied and seem to suggest that Blacks express more concern about drug side effects [36], medication dependency, reduced quality of life [37], and issues related to cost of medications [7, 36, 38-40]. For example, among an insured cohort with pharmacy benefits, an increased patient cost share of $5/month led to a 15% decrease in the odds of medication adherence and worsened glycemic control [38]. However, in the VA system where cost of medications is less of an issue because copays are very low, other factors beyond cost of medications are likely to explain the observed differences. Potential explanatory factors that were not available in our dataset include patient-level factors such as health literacy, numeracy, self-efficacy, cultural beliefs and attitudes about medications, and social support. The contribution of these and other factors needs to be explored in future studies.

Despite the strengths of our data and methodology, there were limitations that need mentioning. The dataset did not include information to determine the duration of diabetes as a way to distinguish between new and regular users of diabetes medication; thus, we were not able to assess its impact on medication adherence rates. However, we created a 'new users' group who did not use medication within the first year of the study and their proportions were not different from the overall sample proportion either by race or other demographic factors (see additional file 1, table S5). Due to the age and gender distribution of our sample, our results should be interpreted with caution in women and younger individuals. In addition, our findings could have been biased by the 8.6% of veterans with missing race data. While we believe that the unreported race information is missing at random, we also performed a sensitivity analysis via multiple imputation and found that the results were not different from what is reported in this paper. While the conclusions are mainly applicable to skewed and bounded outcomes from cross sectional studies, the message is easily transferable to the analysis of longitudinal skewed and bounded outcomes via longitudinal quantile regression.


In conclusion, quantile regression allowed modeling the differential patterns in medication adherence between the racial/ethnic groups that would have been missed using traditional regression methods. QReg is a very useful tool for data that are heterogeneous in the sense that the tails and the central location of the conditional distributions vary differently with the covariates. Indeed, without performing quantile regression at different quantiles, an investigator would be unable to assess whether there might be a difference in these relationships. This method is also robust as it makes no distributional assumption about the error term in the model. Future studies need to be cautious when using traditional regression methods in modeling quasi-continuous and bounded outcomes such as MPR.

Conflict of interests

The authors declare that they have no competing interests.



Abbreviations

HbA1c: Hemoglobin A1c
VA: Veterans Administration
CVD: Cardiovascular disease
CHD: Coronary heart Disease
CHF: Congestive Heart Failure
MDD: Major Depressive Disorder
ICD-9: International Classification of Diseases, Ninth Revision
VHA: Veterans Health Administration
DSS: Decision Support System
SSN: Social Security Number
DRG: Diagnostic Related Group
IRB: Institutional review board
CI: Confidence Interval
VADT: Veterans Affairs Diabetes Trial
MPR: Medication Possession Ratio
Gap between refills
CMA: Continuous Measure Of Medication Acquisition
CMOS: Continuous Multiple Interval Of Oversupply
MRA: Medication Refill Adherence
NIDDK: National Institute of Diabetes and Digestive and Kidney Diseases
OLS: Ordinary Least Squares
GLMM: Generalized Linear Mixed Model
QReg: Quantile Regression
NHB: Non Hispanic Black
NHW: Non Hispanic White


  1. 1.

    Centers for Disease Control and Prevention: National diabetes fact sheet: general information and national estimates on diabetes in the United States, 2007. 2008, Atlanta, GA: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention

    Google Scholar 

  2. 2.

    Rozenfeld Y, Hunt JS, Plauschinat C, Wong KS: Oral antidiabetic medication adherence and glycemic control in managed care. Am J Manag Care. 2008, 14 (2): 71-75.

    PubMed  Google Scholar 

  3. 3.

    Pladevall M, Williams LK, Potts LA, Divine G, Xi H, Lafata JE: Clinical outcomes and adherence to medications measured by claims data in patients with diabetes. Diabetes Care. 2004, 27 (12): 2800-2805. 10.2337/diacare.27.12.2800.

    Article  PubMed  PubMed Central  Google Scholar 

  4. 4.

    Lau DT, Nau DP: Oral antihyperglycemic medication nonadherence and subsequent hospitalization among individuals with type 2 diabetes. Diabetes Care. 2004, 27 (9): 2149-2153. 10.2337/diacare.27.9.2149.

    CAS  Article  PubMed  Google Scholar 

  5. 5.

    Ho PM, Rumsfeld JS, Masoudi FA, et al: Effect of medication nonadherence on hospitalization and mortality among patients with diabetes mellitus. Arch Intern Med. 2006, 166 (17): 1836-1841. 10.1001/archinte.166.17.1836.

    Article  PubMed  Google Scholar 

  6. 6.

    Balkrishnan R, Rajagopalan R, Camacho FT, Huston SA, Murray FT, Anderson RT: Predictors of medication adherence and associated health care costs in an older population with type 2 diabetes mellitus: a longitudinal cohort study. Clin Ther. 2003, 25 (11): 2958-2971. 10.1016/S0149-2918(03)80347-8.

    Article  PubMed  Google Scholar 

  7. 7.

    Lee WC, Balu S, Cobden D, Joshi AV, Pashos CL: Prevalence and economic consequences of medication adherence in diabetes: a systematic literature review. Manag Care Interface. 2006, 19 (7): 31-41.

    PubMed  Google Scholar 

  8. 8.

    Kirk JK, D'Agostino RB, Bell RA, et al: Disparities in HbA1c levels between African-American and non-Hispanic white adults with diabetes: a meta-analysis. Diabetes Care. 2006, 29 (9): 2130-2136. 10.2337/dc05-1973.

    Article  PubMed  PubMed Central  Google Scholar 

  9. 9.

    Adams AS, Trinacty CM, Zhang F, et al: Medication adherence and racial differences in A1C control. Diabetes Care. 2008, 31 (5): 916-921. 10.2337/dc07-1924.

  10.

    Farmer KC: Methods for measuring and monitoring medication regimen adherence in clinical trials and clinical practice. Clin Ther. 1999, 21 (6): 1074-1090; discussion 1073. 10.1016/S0149-2918(99)80026-5.

  11.

    Hess LM, Raebel MA, Conner DA, Malone DC: Measurement of adherence in pharmacy administrative databases: a proposal for standard definitions and preferred measures. Annals of Pharmacotherapy. 2006, 40: 1280-1287. 10.1345/aph.1H018.

  12.

    Shenolikar RA, Balkrishnan R, Camacho FT, Whitmire JT, Anderson RT: Race and medication adherence in Medicaid enrollees with type-2 diabetes. J Natl Med Assoc. 2006, 98 (7): 1071-1077.

  13.

    Hertz RP, Unger AN, Lustik MB: Adherence with pharmacotherapy for type 2 diabetes: a retrospective cohort study of adults with employer-sponsored health insurance. Clin Ther. 2005, 27 (7): 1064-1073. 10.1016/j.clinthera.2005.07.009.

  14.

    Yang Y, Thumula V, Pace PF, Banahan BF, Wilkin NE, Lobb WB: Predictors of medication nonadherence among patients with diabetes in Medicare Part D programs. A retrospective cohort study. Clin Ther. 2009, 31 (10): 2178-2188. 10.1016/j.clinthera.2009.10.002.

  15.

    Miller DR, Safford MM, Pogach LM: Who has diabetes? Best estimates of diabetes prevalence in the Department of Veterans Affairs based on computerized patient data. Diabetes Care. 2004, 27 (Suppl 2): B10-B21.

  16.

    Karve S, Cleves MA, Helm M, Hudson TJ, West DS, Martin BC: An empirical basis for standardizing adherence measures derived from administrative claims data among diabetic patients. Med Care. 2008, 46 (11): 1125-1133. 10.1097/MLR.0b013e31817924d2.

  17.

    Peterson AM, Nau DP, Cramer JA, Benner J, Gwadry-Sridhar F, Nichol M: A checklist for medication compliance and persistence studies using retrospective databases. Value Health. 2007, 10 (1): 3-12. 10.1111/j.1524-4733.2006.00139.x.

  18.

    Scott Leslie R, Gwadry-Sridhar F, Thiebaud P, Patel B: Calculating medication compliance, adherence and persistence in administrative pharmacy claims databases. Pharmaceutical Programming. 2008, 1: 13-19. 10.1179/175709208X334614.

  19.

    Quan H, Sundararajan V, Halfon P, et al: Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005, 43 (11): 1130-1139. 10.1097/01.mlr.0000182534.19832.83.

  20.

    Koenker RW: Quantile Regression. 2005, Cambridge Univ Press

  21.

    Koenker RW: Confidence intervals for regression quantiles. Asymptotic Statistics, Proceedings of the Fifth Prague Symposium. Edited by: Mandl P, Hušková M. 1994, Springer, Heidelberg, 349-359.

  22.

    Hao L, Naiman DQ: Quantile Regression. 2007, Sage Publications Inc

  23.

    Chen C: An adaptive algorithm for quantile regression. Theory and applications of recent robust methods. Edited by: Hubert M, Pison G, Struyf A, Van Aelst S. 2004, Series: Statistics for Industry and Technology, Birkhäuser, Basel, 39-48.

  24.

    Madsen K, Nielsen HB: A Finite Smoothing Algorithm for Linear Estimation. SIAM Journal on Optimization. 1993, 3: 223-235. 10.1137/0803010.

  25.

    Harrell FE: Regression Modeling Strategies. 2001, New York: Springer

  26.

    Rousseeuw PJ, Van Driessen K: A Fast Algorithm for the Minimum Covariance Determinant Estimator. Technometrics. 1999, 41: 212-223.

  27.

    Koenker R: quantreg: Quantile Regression. R package version 4.44. 2009.

  28.

    Diggle P, Liang K, Zeger S: Analysis of longitudinal data. 2002, New York: Oxford University Press, 25: 2

  29.

    Rubin DB: Multiple Imputation for Nonresponse in Surveys. 2004, New York: John Wiley and Sons

  30.

    Royston P, Altman DG: Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. Journal of the Royal Statistical Society Series C-Applied Statistics. 1994, 43 (3): 429-467.

  31.

    Marshall A, Altman D, Holder R: Comparison of imputation methods for handling missing covariate data when fitting a Cox proportional hazards model: a resampling study. BMC Medical Research Methodology. 2010, 10 (1): 112. 10.1186/1471-2288-10-112.

  32.

    Dunson DB: Empirical Bayes density regression. Statistica Sinica. 2007, 17: 481-504.

  33.

    Dunson DB, Pillai NS, Park JH: Bayesian density regression. Journal of the Royal Statistical Society B. 2007, 69: 163-183. 10.1111/j.1467-9868.2007.00582.x.

  34.

    McCullagh P, Nelder JA: Generalized Linear Models. 1989, London: Chapman and Hall, 2nd edition

  35.

    Box GEP, Cox DR: An analysis of transformations. Journal of the Royal Statistical Society, Series B. 1964, 26: 211-252.

  36.

    Holland P, Welsch R: Robust Regression Using Iteratively Reweighted Least-Squares. Commun Statist Theor Meth. 1977, 6: 813-827. 10.1080/03610927708827533.

  37.

    Chen C: Robust Regression and Outlier Detection with the ROBUSTREG Procedure. Proceedings of the Twenty-seventh Annual SAS Users Group International Conference. 2002, Cary, NC: SAS Institute Inc

  38.

    Aikens JE, Piette JD: Diabetic patients' medication underuse, illness outcomes, and beliefs about antihyperglycemic and antihypertensive treatments. Diabetes Care. 2009, 32 (1): 19-24. 10.2337/dc08-1533.

  39.

    Huang ES, Brown SE, Thakur N, et al: Racial/ethnic differences in concerns about current and future medications among patients with type 2 diabetes. Diabetes Care. 2009, 32 (2): 311-316.

  40.

    Kurlander JE, Kerr EA, Krein S, Heisler M, Piette JD: Cost-related nonadherence to medications among patients with diabetes and chronic pain: factors beyond finances. Diabetes Care. 2009, 32 (12): 2143-2148. 10.2337/dc09-1059.


This study was supported by Grant # REA 08-261, Center for Disease Prevention and Health Interventions for Diverse Populations funded by Veterans Affairs Health Services Research and Development (PI - Leonard Egede).

Author information



Corresponding author

Correspondence to Leonard E Egede.

Additional information

Authors' contributions

All authors read and approved the final manuscript. Study concept and design: LEE, MG; acquisition of data: LEE; analysis and interpretation of data: LEE, MG, MM, GG, CE, and YZ; drafting of the manuscript: MG, CPL, and MM; critical revision of the manuscript for important intellectual content: LEE, MG, CPL; study supervision: LEE.

Electronic supplementary material

Additional file 1 (DOCX 43 KB):

  • Table S1. Adjusted parameter estimates (β) and p-values for quantile regression, ordinary least-squares regression, and the generalized linear mixed model (scenario 2).
  • Table S2. Adjusted parameter estimates (β) and bootstrapped 95% CI for quantile regression, ordinary least-squares regression, and generalized linear mixed model with corresponding 2.5% and 97.5% quantiles from a bootstrap study of 10,000 replications with sample size n = 5000.
  • Table S3a. Unadjusted parameter estimates (β) and p-values for quantile regression (QReg), ordinary least-squares regression (OLS), and generalized linear mixed model (GLMM) for the MPR data with sample size n = 11,272.
  • Table S3b. Unadjusted parameter estimates (β) and bootstrapped 95% CI for quantile regression (QReg), ordinary least-squares regression, and generalized linear mixed model with corresponding 2.5% and 97.5% quantiles from a bootstrap study of 10,000 replications with sample size n = 5000.
  • Table S4a. Unadjusted parameter estimates (β) and p-values for quantile regression (QReg), ordinary least-squares regression (OLS), and generalized linear mixed model (GLMM) for the MPR data with sample size n = 11,272.
  • Table S4b. Unadjusted parameter estimates (β) and bootstrapped 95% CI for quantile regression (QReg), ordinary least-squares regression, and generalized linear mixed model with corresponding 2.5% and 97.5% quantiles from a bootstrap study of 10,000 replications with sample size n = 5000.
  • Table S5. Comparison of the proportion of new medication users by demographic variables with the overall proportion in the study sample (washout analysis).

About this article

Cite this article

Gebregziabher, M., Lynch, C.P., Mueller, M. et al. Using quantile regression to investigate racial disparities in medication non-adherence. BMC Med Res Methodol 11, 88 (2011).


  • Medication adherence
  • Quantile regression
  • Diabetes
  • Health disparities