 Research article
 Open Access
 Published:
A semiparametric mixed models for longitudinally measured fasting blood sugar level of adult diabetic patients
BMC Medical Research Methodology volume 19, Article number: 13 (2019)
Abstract
Background
At the diabetic clinic of Jimma University Specialized Hospital, health professionals provide regular followup to help people with diabetes live long and relatively healthy lives. Based on patient condition, they also provide interventions in the form of counselling to promote a healthy diet and physical activity and prescribing medicines. The main purpose of this study is to estimate the rate of change of fasting blood sugar (FBS) profile experienced by patients over time. The change may help to assess the effectiveness of interventions taken by the clinic to regulate FBS level, where rates of change close to zero over time may indicate the interventions are good regulating the level.
Methods
In the analysis of longitudinal data, the mean profile is often estimated by parametric linear mixed effects model. However, the individual and mean profile plots of FBS level for diabetic patients are nonlinear and imposing parametric models may be too restrictive and yield unsatisfactory results. We propose a semiparametric mixed model, in particular using spline smoothing to efficiently analyze a longitudinal measured fasting blood sugar level of adult diabetic patients accounting for correlation between observations through random effects.
Results
The semiparametric mixed models had better fit than the linear mixed models for various variance structures of subjectspecific random effects. The study revealed that the rate of change in FBS level in diabetic patients, due to the clinic interventions, does not continue as a steady pace but changes with time and weight of patients.
Conclusions
The proposed method can help a physician in clinical monitoring of diabetic patients and to assess the effect of intervention packages, such as healthy diet, physical activity and prescribed medicines, because individualized curve may be obtained to follow patientspecific FBS level trends.
Background
Diabetes mellitus is a metabolic disorder of multiple etiology characterized by chronic hyperglycaemia with disturbances of carbohydrate, fat and protein metabolism resulting from defects in insulin secretion, insulin resistance, or both [1]. The longterm effects of untreated diabetes mellitus might results in health complications, such as visual disability and nerve disease [2–5], among others. A person is considered to be diabetic if he or she has fasting blood sugar (FBS) level value of greater than or equal to 7.0 mmol/L (126 mg/dL) or 2h blood sugar level of greater than or equal to 11.1 mmol/L (200 mg/dL) or glycated hemoglobin (HbA_{1}) level of 6.5% or higher [6].
There are three main types of diabetes, namely type 1 diabetes, type 2 diabetes and gestational diabetes. The type 1 diabetes is caused by an autoimmune reaction, in which the patient body defense system attacks the insulin producing beta cells in the pancreas and hence the body can no longer produce the insulin it needs. Whereas in type 2 diabetes, the body is able to produce insulin, however it becomes resistant so that the insulin is ineffective. The type 2 diabetes is characterized by high levels of blood sugar or glucose resulting from defects in insulin production, insulin action, or both. The gestational diabetes is a form of diabetes that appears during pregnancy. It can lead to serious health risks for both the mother and child [7]. The risk factors that are associated with type 1 diabetes include family history of diabetes (diabetes history in one parent or both), infections and other environmental influences such as exposure to a viral illness, the presence of damaging immune system cells, i.e. autoantibodies and dietary factors low vitamin D consumption [8]. Whereas, for type 2 diabetes the risk factors are excess body weight, physical inactivity, poor nutrition, family history of diabetes, past history of gestational diabetes and older age [9]. The risk factors for increase or decrease in fasting blood sugar level of a patient include overweight, family history of diabetes, age, type of diabetes, blood pressure and gender [7]. The focus of this study however is on type 1 and type 2 diabetes.
In year 2015, there were an estimated 415 million adults aged 20–79 years living with diabetes worldwide [10], including 193 million who are undiagnosed. There were approximately 5 million people estimated to have died from diabetes worldwide in the same year, and a majority of these were the result of cardiovascular complications. In Africa Region, the number of adults living with diabetes estimated at 14.2 million whereas in Ethiopia the number is estimated 1 to 10 million in year 2015. The Region has the highest proportion of undiagnosed diabetes, 9.5 million (about 66.7%) of people with diabetes are unaware they have the disease and in Ethiopia there are 500 thousand to 5 million such cases [11, 12].
At the diabetic clinic of Jimma University Specialized Hospital (JUSH), health professionals provide regular followup to help people with diabetes live long and relatively healthy lives. Depending on patients conditions, e.g. FBS level, they also provide interventions in the form of counselling to promote a healthy diet and physical activity and prescribing medicines.
The main objective of the current study is to assess the factors that affect the FBS level of adult diabetic patients. In addition to assessing the factors that affect the FBS level over time, we are also interested to estimate the rate of change of FBS profile experienced by patients over time. The change may help to assess the effectiveness of interventions taken by the clinic to regulate FBS level, where rates of change close to zero over time may indicate the interventions are good regulating the level. These changes are determined using first derivatives of penalized regression splines [13, 14].
The FBS level data of diabetic patients in this study are collected repeatedly over time hence the data have longitudinal time series profiles and the data also have continuous nature. For statistical inferences, therefore, it is necessary to capture properly the form of the evolution of profiles over time. In the analysis of longitudinal data, the mean profile is often estimated by parametric linear mixed effects model, for instance recently Mehari [15] analyzed the FBS level profiles of diabetic patients using parametric linear mixed effects model. However, the individual and mean profile plots of FBS level for diabetic patients (see Fig. 1) are nonlinear and imposing parametric models may be too restrictive and yield unsatisfactory results. In the present paper, we propose a semiparametric mixed model in particular using spline smoothing [16, 17] to efficiently analyze a longitudinal measured fasting blood sugar level of adult diabetic patients accounting for correlation between observations through random effects. The model assumes that the mean of FBS level is an arbitrary smooth function of time and parametric functions of other covariates. The link between mixed model and smoothing provides a flexible framework for estimating the patient profiles in a data driven way [13].
The rest of the paper is organized as follows. The data, some basic review of variancecovariance structure of the parametric linear mixed model, semiparametric mixed models and inferences related them are introduced in “Methodology” section. The results from applying these methods on the the study data are discussed in “Results” section. Finally discussion, and conclusions and pointers for future study are given in “Discussion” and “Conclusion” sections respectively.
Methodology
Study data
The fasting blood sugar (FBS) level data used in this paper arises from a retrospective study conducted in Jimma University Specialized Hospital (JUSH) diabetic clinic. The hospital is located in Jimma town 352 km to the Southwest of Addis Ababa, the capital of Ethiopia. It is a teaching hospital and gives service to the southwestern part of Oromia region, some part of southern nations and nationalities and Gamella regions of Ethiopia. All diabetic patients aged 18 years or older, who were coming to JUSH diabetic clinic for their regular follow up during periods September 2011 and June 2014 were eligible for this study. During their follow up, patients FBS level along with other characteristics such as weight are measured and recorded in the individual follow up chart. The data in the chart include time (measured in months, where baseline or initial date was given a value 0), patient gender, age, type of diabetes (Type 1 diabetes or Type 2 diabetes) and family diabetes history. The duration between initial and the last recorded visits ranged from one to 36 months. Patients with at least two observations were included in the analyses leading to a total of 534 patients and 4390 observations. Permission of the study was obtained from Postgraduate research office of Jimma University, College of Natural Sciences and JUSH.
Variancecovariance structures and inference
Variancecovariance structures
The FBS level data of this study fall within the framework of continuous longitudinal data and hence can be modeled by use of a parametric linear mixed model. Let Y_{ij} denote the FBS level of the ith patient observed at time t_{ij}, i=1,…,n and j=1,…m_{i}. The parametric linear mixed model can be expressed as
That is, the population level mean response is modeled as a polynomial function of time, t_{ij}, a linear function of covariates x_{ijl}, l=1,…,p where some of them may be timevarying covariates or interaction effects each has corresponding regression parameter coefficient θ_{l}, a function of subjectspecific random coefficient terms and measurement error ε_{ij}. The coefficients β_{k}, k=1,…,p and θ_{l}, l=1,…,L are fixed effect parameters and \(b_{u_{i}}\), u=0,…,q are subjectspecific random coefficients. It is assumed that \(b_{u_{i}} \sim \mathcal {N}\left (0, \sigma _{b_{u}}^{2}\right)\), \(\varepsilon _{ij} \sim \mathcal {N}\left (0, \sigma _{e}^{2}\right)\), \(cov\left (b_{u_{i}}, b_{u_{i'}}\right) = \sigma _{b_{u} b_{u}'}\) and \(cov\left (b_{u_{i}}, \varepsilon _{ij}\right) = 0\). We have examined models for p=2 which represents quadratic polynomial and \(b_{u_{i}}\) with u=0,1,2 represent a subjectspecific random intercept, slope and quadratic coefficients, respectively for selection of a variancecovariance structure (see Table 1). The variance profile plot of FBS level shows (for the sake of brevity this plot is not reported) the variance changes overtime, therefore to allow for more flexibility to estimate between subject variability we have considered the above three variancecovariance structures.
In Table 1, for instance the subjectspecific random intercept \(b_{0_{i}}\) in the quadratic random effects model (M_{3}) is considered to capture correlation of the FBS level measurements over time within the patient and it is assumed that subjectspecific random slopes for linear as well as for quadratic time effects to capture different evolution of FBS level over time. Note that these subjectspecific random structures are different for each patient.
Tests for zero variance components
Adequate variancecovariance structure is essential to obtain valid model based inferences for the fixed effects or for parameters in the mean structure of the model [18]. Overparametrization of the variancecovariance structure leads to inefficient estimation and potentially poor assessment of standard errors for the estimation of the mean structure, i.e. fixed effects, whereas a too restrictive specification invalidates inferences about the mean response profile when the assumed structure does not hold.
The likelihood ratio test for testing, for example \(H_{0}: \sigma ^{2}_{b_{0}} = 0\) versus \(H_{1}: \sigma ^{2}_{b_{0}} > 0\) for model M_{1}, has an asymptotic \(0.5\,\chi ^{2}_{0} + 0.5\,\chi ^{2}_{1}\) mixture distribution under H_{0} [19], if the vector of FBS level can be divided into a large number of independent and identically distributed subvectors both under H_{0} and H_{1}. However, this assumption usually does not hold, for example in linear mixed models or for unbalanced data [20–22]. Note that the FBS level data are unbalanced in the sense that all patients do not have equal number of measurements, hence the independent and identically distributed assumption can be violated in the linear mixed models used in this paper. Therefore, we used the exact finite sample null distribution of the restricted likelihood ratio test (RLRT) statistic derived by Crainiceanu and Ruppert [22] to test a zero random effect variance in M_{1}. However, since models M_{2} and M_{3} contain more than one random effect, the tests for a zero random effect variance in these models were done using the exact finite sample null distribution of the RLRT statistic derived by Greven et al. [21].
Semiparametric mixed effects model
Given the mean profile plots over time in Fig. 1b, imposing parametric functions to describe the mean FBS level evolution may not be easy and also too restrictive [17]. As an alternative, we can model the mean profiles over time with a semiparametric smooth function, f(t_{ij}). Using the pth degree truncated power basis, f(t_{ij}) can be written as
here z_{+}=max{0,z}. The function f(t_{ij}) is a combination of fixed effects parameters β_{0},β_{1},…,β_{p} and pth degree splines evaluated at time t_{ij} with knots at distinct locations κ_{1},κ_{2},…,κ_{K} in the range of t_{ij} and corresponding coefficients b_{1},b_{2},…,b_{K}. The function f(t_{ij}) can be estimated among others, with penalized splines. The coefficients of spline basis functions b_{l} are assumed to follow a Gaussian distribution such that \(b_{l} \sim \mathcal {N}\left (0, \sigma ^{2}_{b}\right)\), where \(\sigma ^{2}_{b}\) is a variance component controlling the smoothness of f(t_{ij}). Then, incorporating f(t_{ij}) in model (1), the general semiparametric mixed effects model can be expressed as
Estimation of parameters
Let \(\mathbf {y}_{i} = \left (y_{i1}, y_{i2}, \ldots, y_{{im}_{i}}\right)'\) be the m_{i}×1 vector of responses for the ith patient, i=1,…,n. Under the linear mixed model formulation, model (3) with subjectspecific quadratic random effects can be expressed succinctly in matrix form as
where β=(β_{0},β_{1},…,β_{p},θ_{1},…,θ_{L})^{′} is a (p+L+1)×1 vector of fixed effects which is common to the n individuals, X_{i} is an m_{i}×(p+L+1) design matrix associating β to y_{i}, v=(b_{1},b_{2},…,b_{K}) is a Kdimensional vector of random coefficients in the summand in Eq. (2), Z_{i(f)} is the m_{i}×K matrix for the pthdegree spline basis functions, \(\textbf {u}_{i} = \left (b_{0_{i}}, b_{1_{i}}, b_{2_{i}}\right)'\) is subjectspecific vector of random effects, Z_{i(u)} is an m_{i}×3 design matrix which relates u_{i} to the response y_{i} and \(\textbf {e}_{i} = \left (e_{1i}, e_{2i}, \ldots, e_{{im}_{i}}\right)'\) is an m_{i}dimensional vector of withinindividual errors. Furthermore, it is assumed that \(\textbf {v} \sim \mathcal {N}\left (\textbf {0}, \sigma _{b}^{2}\,\textbf {I}_{K}\right)\), \(\textbf {u}_{i} \sim \mathcal {N}(\textbf {0}, \textbf {G})\), \(\textbf {e}_{i} \sim \mathcal {N}\left (\textbf {0}, \textbf {R}_{i}\right)\), v, u_{i} and e_{i} are assumed to be pairwise independent with and between subjects for i=1,2,…,n. Note that G and R_{i} are 3×3 and m_{i}×m_{i} variancecovariance matrices, respectively.
The overall model for n individuals has the form
where
and \(\phantom {\dot {i}\!}\textbf {b} = (b_{1}, b_{2}, \ldots, b_{k}, b_{0_{1}}, b_{1_{1}}, b_{2_{1}}, \ldots, b_{0_{n}}, b_{1_{n}}, b_{2_{n}})'\). Estimation of the coefficients of penalized and unpenalized terms in model (4) was done using a penalized iteratively reweighted least squares (PIRLS) based on 20 equidistant knots in the range of FBS level and a smoothing parameter selection was done by REML [23].
The correspondence between the penalized spline smoother and the optimal predictor in a mixed model framework enables us to take advantage of the existing methodology for mixed model analysis and the use of mixed model software, such as the function gamm in mgcv R package, for fitting the penalized spline model and the MIXED and GLIMMIX procedures in SAS [24]. This implementation of penalized smoothing in the linear mixed model framework also provides an automated approach to obtain a smoothing parameter and flexibility to extend the models [17].
In this paper, parameters in the fitted models are estimated by restricted maximum likelihood (REML) method because the statistical hypotheses that were considered have the same mean structures between models under the null and alternative hypotheses. Furthermore, maximum likelihood estimators of variance components are biased downward as they do not take into account the degrees of freedom lost in the estimation of fixed effects (e.g. see Ruppert et al. [16]).
Model selection and inference
The model building process of this work includes selection of suitable variancecovariance structure for random effects, testing whether the inclusion of spline effects in the parametric model improves model fit or not and also selection of covariates. The linear mixed model framework provides a unified approach to do all these [25]. In the parametric cases, the best fitting model can be selected by employing a commonly used selection criteria, Akaike’s Information Criterion (AIC) and Bayesian Information Criterion (BIC) or by a likelihood ratio test. However, since the semiparametric mixed models that we considered here are differ in both the fixed effects and the nonparametric part, model selection is done via adjusted Akaike’s information criterion, abbreviated AIC _{adj}, using the effective number of parameters in the model [16, 26]. Let C= [X Z_{f}] be the design matrix with appropriate fixed effects components and the corresponding smoothing matrix, \(\textbf {B} = \left (\begin {array}{cc} \textbf {0} & \textbf {0} \\ \textbf {0} & \textbf {G}^{1} \end {array} \right)\)where G is the variancecovariance matrix of random effects used in the model and R=diag{R_{1},R_{2},…,R_{n}}, i.e. R is the block diagonal variancecovariance matrix of error terms with blocks R_{i} on the main diagonal and zeros elsewhere. Then the effective number of parameters and AIC _{adj} may be computed as
and AIC _{adj}=−2 log(Lik)+2 E_{p}, respectively. Unlike the marginal AIC which penalizes only for the number of parameters in fixed effects vector and variance components, the penalty of AIC _{adj} takes into account the additional parameters introduced into a model via f(t_{ij}) or smoothing by including the design matrix Z_{f} in C [17]. Like the marginal AIC, the smaller the AIC _{adj} value, the better the model.
Testing whether the inclusion of spline effects in the parametric model improves model fit or not is equivalent to testing \(H_{0}: \sigma ^{2}_{b} = 0\) versus \(H_{1}: \sigma ^{2}_{b} > 0\). In this paper, due to the second objective of the study, a quadratic penalized spline was added in Eq. (1), therefore neither of the two methods discussed in “Variancecovariance structures and inference” section can be used to test \(H_{0}: \sigma ^{2}_{b} = 0\) [27] instead an approximate Ftest of Hastie and Tibshi [28] was applied. For Hastie and Tibshi approximate Ftest, residual degrees of freedom for the null and alternative model fits are used in the place of the number of parameters in each model.
Rate of change over time and simultaneous confidence bands
The change in smoothing function f(t) overtime, for selected semiparametric mixed model, can be estimated by taking the derivative of f(t) with respect to time t. For example, let f(t) be a quadratic penalized spline, that is
Taking the first derivative with respect to time t yields
An estimate of f^{′}(t), denoted \(\hat {f}'(t)\), is obtained by substituting the quadratic fit parameter estimates \(\hat {\beta }_{1}, \hat {\beta }_{2}\), and \(\hat {b}_{1}, \hat {b}_{2}, \ldots, \hat {b}_{K}\). However, the construction of simultaneous confidence bands requires the variancecovariance matrix for the vector of contrasts between the estimated and true parameters for the fixed and random effects. Let C= [X Z_{f}] be a design matrix containing quadratic time effects and a truncated quadratic basis, B is a matrix constructed from variance components corresponding to smoothing, i.e. Var(v) in model (4). Then, a variancecovariance matrix for the vector of contrasts is given by
Ruppert et al. [16], where R is the block diagonal variancecovariance matrix of error terms defined in “Semiparametric mixed effects model” section. Let g=(g_{1},g_{2},…,g_{T}) be a grid of equally spaced time points. Define
where C_{g} is C with design matrices X and Z_{f} are evaluated over g. Assuming the vector of contrasts have approximately multivariate distribution with mean vector 0 and variancecovariance matrix (C^{′}R^{−1} C+B)^{−1} [16, 29], i.e.
a 100 (1−α)% simultaneous confidence bands for f_{g} is given by
where \(\textbf {s}_{g} \,=\, \left (\widehat {SD}\left (\hat {f}_{g_{1}} \,\, f_{g_{1}}\right), \widehat {SD}(\hat {f}_{g_{2}} \,\, f_{g_{2}}), \ldots, \widehat {SD}\left (\hat {f}_{g_{T}} \,\, f_{g_{T}}\right)\right)'\) with
and \(Var\left (\hat {\textbf {f}}_{g}  \textbf {f}_{g}\right) = \textbf {C}_{g}\,\left (\textbf {C}'\textbf {R}^{1}\,\textbf {C} + \textbf {B} \right)^{1}\,\textbf {C}'_{g}\), and h_{(1−α)} is the (1−α) quantile of
The quantile h_{(1−α)} can be approximated using simulations. First we simulate from realization of (5) and computation of (7) can be repeated for a large number of times, say N times, to obtain \(\tilde {h}^{1}_{1\alpha }, \tilde {h}^{2}_{1\alpha }, \ldots, \tilde {h}^{N}_{1\alpha }\). The value with rank N×(1−α) is used as h_{1−α}.
The proposed semiparametric mixed models were fitted with the the gamm function available in R package mgcv [29] and the linear mixed models using the lme function available in R package nlme.
Results
Patients baseline characteristics
A total of 534 adult diabetic patients were in the study, of which 342 (64.04%) were male, 399 (74.72%) were Type 2 diabetic patients and 417 (78.09%) didn’t have family history of diabetes. The patients mean (SD) age at the first visit (or baseline) was 45.40 (14.62) years and ranges between 18 and 93 years, weight was 62.83 (13.36) kgs and FBS level was 164.72 (86.20) mg/L. There were significant differences of these means between Type 1 and Type 2 diabetic groups (Table 2).The results in Table 2 also show that, at baseline there was a significant association between family history of diabetes and type of diabetes (pvalue <0.0001). However, the association between patient gender and type of diabetes was nonsignificant (pvalue = 0.9935). The median (first quartile  third quartile) time between first and last clinic visits of patients was 15.25 (7.25  24.75) months and ranged from as few as 0.5 month between visits to as much as 6 months between visits.
Parametric mixed models
Mean structure
The main interest of this study is to apply semiparametric mixed models, however for comparison purpose here we start the analysis by fitting parametric mixed models. Scatter plot smoothing was used to examine changes in FBS level over time and also to asses the interactions of each categorical covariate with time [30, 31]. The smoothing plots suggest the changes in FBS can be described by quadratic trend. Furthermore, due to the noncrisscrossing of trends representing Type 1 and Type 2 diabetes groups, and with family history and no family history groups Type × time and Family history × time were not included in the mixed models. However, the trend representing male and female crossing at one time point. Therefore, we start with very general model that includes time (in quadratic form), other fixed effects and the necessary interactions, that is
where Type and F.History represent diabetes type and family history of diabetes, respectively.
Variancecovariance structure for random effects
The above mean structure fitted with subjectspecific random intercepts, linear random time effects and quadratic random time effects. For each of the models, the independent error structure is assumed and the results are given in Table 3.
The fixed effect estimates were consistent in sign but have slight differences in magnitude across the three different variancecovariance structures. The variables age, gender, diabetes type, family history, and time by weight and gender by time interactions were statistically nonsignificant in all models, except for time by weight interaction where its pvalue marginally significant for subjectspecific random intercept and slope model (i.e. linear random effects model). The covariates that were statistically significant at 5% level, i.e. Time, Time^{2} and weight and the time by weight interaction were retained for the subsequent analysis.
The Crainiceanu and Ruppert [22] RLRT statistic for testing \(H_{0}: \sigma ^{2}_{b_{0}}=0\) against \(H_{1}: \sigma ^{2}_{b_{0}} > 0\) in model M_{1} takes the value RLRT = 738.24 with pvalue <0.0001. The large value of the test statistic or a very small pvalue strongly suggests a rejection of the null hypothesis (i.e. \(H_{0}: \sigma ^{2}_{b_{0}}=0\)) that no subjectspecific random effects should be included in the model. Similar tests were conducted using the exact finite sample null distribution of the RLRT statistic of Greven et al. [21] to test \(H_{0}: \sigma ^{2}_{b_{1}}=0\) against \(H_{1}: \sigma ^{2}_{b_{1}} > 0\) and \(H_{0}: \sigma ^{2}_{b_{2}}=0\) against \(H_{1}: \sigma ^{2}_{b_{2}} > 0\) in models M_{2} and M_{3}, respectively. The RLRT statistic is 3.944 with pvalue = 0.0207 for \(H_{0}: \sigma ^{2}_{b_{1}}=0\) indicating rejection of the null hypothesis which implies the need for subjectspecific random slopes. Whereas the RLRT statistic for \(H_{0}: \sigma ^{2}_{b_{2}}=0\) is 0.639 with pvalue = 0.1859 suggesting a nonrejection of the null hypothesis \(H_{0}: \sigma ^{2}_{b_{2}}=0\) which implies no quadratic random effect should be included in the model. Therefore, in the subsequent analysis we use the following parametric linear mixed model, called M_{4}:
The analysis results for model M_{4} are presented in Table 4. Except the time × weight interaction effect, which is marginally nonsignificant at 5% level, all the fixed effects are highly significant.
Semiparametric mixed model
The observed mean FBS level profile of patients is shown in Fig. 1b. The plot in this figure shows that the linearity assumption is not reasonable. Therefore, the analysis had to account for the longitudinal data structure and the observed nonlinearity of FBS level estimated with smooth effects in the mixed model framework. Given our specific interest in rate of change in FBS level due to clinic interventions, its functional form (over time) can affect the rate of change. Since the rate of change involves taking derivatives of the smooth function f(t_{ij}), we choose to use quadratic penalized spline to model the FBS level mean response [31]. Following results from the previous section, we propose the following semiparametric mixed model with linear random effects structure, called M_{5}
Using appropriately constructed matrices this model can be represented using a matrix notation of “Tests for zero variance components” section. This model is fitted using the random intercept and linear random effects variance structures of the previous section and the results are displayed in Table 5.
The results in Table 5 show that the fixed effects estimates were consistent in sign but have slight difference in magnitude in both semiparametric and parametric mixed models (see Table 4), except for the effect of time where both the sign and magnitude of its coefficient estimates were different in the two models and the effect of "time square" was nonsignificant in the semiparametric mixed models. Further, the interaction of weight with time was not statistically significant in any of the semiparametric mixed model. Except for the subjectspecific random slope variance component, there is a slight decrease in subjectspecific random intercept and residual variance components in the semiparametric model compared to variance components in the linear mixed model M_{4} (see Table 4).
To compare the two variance structures under the semiparametric mixed model given in Eq. (10), we computed AIC, BIC and adjusted AIC (see Table 6). Adjusted AIC shows that the semiparametric mixed model with subjectspecific intercepts as well as slopes (or random linear effects) value is smaller than that of the random intercept. Therefore, the semiparametric model with random linear effects is the preferred model.
Model selection
In this section we are focusing on assessing whether the inclusion of spline effects improves model fit compared to parametric counterpart. This is equivalent to testing \(H_{0}: \sigma ^{2}_{b} = 0\) versus \(H_{1}: \sigma ^{2}_{b} > 0\) in model M_{5}, where \(\sigma ^{2}_{b}\) is a variance component controlling the smoothness of
The approximate Ftest statistic for testing the above hypotheses, i.e. quadratic form of f(t_{ij}) against a quadratic penalized splines, is 83.63 with pvalue <0.0001. This strongly suggests a rejection of the null hypothesis \(H_{0}: \sigma ^{2}_{b} = 0\). Thus, the shape of the function f(t_{ij}) is statistically different from a quadratic trend.
Furthermore, consider the semiparametric mixed model M_{5} in Eq. (10) with random linear effects variancecovariance structure and the linear mixed model M_{4} in Eq. 9. The fit statistics from fitting these two models are displayed in Table 6. The −2 log(Lik), AIC and BIC values indicate a substantial improvement in the fit of M_{5} compared to M_{4}, implying model with penalized spline representation of FBS level was preferred over its parametric counterpart.
The overall results show that, out of the models evaluated, FBS level of diabetes patients at the JUSH diabetic clinic during the study period best characterized by a penalized spline model with truncated quadratic basis, with subjectspecific random intercept and slope effects and with linear function of weight and time, called the final model, M_{6}.
Simultaneous confidence band
The first derivative of mean response function, i.e. \(\hat {f}'(.)\), with respect to time was estimated for the final model, M_{6} holding weight constant. The rate of change in mean response of FBS level then investigated using the 95% simultaneous confidence bands for the model. The confidence bands were constructed following the discussion in “Estimation of parameters” section. A gride g of time points (0, 35) were defined by increments of one month such that there are T=36 equally spaced time points. The resulting simultaneous confidence bands displayed in Fig. 2 where the solid line and shaded region represent, respectively, the mean predicted FBS level and the confidence bands. Visual inspection indicates that on average diabetes patients were able to decrease or control their FBS level, due to JUSH clinic interventions, in the first five months period after their initial visit. However, after month 5, the slope of the curve starts changing it signs, this might imply that patients do actually not followup the intervention packages properly or not come to the clinic for treatment due to some unknown reason.
The confidence bands become noticeably wider after 27 months of followup period, demonstrating the increased variability. This increase may be due to a smaller number of FBS level recordings being observed at the later period of the study or a potential artifact induced by the spline smoothing [32]. In practice spline smoothing creates a challenge in semiparametric regression settings through the inherent bias from using truncated basis functions. The confidence bands obtained for FBS level does not account for this function bias. However, this bias could be corrected, e.g. using bootstrapping methods [33].
Discussion
This study focused on longitudinal data analysis of fasting sugar level of adult diabetic patients at Jimma University Specialized Hospital diabetic clinic using an application of semiparametric mixed model. The study revealed that the rate of change in FBS level in diabetic patients, due to the clinic interventions, does not continue as a steady pace but changes with time and weight of patients. Furthermore, it clarified the associations between FBS level and some characteristics of adult diabetic patients that weight of a diabetes patient has a significant negative effect whereas patient gender, age, type of diabetes and family history of diabetes did not have a significant effect on the change of FBS level. The result on gender agrees with the findings of [34] where the gender effect on fasting blood glucose level of type 2 diabetes was statistically nonsignificant.
Under the two variancecovariance structures of subjectspecific random effects, the semiparametric mixed models had better fit than their parametric counterparts. This was likely due to the localized splines which captured more variability in FBS level than the linear mixed models. The methodology used in the analysis has implications for clinical monitoring in regular followup of diabetic patients and to assess the effect of intervention packages, such as healthy diet, physical activity and prescribed medicines, because individualized curve may be obtained to follow patientspecific FBS level trends [31].
The main limitation of the study is the limited information on important predictors such as type of interventions including treatment types and nutritional status of a patient that may have influenced the rate of change in FBS level. Due to lack of data on these potential predictors for most of the patients involved in the study, we were unable to include them in the analyses. Therefore, more public health and epidemiology researches are needed to examine the impact of treatments and interventions on population health in general and in particular, people living with diabetes to avoid its complications over time and to identify new risk factors for diabetes.
Conclusion
In this paper, we demonstrate the use of semiparametric mixed effect model for estimation of the rate of change of fasting blood sugar (FBS) level experienced by patients over time. The proposed method can help a physician in clinical monitoring of diabetic patients and to assess the effect of intervention packages such as healthy diet, physical activity.
Abbreviations
 AIC:

Akaike’s information criteria
 BIC:

Bayesian information criteria
 FBS:

Fasting blood sugar level
 IDF:

International diabetes federation
 JUSH:

Jimma University specialized hospital
 PIRLS:

Penalized iteratively reweighted least squares
 REML:

Restricted maximum likelihood
 RLRT:

Restricted likelihood ratio tests
 SD:

Standard deviation
References
 1
World Health Organization and others: Definition, Diagnosis and Classification of Diabetes Mellitus and Its Complications: Report of a WHO Consultation. Part 1, Diagnosis and Classification of Diabetes Mellitus. 1999. World Health Organization and others.
 2
Campos C. Chronic hyperglycemia and glucose toxicity: pathology and clinical sequelae. Postgrad Med. 2012; 124(6):90–7.
 3
Rahman S, Rahman T, Ismail AAS, Rashid ARA. Diabetesassociated macrovasculopathy: pathophysiology and pathogenesis. Diabetes Obes Metab. 2007; 9(6):767–80.
 4
Lu CC, Jiann BP, Sun CC, Lam HC, Chu CH, Lee JK. Association of glycemic control with risk of erectile dysfunction in men with type 2 diabetes. J Sex Med. 2009; 6(6):1719–28.
 5
Enzlin P, Rosen R, Wiegel M, Brown J, Wessells H, Gatcomb P, Rutledge B, Chan KL, Cleary PA, et al. Sexual dysfunction in women with type 1 diabetes: longterm findings from the dcct/edic study cohort. Diabetes Care. 2009; 32(5):780–5.
 6
World Health Organization. Use of glycated haemoglobin (HbA1c) in diagnosis of diabetes mellitus. Abbreviated report of a WHO consultation. Geneva: World Health Organization; 2011, p. 25. Available from: (http://www.ncbi.nlm.nih.gov/books/NBK304267/).
 7
World Health Organization. Definition and diagnosis of diabetes mellitus and intermediate hyperglycaemia: Report of a WHO/IDF consultation (PDF). Geneva: World Health Organization; 2006, p. 21. ISBN: 9789241594936.
 8
Smith DE. Type 1 Diabetes Risk Factor. 2016. https://www.endocrineweb.com/conditions/type1diabetes/type1diabetesriskfactors.
 9
Leontis LM, HessFischl AM. Type 2 Diabetes Risk Factor. 2014. https://www.endocrineweb.com/conditions/type2adiabetes/type2diabetesriskfactors.
 10
Atlas IDF. International Diabetes Federation 7th Edition, p. 2015. http://www.diabetesatlas.org.
 11
Worku D, Hamza L, Woldemichael K. Patterns of diabetic complications at jimma university specialized hospital, southwest ethiopia. Ethiop J Health Sci. 2010; 20(1):33–39.
 12
SM A, Y B, A W, S A. Increasing trends of diabetes mellitus and body weight: A ten year observation at gondar university teaching referral hospital, northwest ethiopia. PLoS ONE. 2013; 8(3):60081.
 13
Thilakarathne PJ, Clement L, Lin D, Shkedy Z, Kasim A, Talloen W, Versele M, Verbeke G. The use of semiparametric mixed models to analyze pamchipⓇpeptide array data: an application to an oncology experiment. Bioinformatics. 2011; 27(20):2859–65.
 14
Szczesniak RD, Li D, Amin RS. Semiparametric mixed models for nested repeated measures applied to ambulatory blood pressure monitoring data. J Mod Appl Stat Methods. 2016; 15(1):255–75.
 15
Mehari Z. Application of Longitudinal Data Analysis on FBS of Adult Diabetes: Statistical Analysis of Fasting Blood Sugar Level of Adult Diabetic Patients in Jimma University Specialized Hospital. Saarbrücken, Germany: LAP Lambert Academic Publishing 2014; 2014.
 16
Ruppert D, Wand M, Carroll R. Semiparametric regression. Cambridge, UK: cambridge University Press; 2003.
 17
Maringwa JT, Geys H, Shkedy Z, Faes C, Molenberghs G, Aerts M, Ammel KV, Teisman A, Bijnens L. Application of semiparametric mixed models and simultaneous confidence bands in a cardiovascular safety experiment with longitudinal data. J Biopharm Stat. 2008; 18(6):1043–62.
 18
Verbeke G, Molenberghs G. Linear Mixed Models for Longitudinal Data. New York: Springer; 2000.
 19
Stram D, Lee J. Variance components testing in the longitudinal mixed effects model. Biometrics. 1994; 50(3):1171–7.
 20
Scheipl F, Greven S, Küuchenhoff H. Size and power of tests for a zero random effect variance or polynomial regression in additive and linear mixed models. Comput Stat Data Anal. 2008; 52(7):3283–3299.
 21
Greven S, Crainiceanu C, Küuchenhoff H, Peters A. Restricted likelihood ratio testing for zero variance components in linear mixed models. J Comput Graph Stat. 2008; 17(4):870–91.
 22
Crainiceanu CM, Ruppert D. Likelihood ratio tests in linear mixed models with one variance component. J R Stat Soc Ser B (Statistical Methodology). 2004; 66(1):165–185.
 23
Wang Y. Smoothing spline models correlated random errors. J Am Stat Assoc. 1998; 93:341–8.
 24
Durbán M, Harezlak J, Wand M, Carroll R. Simple fitting of subjectspecific curves for longitudinal data. Stat Med. 2005; 24(8):1153–67.
 25
Ni X, Zhang D, Zhang HH. Variable selection for semiparametric mixed models in longitudinal studies. Biometrics. 2010; 66(1):79–88.
 26
Wager C, Vaida F, Kauermann G. Model selection for penalized spline smoothing using akaike information criteria. Aust N Z J Stat. 2007; 49(2):173–90.
 27
Crainiceanu CM, Ruppert D, Claeskens G, MP W. Exact likelihood ratio tests for penalized splines. Biometrika. 2005; 92(1):91–103.
 28
Hastie T, Tibshirani R. Generalized Additive Models. New York: Wiley Online Library; 1990.
 29
Wood S. Generalized Additive Models: an Introduction with R. Boca Raton, Florida: CRC press; 2006.
 30
Hastie T, Tibshirani R. Generalized additive models for medical research. Stat Methods Med Res. 1995; 4(3):187–96.
 31
Szczesniak RD, McPhail GL, Duan LL, Macaluso M, Amin RS, Clancy JP. A semiparametric approach to estimate rapid lung function decline in cystic fibrosis. Ann Epidemiol. 2013; 23(12):771–7.
 32
Szczesniak RD, Li D, Raouf AS. Semiparametric mixed models for medical monitoring data: An overview. J Biom Biostat. 2015; 6(2):234. https://doi.org/10.4172/21556180:1000234.
 33
Hardle W, Bowman AW. Bootstrapping in nonparametric regression: Local adaptive smoothing and confidence bands. J Am Stat Assoc. 1988; 83(12):102–110.
 34
Khan HA, Sobki SH, Alhomida AS. Regression analysis for testing association between fasting blood sugar and glycated hemoglobin in diabetic patients. Biomed Res. 2015; 26(3):604–6.
Acknowledgements
We thank the Diabetic Clinic of Jimma University Specialized Hospital, Ethiopia for giving access to the data and the staff members of the clinic for their support in extracting the information from patient’s medical card.
Funding
This study project was awarded funding from College of Natural Sciences Research and Postgraduate Office. The second author’s Jimma University visit, when this study started, was financially supported by the Africa Interaction, Knowledge, Interchange and Collaboration (KIC) funding (UID: 105298) from the National Research Foundation (NRF) and the University of South Africa under the Foreign general research grant.
Availability of data and materials
The data sets analyzed in this study available from the corresponding author on reasonable request.
Author information
Affiliations
Contributions
BBY and WKY contributed to the study concept and design. ZMN participated in the data collection and checked the data. TTA performed the statistical analyses and drafted the manuscript. LKD reviewed the findings of data analyses and was a major contributor in writing the manuscript. All authors approved the final manuscript.
Corresponding author
Correspondence to Tafere Tilahun Aniley.
Ethics declarations
Ethics approval and consent to participate
Ethical approval to conduct the study and human subject research approval for this study was received from Jimma University, College of Natural Sciences, Research Ethics Committee and the medical director of the Hospital. As the study was retrospective, informed consent was not obtained from the study participants, but data were anonymous and kept confidential.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Aniley, T., Debusho, L., Nigusie, Z. et al. A semiparametric mixed models for longitudinally measured fasting blood sugar level of adult diabetic patients. BMC Med Res Methodol 19, 13 (2019). https://doi.org/10.1186/s128740180648x
Received:
Accepted:
Published:
Keywords
 Diabetes mellitus
 Fasting blood sugar
 Linear mixed model
 Semiparametric mixed model