 Research
 Open Access
 Published:
Dynamic prediction based on variability of a longitudinal biomarker
BMC Medical Research Methodology volume 21, Article number: 104 (2021)
Abstract
Background
Tacrolimus is given postkidney transplant to suppress the immune system, and the amount of drug in the body is measured frequently. Higher variability over time may be indicative of poor drug adherence, leading to more adverse events. It is important to account for the variation in Tacrolimus, not just the average change over time.
Methods
Using data from the University of Colorado, we compare methods of assessing how the variability in Tacrolimus influences the hazard of de novo Donor Specific Antibodies (dnDSA), an early warning sign of graft failure. We compare multiple joint models in terms of fit and predictive ability. We explain that the models that account for the individualspecific variability over time have the best predictive performance. These models allowed each patient to have an individualspecific random error term in the longitudinal Tacrolimus model, and linked this to the hazard of dnDSA model.
Results
The hazard for the variance and coefficient of variation (CV) loading parameter were greater than 1, indicating that higher variability of Tacrolimus had a higher hazard of dnDSA. Introducing the individualspecific variability improved the fit, leading to more accurate predictions about the individualspecific timetodnDSA.
Conclusions
We showed that the individual’s variability in Tacrolimus is an important metric in predicting longterm adverse events in kidney transplantation. This is an important step in personalizing the dosage of TAC posttransplant to improve outcomes posttransplant.
Background
In many medical settings, the variability in patient drug levels is hypothesized to predict adverse events, as higher variability is often indicative of drugdrug interactions, altered drug metabolism, or poor drug adherence. This is the case in the kidney transplant setting: Tacrolimus (TAC) is prescribed posttransplant to suppress the immune system, and the amount of drug in the body is measured frequently via blood draw. In scenarios such as this, it is important to account for the variation in the longitudinal outcome, not just the average change over time. In this paper, we test whether the variability in longitudinal TAC predicts timetode novo Donor Specific Antibodies (dnDSA), an early warning sign of kidney transplant graft failure.
Clinicians need accurate predictions of adverse allograft events in order to know when and how to modify the treatment course for high risk patients, such as altering immunosuppression therapy or increasing followup visit frequencies. This is critically important today, as there exists a national shortage of available kidney donors [1]. If adverse events can be avoided, the available donated organs can be used more effectively, and in turn the kidney transplant waiting list, currently at nearly 100,000 patients [1], will be shorter. Avoiding antibodymediated rejection, or organ rejection due to de novo donorspecific antibodies (dnDSA), is one of the most important steps in achieving longterm graft survival [2]. It has been shown that maintaining proper immunosuppression therapy levels posttransplant is a major determinant in preventing dnDSA. Tacrolimus (TAC) is the primary immunosuppressant used by U.S transplant centers (93%) [3, 4], but it’s therapeutic dose window is narrow and differs by patient.
TAC trough levels, or the lowest amount of drug in the blood prior to the next dose, are measured frequently posttransplant. In previous studies, onedimensional functions of observed TAC values were found to be associated with dnDSA; these functions included the percentage of TAC levels below 5 ng/ml [5], average TAC troughs of less than 8 ng/ml, and the percentage of time TAC was in therapeutic range [6]. Importantly, the variability of TAC was also found to be associated with dnDSA, as high variability likely indicates inadequate drug exposure and possible nonadherence. In previous studies, the coefficient of variation (CV) of TAC, summarized over a specific range of time prednDSA, was predictive of dnDSA. [7–9]. The CV is calculated as the ratio of the standard deviation to the mean, and is a way to standardize the variability of a measurement so it is not dependent on the average value [10].
To complicate the problem further, dnDSA is only screened for periodically, and much less frequently than TAC trough levels are drawn. This makes dnDSA an interval censored outcome, as the exact timing of development is unknown. The majority of studies analyzing the relationship between TAC and dnDSA treat dnDSA as right censored due to the analytic complexities associated with intervalcensored outcomes. A previous study used a joint model for interval censored data that incorporated all longitudinal values of TAC and found that both average TAC and the slope of TAC over time were associated with timetodnDSA [4]. This approach used a linear mixed model for TAC linked by a random intercept and random slope to an interval censored parametric survival model for dnDSA. Joint models such as this have become increasingly popular as a way to simultaneously model longitudinal and timetoevent data. These approaches typically use a standard linear mixed model in combination with a survival model to assess the effect of the longitudinal outcome on the timetoevent outcome. These two model processes can be linked in a number of ways, including shared random effects, the true value, or the slope of the longitudinal trajectory. More recently, research has been done into allowing for a more flexible linking function [11, 12].
In this paper, we build upon the previously mentioned joint model for TAC and dnDSA [4] to test the hypothesis that an individual’s variability of TAC over time is associated with timetodnDSA. We allow for each individual’s TAC trajectory to be normally distributed with heterogeneous variances, which avoids the unlikely assumption that all individuals share a common variance. We test whether this individualspecific variability is associated with timetodnDSA and whether it improves the predictive ability of the model. We also assess the model’s ability to dynamically predict a patient’s probability of developing dnDSA after surviving dnDSAfree for a specified period. This is motivated by the clinical scenario where patients are seen in clinic at 6 months or 1 yearpost transplant dnDSAfree, and the clinician wants to assess the probability that the patient remains dnDSAfree until a future appointment. We demonstrate how these probabilities can be calculated using the joint model, and how accurate these predictions are given the dataset at hand.
In the first section, we introduce the kidney transplant study data. Next, the joint models are formulated and applied to the motivating kidney transplantation dataset. Model estimation and prediction assessments follow, and we end with a discussion of study limitations and next steps.
Kidney transplant study
As a motivating example, we introduce a retrospective study of patients who received a kidney transplant at the University of Colorado between 2007 and 2013. Exclusion criteria included being under 18 at the time of transplant, having a simultaneous liver and kidney transplant or islet cell transplant, having DSA pretransplant or dnDSA within the first week posttransplant, not having any dnDSA screening posttransplant, and not receiving TAC posttransplant. This study was conducted in accordance with the Declaration of Helsinki and was approved by the ethics committee at the Colorado Multiple Institutional Review Board (COMIRB 133137).
Posttransplant, TAC trough levels and dnDSA screening results for each of the 538 included patients were collected for up to 7 years. Because of the unreliability of TAC and dnDSA data within the first week posttransplant, followup data was collected starting at day 8 posttransplant. Individuals had a median of 22 (IQR: 1532) qualifying TAC levels, taking on values from 0 to 30 and roughly following a normal distribution. Screening for dnDSA was conducted at 1, 6, 12 months, annually, and when clinically indicated. Approximately onethird (n=181) had at least one dnDSA during followup, and for the purpose of this study, we focused on timetofirst dnDSA. Additional variables collected included age at transplant, race/ethnicity, and the number of Human Leucocyte Antigen (HLA) mismatches between donor and recipient. A more detailed description of the dataset and measures collected can be found elsewhere [4, 6].
We randomly split the dataset into a training (2/3, N=358) and a testing (1/3, N=180) cohort. The training cohort was used to fit all models outlined in the results, while the testing cohort was used to assess predictive performance.
Methods
General model formulation
Suppose there are N individuals each measured at n_{i} time points, i=1,...,N. Let y_{ij} be the measurement for individual i at time t_{ij},j=1,...,n_{i}. We assume that given the vector of individualspecific random intercept and slope (a_{0i},a_{1i})^{′}, individual outcome measures y_{ij} are independent and normally distributed with mean μ_{ij} and residual variance \(\sigma ^{2}_{l}\), where
Some examples of f, the function that models the time component, include a linear trajectory, i.e. f(t_{ij})=b_{0}+b_{1}t_{ij}; or bspline basis functions, i.e. \(f(t_{ij})=\sum _{k=1}^{K} \beta '_{0k} \psi _{k}(t_{ij})\), where ψ_{k}(t_{ij}) is the k^{th} of K basis functions with coefficient β_{0k} and the number of basis functions K depends on the degree of the splines (e.g. p=3) and number of inner knots (h=3; K=p+h+1). Baseline covariates are represented by x_{1i}, with regression coefficients β1′. The random effects a_{0i},a_{1i} are distributed normally with mean zero and variance/covariance matrix
Importantly, the model allows a flexible residual variance component, \(\sigma ^{2}_{l}\). In this paper, we either set l to 1, which assumes all individuals have a common residual variance, or we set l to i, which allows all individuals to have different residual variances. This flexible variance component avoids the typical assumption that all subjects share one common residual error term, in favor of allowing the model to estimate higher or lower variance depending on the amount of variability in each subject specific trajectory.
The same N individuals are measured for a time to event outcome at varying time points. The outcome is interval censored, as it is only known to happen between two discrete time points and the exact timing is unknown. Let t_{Ri} be the time at which the outcome was detected and t_{Li} the time of the visit immediately preceding t_{Ri}. The actual unknown time to event, t_{i}, occurs in between t_{Li} and t_{Ri}. The hazard for individual i at any time t can be represented by
where h_{0}(t) is the baseline hazard, which can be various flexible distributions, including parametric (typically exponential, Weibull, or gamma), piecewise constant or spline. β_{0} represents the fixed intercept, x_{2i} is the vector of fixed covariates, which can be different from the fixed covariates in Eq. 1, and β_{2} is its coefficient. Of note, we restrict the longitudinal measurements (y_{ij}) such that only those occurring prior to t_{Li} are used in the joint model.
The two submodels (1) and (3) are connected by the shared random effects (a_{0i},a_{1i})^{′}, and some function of the history of the longitudinal biomarker, g_{i}. This function g_{i} can be chosen based on a priori hypotheses, and should include an estimated parameter from the longitudinal model (1). In our case, g_{i} will be a function of the heterogeneous random error of the biomarker, σ_{i}. The association parameters λ_{0},λ_{1}, and λ_{2} represent the estimated risk of event dependent on the average biomarker, slope of the biomarker, and the function, g_{i}.
Conditional on random effects, a_{0i},a_{1i}, the log likelihood can be written for each individual in the model with individualspecific random error (\(\sigma ^{2}_{l}=\sigma ^{2}_{i}\)):
where I_{Ri} is an indicator that the event for individual i is right censored (1=right censored, 0=interval censored), and F_{i}(t) is the cumulative density function at time t.
Model specifications
Four models were fit to test the clinical hypothesis that the variability of the biomarker over time would be associated with the time to event outcome. These are all special cases of the joint model outlined in “General model formulation” section.

Model 1 (M1): Base model. l=1, so all patients have a common residual error, and g_{i}=0, so there are only two shared parameters: the random intercept and slope.

Model 2 (M2): Individualspecific variance model. l=i, so each patient is allowed a different residual error term, and g_{i}=0, so again, there are only two shared parameters: the random intercept and slope.

Model 3 (M3): Individualspecific variance shared model. l=i, so each patient is allowed a different residual error, and g_{i}=σ_{i}, so λ_{2} represents the hazard of event associated with the residual standard error.

Model 4 (M4): Individualspecific coefficient of variation (CV) shared model. l=i, so each patient is allowed a different residual error, and \(g_{i}=\frac {\sigma _{i} }{\frac {1}{n_{i}}\sum _{j=1}^{n_{i}}{y}_{ij}}\), so λ_{2} represents the hazard of event associated with the coefficient of variation. Of note, the CV is calculated here as the estimated standard deviation over the sample mean.
Model selection and predictive ability
The Deviance Information Criterion (DIC) [13] and the WatanabeAkaike information criterion (WAIC) [14] were used for model selection. The DIC is calculated based on the deviance, D(θ)=−2 logL(yθ) and penalizes for the number of effective degrees of freedom in the model. WAIC approximates leaveoneout crossvalidation [14] and works well for hierarchical models where the number of degrees of freedom is not obvious [15]. The best model should have the lowest values of DIC and WAIC; differences of at least 5 points were considered important.
The predictive ability of the four models was assessed using Area Under the Curve (AUC) and Brier Score (BS). The interval censored nature of dnDSA complicates the calculation of AUC and BS. In rightcensored survival settings, the AUC is the proportion of concordant pairs among all subject pairs, where the pair achieves concordance if the risk of event is higher for the subject with the earlier event time [16]. Under interval censoring, it may not be clear which subject in the pair has the earlier event time. When a parametric baseline hazard is employed, such as the Weibull, various methods for calculating the probability that the event time of the first subject is less than the event time of the second have been proposed [17, 18]. We used both the nonparametric estimator approach proposed by Wu and Cook and the probability based methods outlined by Tsouprou to calculate AUC, and only reported the Tsouprou results as the two both showed the same conclusions. BS is defined as the mean squared difference between the predicted probability and the observed outcome for each subject at a prespecified followup time. Due to the interval censored nature of the data, is not always known if the outcome occurred before a certain time, so again we calculate the probability that the patient had the event prior to a prespecified time [17].
Due to the timevarying nature of the models, the AUC and BS can change over time. To assess the predictive ability of the models at differing time points for differing prediction windows, we chose two clinically relevant scenarios: (1) being dnDSAfree at 12 months, given the patient was dnDSAfree at 6 months, and (2) being dnDSAfree at 24 months, given the patient was dnDSAfree at 12 months.
Dynamic prediction of survival probabilities
In order to compare predictive abilities of the proposed models (M1M4), we calculate the dynamic predictions of surviving dnDSAfree at a prespecified time point. Several works show the derivation of the conditional predictive probability of survival for a new individual at time t^{′}, given that the individual has survived up to time t, where t^{′}>t [19, 20]. Let \(y^{t}_{j} \) denote the longitudinal trajectory up to time t, and T^{∗}>t denote that the new individual has survived up to time t. An important step of this calculation is obtaining samples of the conditional posterior distribution of the random effects for this new individual \(a_{N} \left (a_{N}=(a_{N0}, a_{N1})'\right), p\left (a_{N} \mid T^{*} >t, y^{t}_{j}, \theta \right)\), where θ denotes the set of all parameter values of the model. Conditional on the mth MCMC sample θ^{m}, we draw the mth sample of the random effects vector from its posterior distribution
where \(p\left (y^{t}_{j} \mid \theta ^{m}, a_{N}\right)\) and p(T^{∗}>t∣θ^{m},a_{N}) are the conditional probabilities of the longitudinal and survival outcomes, and p(a_{N}∣θ^{m}) is the probability of the random effects vector. We follow ideas in [20] and employ adaptive rejection Metropolis Hastings sampling to draw samples of the random effect (5) and calculate the predicted survival probabilities from the MCMC samples.
Survival probabilities may be obtained using any time points (t^{′} and t, t^{′}>t), but for simplicity and for clinical applicability, we chose to assess two scenarios: (1) being dnDSAfree at 12 months, given the patient was dnDSAfree at 6 months (t=6,t^{′}=12), and (2) being dnDSAfree at 24 months, given the patient was dnDSAfree at 12 months (t=12,t^{′}=24). Oftentimes, the patient comes to clinic for a milestone appointment (say 1 yearpost transplant) appointment, and the clinician wants to assess the probability of the patient continuing to remain dnDSAfree until the next appointment (at 2 years posttransplant, for example). We used this framework to assess and compare the Area Under the Curve (AUC) and Brier Score (BS) for each model under each of the time period scenarios. For particular patients, we also calculated the dynamic predictions at more time points, specifically where t was 12 months and t^{′} ranged from 18 months to 60 months, to demonstrate how the survival probabilities can be calculated for longer followup periods.
Application
Five hundred and thirty eight patients met the final inclusion criteria for this analysis. The dataset was randomly split into a training (2/3, N=358) and a testing (1/3, N=180) cohort. The four models outlined in the methods were fit to the training cohort (N=358). Each model had a longitudinal submodel (Eq. 1) with TAC as the outcome, a fixed intercept and slope for time (f(t_{ij})=b_{0}+b_{1}t_{ij})) and a random intercept and slope for time for variations across individuals. Spline terms for time were tested, but DIC and WAIC indicated that a linear trend was sufficient. For the survival submodels (Eq. 3), various baseline hazard functions were tested, including exponential, Weibull, gamma, and piecewise constant, and the Weibull yielded the best model fit (h_{0}(t)=αt^{α−1}, where α is the Weibull shape parameter). All survival submodels had time to dnDSA as the interval censored outcome, random effects associated with the longitudinal submodel and a vector of baseline covariates for age (younger age: <30, middle age: 30 −49, older age: 50 +), race/ethnicity (Caucasian, African American, Hispanic, and Other), and number of HLA mismatches. More detail on each of these characteristics and why they are hypothesized to be associated with dnDSA can be found elsewhere [6]. All models built upon the base model, M1, which had a shared random intercept and slope. As described in “Model specifications” section, M2 allowed each subject to have an individualspecific residual error term, M3 shared this individualspecific error term with the hazard submodel, and M4 shared the individualspecific coefficient of variation with the hazard submodel. Thus, the final implemented models followed this general structure:
Model estimation and inference was based on the Bayesian framework. Hyperparameters are assigned weakly informative priors:
Denoting u as a generic random variable, we define the following density functions: u∼Normal(μ,σ^{2}) with probability density \(1/(\sigma \sqrt {2\pi })\exp {\{(u\mu)^{2}/2\sigma ^{2}\}}\); u∼Uniform(a,b) with probability density 1/(b−a) for a<u<b; u∼Gamma(r,λ) with probability density λ^{r}u^{r−1}exp(−λu)/ Γ(r); and Ω∼Wishart(R,k) with probability density (∣Ω∣^{(k−p−1)/2}∣R∣^{k/2} exp{−Tr(RΩ/2})/(2^{pk/2}Γ_{p}(k/2)) for k≥p, where p=2, Tr is the trace function and Γ_{p} is the multivariate gamma function. Notably, we defined the prior for log(σ_{l}) as uniform(A, A), where A is sufficiently large (we used A=100).
The posterior distribution of each variable was estimated using Markov Chain Monte Carlo (MCMC) simulations. A Gibbs sampler was used to form two Markov chains using JAGS software [21] and the runjags package in R [22]. Convergence of the samples was assessed by trace plot inspection, and Gelman and Geweke tests which test for equality of the means of the first and last part of a Markov chain [23]. After a burnin period of 20,000 and thinning of 50 of 100,000 sampling iterations, 2,000 samples per chain were used for inference. On average, each model took approximately 3.5 hours to run on a PC with two Intel Xeon X5650 processors and 128 GB RAM. For dynamic prediction, 500 MCMC samples were used, and within each sample, 500 iterations of the Metropolis Hastings algorithm were employed to estimate the random effects. Calculating a prediction for one subject took approximately 40 minutes on the same PC, using parallel processing on 12 cores. A reproducible example of M4, including code to fit the model and calculate dynamic predictions, can be found in the Supplemental Materials.
Results
A comparison of WAIC and DIC for each model, M1M4, is reported in Table 1. As shown by lower DIC and WAIC, all three models that allowed for individualspecific variation (M2, M3, M4), performed better than the base model that forced a common residual error term. The model that fit the data best according to DIC allowed for individualspecific residual error and shared the CV term with the hazard of event. M2 had the lowest WAIC. M2M4 also had superior predictive ability of dnDSAfree survival compared to M1, demonstrated by higher AUC and lower BS (Fig. 1 and Table S1).
The model estimates are found in Table 2. The resulting estimates for the TAC submodel (fixed intercept, slope, covariance matrix of random intercept and slope) are similar across the M1M4. M1 has a common random error term for all patients, which was estimated as 7.15 (95% Credible Interval (CrI): 6.93, 7.36). All other models allowed for residual error to vary by patient, and the estimated standard deviations from M4 ranged from 0.46 to 11.16 (Fig. 2). The hazard ratios for all baseline covariates (β_{0}−β_{6}) are similar in magnitude across the models. The loading parameters for the shared random effects (λ_{0},λ_{1}) are similar across the models, although the λ_{1} is slightly lower for M1 compared to the other three. λ_{2} represents the effect the individualspecific residual standard deviation (M3) and coefficient of variation (M4) on the hazard of dnDSA. For every one unit increase in an individual’s estimated residual standard deviation, the risk of dnDSA increases 1.25fold (95% CrI: 1.00, 1.54), and for every one unit increase in an individual’s estimated CV, risk of dnDSA increases 1.40fold (95% CrI: 0.12, 5.03).
Dynamic predictions for experiencing dnDSAfree survival were calculated for two individuals in Fig. 3. Both individuals were dnDSAfree at 12 months, and one went on to experience a dnDSA event, while the other did not. The probability of surviving dnDSAfree from 12 months to 60 months, by an increment of 6 months, was calculated for each individual, separately using the output from M1 and M4. In the lefthand panel is an individual who had a low CV of TAC during the first year posttransplant. This patient did not go on to experience a dnDSA event within 5 years posttransplant, and the estimated probability of dnDSAfree survival is higher using M4 compared to M1. In the righthand panel, an individual who had a high CV of TAC during the first year posttransplant is presented. This patient went on to develop TAC sometime between 48 and 54 months posttransplant, and the patient’s probability of dnDSAfree survival is lower using M4, compared to M1. For both of these individuals, M4 yielded a better prediction of dnDSA compared to M1.
Simulation study
We conducted a simulation study to evaluate the performance of a model that allowed for individualspecific variances on a finite sample. We used M4 to simulate data, as this was the model with the most complexity. The true values of each parameter, listed in Table 3, were set to be close to the resulting estimates from the training dataset. Simulation results were obtained from 200 datasets, each with 200 individuals. Models were run for 1,000 iterations, after a 1,000 iteration burnin time. As shown in Table 3, all empirical means (means of the 200 estimates) are close to the true values, except there was a small bias in α, the shape of the Weibull baseline hazard function, and in λ_{2}, the association parameter for g_{i}, the individualspecific CV.
Discussion
Tacrolimus is the most important immunosuppressant drug in solid organ transplantation despite having a very narrow therapeutic window. Ensuring appropriate drug exposure within this window is critical in order to maintain health of the graft while avoiding drug toxicities. However, understanding intrapatient drug exposure has been challenging, as only using average drug levels, or onedimensional summary statistics of drug levels, has several disadvantages that don’t represent a complete picture of TAC exposure over time. The variability of TAC, using all longitudinal measures, may be a useful way to help characterize TAC drug exposure, as variability often leads to more time out of therapeutic range and places patient at risk for adverse outcomes [6]. In this paper, we demonstrated that incorporating an individualspecific variance term into the modeling of TAC and dnDSA indeed improves model fit and prediction of dnDSA.
The random effects in the joint model induce the dependence between the two submodels, which allows us to test whether the variance and CV of longitudinal TAC is associated with timetodnDSA, while accounting for the unobserved heterogeneity in these functions. Unlike the random effects, the biomarker’s variability (embodied in the CV) has an intuitive clinical interpretation. In our joint model context, its contribution is readily quantified as a hazard of experiencing a dnDSA event. If we observe two patients that only differ in the CV of TAC by one unit, everything else being equal for a specific time, the hazard of dnDSA is 1.4times higher for the individual with the larger CV. Similarly, if two patients differ only in the estimated random error of TAC by 1 unit, the hazard of dnDSA is 1.3fold higher for the patient with the higher variability.
The number of variance parameters estimated by M2M4 increases as the sample size increases. These exchangeable randomeffects arise naturally in Bayesian hierarchical modeling, enabling a flexible individuallevel structure of dependence, and computationally they are very convenient. The estimates at the individuallevel shrink towards a common populationlevel estimate, which intuitively allows for better modeling of any individual’s longitudinal trajectory by borrowing information from similar subjects. In other words, exchangeability allows the use of information from the entire cohort to strengthen the inference for any individual subject. In Bayesian jargon, this is called “borrowing strength” [24]. The DIC and WAIC metrics both account for the effective degrees of freedom, particularly important in models with complex hierarchies of random effects; and despite additional parameters, the models with individualspecific variances proved to be superior to the model with homogeneous variance (Table 1). Additionally, the individuallevel parameters allow individualspecific estimation, improving the dynamic predictions (as described in “Dynamic prediction of survival probabilities” section).
The models with heterogeneous variance (M2M4) proved to have superior predictive performance than M1, as shown by AUC and BS. The AUC was generally higher for predicting at t=12 months, compared to t=6 months, perhaps because we have more available longitudinal measurements to use in the prediction at the later time point. The BS was lower for the scenario of t=6 and t^{′}=12, compared to the scenario of t=12 and t^{′}=24, perhaps because the window of prediction was shorter (6 versus 12 months). Dynamic predictions using this model may be employed as a tool of personalized medicine, providing a guide for the clinical decisions, e.g. by modifying interventions or acting as alerts in responding to the longitudinal profile of the patient. As seen in Fig. 3, incorporating the variability in to the model improved the predictions for these two individuals, which highlights the need to account for the overall trajectory and variability of TAC when assessing risk of dnDSA. Models such as these could be applied to any disease that is treated with medications that have a narrow therapeutic window that require drug monitoring, such as glomerularnephritides, autoimmune conditions, and some cancers.
We have employed some of the most used parameterizations of joint models (e.g. sharing the random effects) as well as one that is less common, which considers the sharing of other characteristics of the longitudinal trajectory, namely the standard deviation of the residual error. In future work, we plan to explore alternatives to JAGS, such as Nimble [25] or Stan [26], as these approaches may speed up the computational time needed to perform the dynamic predictions. Another natural extension of this work would be to adopt the idea of the latent class joint models, which could be used to identify the patients whose prognostic can be improved by adjusting the dose of TAC and patients for whom the dose adjustment will fail. After model adjustment, subjects of different latent classes can be easily interpreted with survival probabilities plots, thus providing a good prognostic tool.
Conclusion
Using a joint model with a flexible linkage function, we demonstrated that an individual’s variability of TAC over time is important in dynamically predicting dnDSA post kidney transplant. The proposed model improved prediction of dnDSA over a method that did not model the variability in TAC, and has potential of improving clinical outcomes in this area of personalized medicine.
Availability of data and materials
The datasets generated and/or analysed during the current study are not publicly available due to being owned by the University of Colorado Transplant Center and containing protected health information, but are available from the corresponding author on reasonable request.
Declarations
Abbreviations
 TAC:

Tacrolimus
 dnDSA:

De novo donor specific antibody
 JM:

Joint model
 MCMC:

Markov Chain Monte Carlo
 CrI:

Credible interval
 HLA:

Human leucocyte antigen
 JAGS:

Just another Gibbs sampler
 DIC:

Deviance information criterion
 WAIC:

WatanabeAkaike information criterion
 IC:

Interval censoring
References
 1
Hart A, Smith J, Skeans M, Gustafson S, Stewart D, Cherikh W, Wainright J, Kucheryavaya A, Woodbury M, Snyder J, et al. OPTN/SRTR 2015 annual data report: kidney. Am J Transplant. 2017; 17(S1):21–116.
 2
Sellares J, de Freitas D, Mengel M, Reeve J, Einecke G, Sis B. Understanding the causes of kidney transplant failure: The dominant role of antibodymediated rejection and nonadherence. Am J Transplant. 2011; 2(12):388–99.
 3
Matas A, Smith J, Skeans M, Thompson B, Gustafson S, Schnitzler M, Stewart D, Cherikh W, Wainright J, Snyder J, et al. OPTN/SRTR 2012 annual data report: kidney. Am J Transplant. 2014; 14(S1):11–44.
 4
Campbell KR, JuarezColunga E, Grunwald GK, Cooper J, Davis S, Gralla J. Comparison of a timevarying covariate model and a joint model of timetoevent outcomes in the presence of measurement error and interval censoring: application to kidney transplantation. BMC Med Res Methodol. 2019; 19(1):130.
 5
Wiebe C, Rush DN, Nevins TE, Birk PE, BlydtHansen T, Gibson IW, Goldberg A, Ho J, Karpinski M, Pochinco D, et al. Class II eplet mismatch modulates tacrolimus trough levels required to prevent donorspecific antibody development. J Am Soc Nephrol. 2017; 28(11):3353–62.
 6
Davis S, Gralla J, Klem P, Tong S, Wedermyer G, Freed B, Wiseman A, Cooper JE. Lower tacrolimus exposure and time in therapeutic range increase the risk of de novo donorspecific antibodies in the first year of kidney transplantation. Am J Transplant. 2018; 18(4):907–15.
 7
SapirPichhadze R, Wang Y, Famure O, Li Y, Kim SJ. Timedependent variability in tacrolimus trough blood levels is a risk factor for late kidney transplant failure. Kidney Int. 2014; 85(6):1404–11.
 8
Rodrigo E, San Segundo D, FernándezFresnedo G, LópezHoyos M, Benito A, Ruiz JC, de Cos MA, Arias M. Withinpatient variability in tacrolimus blood levels predicts kidney graft loss and donorspecific antibody development. Transplantation. 2016; 100(11):2479–85.
 9
Huang CT, Shu KH, Ho HC, Wu MJ. Higher variability of tacrolimus trough level increases risk of acute rejection in kidney transplant recipients. In: Transplantation Proceedings. Elsevier: 2016. p. 1978–80.
 10
Curto JD, Pinto JC. The coefficient of variation asymptotic distribution in the case of noniid random variables. J Appl Stat. 2009; 36(1):21–32.
 11
Rizopoulos D, Ghosh P. A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a timetoevent. Stat Med. 2011; 30(12):1366–80.
 12
Martins R. A Bayesian joint dispersion model with flexible links In: Argiento R, Lanzarone E, Antoniano Villalobos I, Mattei A, editors. Bayesian Statistics in Action. BAYSM2016. Springer Proceedings in Mathematics and Statistics. Springer, Cham: 2017. p. 39–49.
 13
Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian measures of model complexity and fit. J R Stat Soc Ser B Stat Methodol. 2002; 64(4):583–639.
 14
Watanabe S. Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J Mach Learn Res. 2010; 11:3571–94.
 15
Gelman A, Hwang J, Vehtari A. Understanding predictive information criteria for Bayesian models. Stat Comput. 2014; 24:997–1016.
 16
Kamarudin AN, Cox T, KolamunnageDona R. Timedependent ROC curve analysis in medical research: current methods and applications. BMC Med Res Methodol. 2017; 17(1):53.
 17
Tsouprou S, Putter H, Fiocco M. Measures of discrimination and predictive accuracy for interval censored survival data, [Master’s Thesis]: Leiden University. http://www.math.leidenuniv.nl/scripties/MasterTsouprou.pdf.
 18
Wu Y, Cook RJ. Assessing the accuracy of predictive models with intervalcensored data. Biostatistics. 2020. https://doi.org/10.1093/biostatistics/kxaa011.
 19
Rizopoulos D. Dynamic predictions and prospective accuracy in joint models for longitudinal and timetoevent data. Biometrics. 2011; 67(3):819–29.
 20
Li K, Luo S. Dynamic predictions in Bayesian functional joint models for longitudinal and timetoevent data: An application to Alzheimer’s disease. Stat Methods Med Res. 2019; 28(2):327–42.
 21
Plummer M. JAGS Version 3.3.0 user manual. Lyon: International Agency for Research on Cancer; 2012.
 22
Denwood MJ. runjags: An R package providing interface utilities, model templates, parallel computing methods and additional distributions for MCMC models in JAGS. J Stat Softw. 2016; 71(9):1–25.
 23
Ntzoufras I, Vol. 698. Bayesian Modeling Using WinBUGS. Hoboken, NJ: John Wiley & Sons; 2011.
 24
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. Boca Raton, FL: CRC press; 2013.
 25
de Valpine P, Turek D, Paciorek CJ, AndersonBergman C, Lang DT, Bodik R. Programming with models: writing statistical algorithms for general model structures with nimble. J Comput Graph Stat. 2017; 26(2):403–13.
 26
Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker MA, Guo J, Li P, Riddell A. Stan: a probabilistic programming language. Grantee Submission. 2017; 76(1):1–32.
Acknowledgments
Not applicable.
Funding
EJC was partially funded, thorough a sabbatical stay at the National Autonomous University of Mexico, by the National Council of Science and Technology of Mexico (FORDECyT 265667), and the National Institute of Neurobiology and the International Laboratory for Human Genome Research both in Mexico.
Author information
Affiliations
Contributions
KC, EJC and RM performed all statistical analysis mentioned in the paper. SD provided clinical context and guidance. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
The study was approved by the ethics committee at the Colorado Multiple Institutional Review Board (COMIRB 133137), who waived the need for informed consent due to the retrospective nature of the data collection.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1
Dynamic prediction based on variability of a longitudinal biomarker Supplemental Materials
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Campbell, K.R., Martins, R., Davis, S. et al. Dynamic prediction based on variability of a longitudinal biomarker. BMC Med Res Methodol 21, 104 (2021). https://doi.org/10.1186/s1287402101294x
Received:
Accepted:
Published:
Keywords
 Dynamic prediction
 Kidney transplant
 Survival