Using an onset-anchored Bayesian hierarchical model to improve predictions for amyotrophic lateral sclerosis disease progression
BMC Medical Research Methodology volume 18, Article number: 19 (2018)
Abstract
Background
Amyotrophic Lateral Sclerosis (ALS), also known as Lou Gehrig’s disease, is a rare disease with extreme between-subject variability, especially with respect to rate of disease progression. This makes modelling a subject’s disease progression, which is measured by the ALS Functional Rating Scale (ALSFRS), very difficult. Consider the problem of predicting a subject’s ALSFRS score at 9 or 12 months after a given time point.
Methods
We obtained ALS subject data from the Pooled Resource Open-Access ALS Clinical Trials Database, a collection of data from various ALS clinical trials. Due to the typical linearity of the ALSFRS, we consider several Bayesian hierarchical linear models. These include a mixture model (to account for the two potential classes of “fast” and “slow” ALS progressors) as well as an onset-anchored model, in which an additional artificial data point, using time of disease onset, is utilized to improve predictive performance.
Results
The onset-anchored model had drastically reduced posterior predictive mean-square-error distributions when compared to the Bayesian hierarchical linear model or the mixture model under a cross-validation approach. No covariates, other than time of disease onset, consistently improved predictive performance in either the Bayesian hierarchical linear model or the onset-anchored model.
Conclusions
Augmenting patient data with an additional artificial data point, or onset anchor, can drastically improve predictive modelling in ALS by reducing the variability of estimated parameters at the cost of a slight increase in bias. This onset-anchored model is extremely useful if predictions are desired directly after a single baseline measure (such as at the first day of a clinical trial), a feat that would be very difficult without the onset anchor. This approach could be useful in modelling other diseases that have bounded progression scales (e.g. Parkinson’s disease, Huntington’s disease, or inclusion-body myositis). It is our hope that this model can be used by clinicians and statisticians to improve the efficacy of clinical trials and aid in finding treatments for ALS.
Background
Amyotrophic Lateral Sclerosis (ALS) is a rare neurodegenerative disease which exhibits extreme between-subject variability. Progression of ALS is typically measured by the ALS Functional Rating Scale (known as the ALSFRS, or with additional respiratory questions, the revised ALSFRS-R). The ALSFRS is a physician-reported outcome on a scale of 0–40 which grades common activities of daily living like dressing, eating, and walking. An ALSFRS score of 40 corresponds to normal function, and this score will decrease as the disease progresses. The ALSFRS, which is usually non-increasing, has been shown to decrease in a linear fashion over the course of a typical clinical trial (6 months to 1 year) [1, 2], although the linearity is disputed over long periods of time [3].
Faster disease progression is consistently associated with lowered survival [2, 4,5,6,7,8], although many of the clinical measurements shown to be associated with survival (e.g. region of symptom onset and use of Riluzole, the only FDA-approved drug for ALS) are not significantly associated with disease progression [9,10,11]. As rates of progression on the ALSFRS are often used in phase II and III clinical trials, more accurate predictive models would help researchers improve trial efficiency. For purposes of imputation and adaptive trial simulation, it may be more desirable to consider prediction of the actual ALSFRS as an endpoint, rather than its slope. Furthermore, ALS patients and their doctors may also gain more utility out of predicting individual ALSFRS scores rather than slope.
Our aim was to develop a predictive Bayesian hierarchical model which could be used to predict individual ALSFRS scores after 1 year from trial beginning using at most the first 3 months of clinical trial data. Our baseline model is a Bayesian hierarchical linear model, which is similar to a linear mixed effects model. We then compared the predictive power of this baseline model to those provided by a Bayesian mixture model and a Bayesian onset-anchored hierarchical linear model. The onset-anchored model leverages an additional data point for each patient which assumes maximum ALSFRS score at the time of disease onset. Note that the approach of using an onset anchor is applicable in modelling other diseases which utilize a bounded rating scale (Parkinson’s disease, Huntington’s disease, etc.). We additionally consider variable selection to improve model predictive accuracy, as well as consider model robustness when less than 3 months of data are available.
Methods
Study population
The datasets analyzed during this study are available in the Pooled Resource Open-Access ALS Clinical Trials database (PRO-ACT) (https://nctu.partners.org/ProACT/) [12]. In 2011, Prize4Life, in collaboration with the Northeast ALS Consortium, and with funding from the ALS Therapy Alliance, formed the PRO-ACT Consortium. The data available in the PRO-ACT Database has been volunteered by PRO-ACT Consortium members. As of December 2015, PRO-ACT had 4838 unique subjects, each having at least one reported ALSFRS or ALSFRS-R score. As PRO-ACT is a collection of data from clinical trials, we further subset this data to only include subjects that were receiving placebos. This resulted in 1301 subjects to be considered for analysis. One patient was later dropped due to having no data entered for self-reported disease onset time, bringing the final number of subjects to 1300. For more demographic information on these subjects, see Table 1.
For these 1300 subjects, we used ALSFRS scores to measure disease progression. The ALSFRS score is bounded between 0 and 40, and is typically non-increasing. Patients with ALSFRS-R scores, the revised ALSFRS, had their scores converted to the ALSFRS by summing the scores from the first nine questions of the ALSFRS-R (which concern motor and bulbar function) as well as the score from the first respiratory question, R1: Dyspnea.
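The conversion described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function name and the dictionary keys (`Q1` to `Q9`, `R1`, `R2`, `R3`) are hypothetical labels for the questionnaire items, each assumed to be scored from 0 to 4.

```python
def alsfrs_from_alsfrsr(items):
    """Convert ALSFRS-R item scores to an ALSFRS total on the 0-40 scale.

    `items` maps hypothetical item labels to scores in 0..4: "Q1".."Q9"
    for the first nine questions and "R1" for dyspnea. Any other keys
    (for example the remaining respiratory items) are ignored.
    """
    keys = [f"Q{k}" for k in range(1, 10)] + ["R1"]
    scores = [items[k] for k in keys]
    if any(not 0 <= s <= 4 for s in scores):
        raise ValueError("each item score must lie between 0 and 4")
    return sum(scores)


# Example: full function on every item except a one-point loss on dyspnea.
example = {f"Q{k}": 4 for k in range(1, 10)}
example.update({"R1": 3, "R2": 4, "R3": 4})  # R2 and R3 do not contribute
print(alsfrs_from_alsfrsr(example))
```

Summing ten items of at most 4 points each keeps the converted score on the same 0 to 40 scale as the original ALSFRS.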
Model comparison
Our objective was to build a predictive model with which we could use the first 3 months of a subject’s data to determine their ALSFRS score at 1 year. As very few subjects had a measurement at exactly 1 year, we instead used the model to predict each subject’s first score after day 365, denoted as FRS_{365}. We chose to predict after 12 months because that is a commonly used endpoint in ALS trials (specifically, only 4 of 18 recent ALS trials had endpoints shorter than 1 year [13]; due to the linearity of the ALSFRS decline over timespans shorter than 1 year it stands to reason that a linear model which performs well at 12 months would perform well for shorter endpoints). Three months was chosen as the cutoff because: 1) this was the window used in the DREAM ALS Stratification Prize4Life Challenge [14]; 2) 3 months represented a reasonable amount of time for making 12-month predictions; and 3) it is a time frame with utility for both adaptive trial designs and for imputing missing data. Ideally, this model would be accurate even when less than 3 months of subject data are available.
Large amounts of variability are inherently associated with any ALS model. Bayesian hierarchical models excel at capturing many sources of variability, which can then be reported via posterior predictive credible intervals. A credible interval is preferable for its interpretability: in the framework of a Bayesian model, there is a 95% chance that a subject’s FRS_{365} is within the 95% credible interval. Note that while the gold standard for confidence intervals is 90% or 95%, this is done to control the type I error rate. A credible interval, being a statement of probability, has no such restriction, and thus is useful with even lower credible levels, such as 70% or 80% [15].
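As an illustration of how such an interval is read off in practice, the sketch below computes equal-tailed credible intervals from a set of posterior predictive draws. The draws are simulated here from an arbitrary distribution purely for demonstration; in an actual analysis they would come from the fitted model. The function name and the order-statistic rule are assumptions of this sketch.

```python
import math
import random


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from a list of posterior draws.

    Uses conservative order statistics: the lower index is rounded down
    and the upper index is rounded up, so the interval contains at least
    a fraction `level` of the draws.
    """
    if not draws:
        raise ValueError("draws must not be empty")
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    ordered = sorted(draws)
    last = len(ordered) - 1
    tail = (1 - level) / 2
    lower = ordered[math.floor(tail * last)]
    upper = ordered[math.ceil((1 - tail) * last)]
    return lower, upper


# Hypothetical posterior predictive draws for one subject's FRS_365,
# clipped to the 0-40 range of the scale (values are illustrative only).
rng = random.Random(0)
draws = [min(40.0, max(0.0, rng.gauss(22.0, 6.0))) for _ in range(5000)]

for level in (0.95, 0.80, 0.70):
    low, high = credible_interval(draws, level)
    print(f"{level:.0%} credible interval: [{low:.1f}, {high:.1f}]")
```

Lower credible levels give narrower intervals from the same draws, which is why a 70% or 80% interval can still be informative when the 95% interval is wide.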
We considered three types of models, which are described below: a Bayesian hierarchical linear model, a Bayesian hierarchical mixture model, and a Bayesian onset-anchored hierarchical linear model. Note that these models are all linear with respect to time. This is largely because a patient in PRO-ACT typically has only one ALSFRS measurement per month, which causes more complicated models, such as 3-parameter sigmoidal curves, to suffer from convergence problems. Linearity is also convenient because the slope parameter can be used as a simple-to-interpret measure of the disease’s rate of progression.
The models were compared by the distribution of their posterior mean-square-error (MSE) resulting from a cross-validation analysis. Cross-validation entails splitting the data into a training set with which to build the model, and a validation set with which to assess the model’s predictive power [15]. We looked at 10 randomly-sampled validation sets for each model, and found the results of the cross-validation to be very robust across the various training/validation splits. The posterior distribution of the MSE, denoted \( \overset{\sim }{MSE} \), is defined as follows: for each subject i, take the square of the difference between their true FRS_{365, i} and their posterior predictive distribution for FRS_{365, i}, denoted \( {\overset{\sim }{FRS}}_{365,i} \). Sum this over all subjects in the validation set, adjusting for the size of the validation set. In other words \( \overset{\sim }{MSE}=\sum \limits_{i=1}^n\frac{{\left({\overset{\sim }{FRS}}_{365,i}-FR{S}_{365,i}\right)}^2}{n} \).
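The definition above yields one MSE value per posterior predictive draw, and therefore a distribution of MSE values. A minimal sketch of that computation follows; the function name and the layout of `pred_draws` (one list of per-subject predictions per draw) are assumptions made for illustration.

```python
def posterior_mse(pred_draws, observed):
    """Return one MSE value per posterior predictive draw.

    pred_draws[s][i] is the s-th posterior predictive draw of FRS_365
    for validation subject i; observed[i] is that subject's true FRS_365.
    """
    n = len(observed)
    mse_draws = []
    for draw in pred_draws:
        if len(draw) != n:
            raise ValueError("each draw needs one prediction per subject")
        squared_errors = [(p - y) ** 2 for p, y in zip(draw, observed)]
        mse_draws.append(sum(squared_errors) / n)
    return mse_draws


# Toy example with three validation subjects and two draws.
observed = [30.0, 22.0, 15.0]
pred_draws = [
    [31.0, 20.0, 15.0],  # squared errors 1, 4, 0
    [28.0, 22.0, 18.0],  # squared errors 4, 0, 9
]
print(posterior_mse(pred_draws, observed))
```

Summarizing the returned list (its mean and spread) gives the quantities used to compare models.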
In order to be in the validation set, subjects needed at least one ALSFRS score after 1 year from baseline. Again, as the ALSFRS score at 1 year was not specifically observed for most patients, we instead predicted FRS_{365}, the subject’s first score after 365 days. Of subjects who had at least 1 year of data, the FRS_{365} measurement was taken on average at 386.7 days, with standard deviation of 23.7 days and maximum of 577 days. The same training and validation sets were used to validate all three models.
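Selecting FRS_{365} from a subject's visit history can be sketched as below. The function name and the `(day, score)` tuple layout are hypothetical, and the sketch assumes that a visit on exactly day 365 counts, following the "at, or after, 365 days" wording used later in the Results.

```python
def first_score_at_or_after(visits, day=365):
    """Return the (day, score) pair of the earliest visit on or after `day`.

    `visits` is a list of (days_from_baseline, alsfrs_score) tuples in any
    order. Returns None when the subject has no qualifying visit, in which
    case the subject cannot be used for validation.
    """
    later = [visit for visit in visits if visit[0] >= day]
    if not later:
        return None
    return min(later, key=lambda visit: visit[0])


visits = [(0, 36), (91, 33), (400, 24), (370, 25), (182, 30)]
print(first_score_at_or_after(visits))
```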
All analysis was done using R [16], OpenBUGS [17], and the R package R2OpenBUGS [18]. Pseudocode which describes the model in more detail is provided in Additional file 1.
Bayesian hierarchical linear model
Since ALS seems to progress linearly over most of the 1 year time frames in the PRO-ACT database, we started with a linear hierarchical Bayes mixed effects model with weak and uninformative priors. Specifically, the ALSFRS for subject i at time t is modeled as
ALSFRS_{ i }(t)~T_{3}(a_{ i } + b_{ i }t, σ^{2}),
truncated to ALSFRS_{ i }(t) ∈ [0, 40], which is easily done in OpenBUGS. T_{3} denotes the centered non-standardized t-distribution with 3 degrees of freedom and non-standardized variance σ^{2}. Note that a standardized t-distribution with 3 degrees of freedom would instead have a variance of 1. Parameters a_{ i } and b_{ i } are the subject-specific intercept and slope terms. A t-distribution with 3 degrees of freedom was chosen because a normal distribution was too narrow in the tails. Additionally, we observed that the residuals from simple linear regression on subjects (with sufficient amounts of data) followed a T_{3} distribution extremely well (see Fig. 1). To further justify this, we also observed a massive decrease in the model deviance information criterion (DIC) when using T_{3} versus the normal distribution. A reduction of more than 10 to DIC is typically associated with improved model fit; using the T_{3} over the normal resulted in a DIC reduction of over 1700.
Continuing the model description, the subject-specific parameters a_{ i } and b_{ i }, in turn, have the following distributions: a_{ i }~N(p_{0}, σ_{0}^{2}) and b_{ i }~N(p_{1}, σ_{1}^{2}),
Where a_{ i } is truncated to a_{ i } ∈ [0, 40] and b_{ i } is truncated to b_{ i } ∈ (−∞, 0]. Weak priors from the literature and discussions with clinicians were assumed for p_{0} and p_{1}. Specifically, Castrillo-Viguera et al. [19] reported that the ALSFRS-R decline in one database is roughly −0.92 units per month with standard error of 0.08. This translates to roughly an ALSFRS decline of −0.025 per day, and leads us to the following priors:
Where the increased error in p_{1} allows for more strength in the analysis to come from the data. Generally subjects with low baseline ALSFRS scores are not enrolled in clinical trials, and the prior for p_{0} was chosen to reflect this while still allowing a wide range of potential starting ALSFRS values. Uninformative priors were assigned to the remaining variables: \( {\sigma}^2,{\sigma}_0^2, \) and \( {\sigma}_1^2 \) are each given the prior Γ^{−1}(0.001,1000), which is equivalent to \( \frac{1}{\sigma^2}\sim \Gamma \left(0.001,0.001\right) \).
Such a Bayesian model, aside from the weakly informed priors on p_{ i }, was suggested by Gomeni et al. [20]. A key advantage to hierarchical modelling in this way is that it allows for shrinkage of error resulting from sample means [21, 22], and also lets subjects with fewer data points “borrow” information from the remaining population. The Bayesian analysis also has advantages with respect to interpretability (especially in a clinical setting). This model will be referred to as the “linear model”.
Bayesian hierarchical linear mixture model
A mixture model is useful when each subject belongs to one of several groups, each group having its own specific progression distribution. Specifically, Gomeni et al. [20] suggested that ALS subjects could be classified as either “fast” or “slow” progressors. To model this, we assume each subject is either a fast or slow progressor, and assume that each group has its own average rate of disease progression (parameterized by the mean of the subject-specific slope). We further assume the slope parameter for fast progressors is strictly steeper (more negative) than that of slow progressors.
The ALSFRS for subject i at time t is still ALSFRS_{ i }(t)~T_{3}(a_{ i } + b_{ i }t, σ^{2}) truncated to ALSFRS_{ i }(t) ∈ [0, 40], but now we let \( {b}_i\sim N\left(\Lambda, {\sigma}_1^2\right) \) truncated to b_{ i } ∈ (−∞, 0]. This starts the mixture process, with Λ being either Λ_{1} or Λ_{2} = (Λ_{1} + c), where c is a positive constant, with probability Pr(Λ = Λ_{ i }) = π_{ i }. Finally, we use the following priors: π_{ i }~Dirichlet(1, 1), \( {\Lambda}_1\sim N\left(0,{\sigma}_{\Lambda_1}^2\right) \). The error terms \( {\sigma}^2,{\sigma}_1^2,{\sigma}_0^2,{\sigma}_{\Lambda_1}^2 \) are all assigned uninformative priors of Γ^{−1}(0.001,1000) and we assign c~N(0,100) truncated to c ∈ [0, ∞). All other priors and parameters are specified as in the linear model. This model shall be referred to as the “mixture model”.
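To make the mixture structure concrete, the sketch below draws subject-specific slopes from a two-class mixture with the truncation to non-positive values described above. It only simulates from fixed, made-up parameter values and does not perform any posterior inference; the function and argument names are hypothetical. It also assumes that Λ_{1} is the mean of the fast class, which follows from Λ_{2} = Λ_{1} + c with c ≥ 0 but is not stated explicitly in the text.

```python
import random


def draw_slope(rng, prob_fast, lambda_fast, gap, sigma_slope):
    """Draw one subject-specific slope b_i from a two-class mixture.

    With probability `prob_fast` the class mean is `lambda_fast`
    (Lambda_1); otherwise it is `lambda_fast + gap` (Lambda_2, with
    gap >= 0). The normal draw is truncated to (-inf, 0] by rejection.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    is_fast = rng.random() < prob_fast
    class_mean = lambda_fast if is_fast else lambda_fast + gap
    while True:
        slope = rng.gauss(class_mean, sigma_slope)
        if slope <= 0:
            return slope


# Illustrative values only (ALSFRS points per day); not estimates from the paper.
rng = random.Random(7)
slopes = [draw_slope(rng, 0.4, -0.04, 0.03, 0.01) for _ in range(1000)]
print(min(slopes), max(slopes))
```

Because the slow-class mean is defined as an offset from the fast-class mean, the ordering of the two classes is fixed by construction, which avoids label switching between the components.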
Bayesian onset-anchored hierarchical linear model
This model resembles the linear model in structure, but uses an idea first introduced by Proudfoot et al. [23]. The idea was to create an additional artificial data point, referred to as the “onset anchor”. We do this by assuming that each subject had an ALSFRS score of 40 (the maximum possible score) at their time of disease onset (see Fig. 2). Aside from this artificial data point, the parameters and model specification remain identical to those given in the linear model. This model is referred to as the “onset-anchored model”.
Proudfoot et al. [23] used the assumption of a maximum ALSFRS score at disease onset to create a slope between the onset anchor and the first observed ALSFRS score, which was then used as a predictor for measuring a patient’s disease progression. Our onset-anchored model, however, treats this additional artificial data point as an observed value (specifically, a leverage point) in the modelling framework.
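The mechanics of the anchor can be illustrated with ordinary least squares on a single subject. This is a deliberate simplification: the model in this paper is a Bayesian hierarchical model with T_{3} errors, whereas the sketch below fits an unpooled straight line. Function names and the example numbers are hypothetical.

```python
def ols_line(points):
    """Least-squares intercept and slope for a list of (time, score) pairs."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    slope = sum((t - mean_t) * (y - mean_y) for t, y in points) / sxx
    return mean_y - slope * mean_t, slope


def add_onset_anchor(visits, days_from_onset_to_baseline, max_score=40.0):
    """Prepend the artificial point (time of onset, maximum score).

    Time is measured in days from baseline, so disease onset sits at a
    negative time.
    """
    return [(-days_from_onset_to_baseline, max_score)] + list(visits)


# A subject with only a baseline score of 34, whose onset was 400 days earlier.
visits = [(0, 34.0)]
anchored = add_onset_anchor(visits, 400)
intercept, slope = ols_line(anchored)
prediction = intercept + slope * 365
print(intercept, slope, prediction)
```

Without the anchor a single baseline visit does not determine a slope at all; with it, a line can be fitted and extrapolated to day 365.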
Considering the simplicity of this approach, the addition of a non-random leverage point to aid in model prediction is a surprisingly novel technique. This method will, however, result in a biased linear regression model: specifically we would not expect the difference between the observed FRS_{365} and the mean of the posterior predictive distribution of FRS_{365} to be zero on average (in other words \( E\left({\overset{\sim }{FRS}}_{365,i}-FR{S}_{365,i}\right) \) is not necessarily zero). Recall that the MSE of any prediction is the sum of the square of the prediction bias and the prediction variance. In order for this biased model to predict FRS_{365} well, the reduction in prediction variance needs to dramatically outweigh the increase in prediction bias.
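The trade-off between bias and variance can be seen in a small simulation. The sketch below is not the analysis from this paper: it uses unpooled least squares instead of the Bayesian hierarchical model, scores are not truncated to [0, 40], and every numeric setting (true intercept and slope, noise level, onset time) is invented for illustration.

```python
import random
import statistics


def ols_predict(points, t_new):
    """Fit y = a + b*t by least squares and return the prediction at t_new."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in points) / sxx
    return mean_y + slope * (t_new - mean_t)


def summarize(predictions, truth):
    """Return (bias, variance, mse) of predictions around a known truth."""
    bias = statistics.fmean(predictions) - truth
    variance = statistics.pvariance(predictions)
    mse = statistics.fmean([(p - truth) ** 2 for p in predictions])
    return bias, variance, mse


# Illustrative settings only: a true line that does not pass through 40 at onset.
TRUE_INTERCEPT, TRUE_SLOPE, NOISE_SD = 34.0, -0.03, 2.0
ONSET_DAY, HORIZON = -300, 365
truth = TRUE_INTERCEPT + TRUE_SLOPE * HORIZON

rng = random.Random(1)
plain, anchored = [], []
for _ in range(2000):
    visits = [(t, TRUE_INTERCEPT + TRUE_SLOPE * t + rng.gauss(0, NOISE_SD))
              for t in (0, 30, 60, 90)]
    plain.append(ols_predict(visits, HORIZON))
    anchored.append(ols_predict([(ONSET_DAY, 40.0)] + visits, HORIZON))

for name, preds in (("plain", plain), ("anchored", anchored)):
    bias, variance, mse = summarize(preds, truth)
    print(f"{name:9s} bias={bias:7.2f} variance={variance:7.2f} mse={mse:7.2f}")
```

In this setup the anchored fit is biased because the fixed point at 40 does not lie on the true line, but extrapolating from four noisy early visits alone is so variable that the anchored predictions have the smaller MSE.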
Covariate selection using the onset-anchored model
After choosing a “winner” from the three models mentioned above (the onset-anchored model), we wished to determine which clinical features, if any, improved predictive accuracy when used as covariates in the model. Clinical features considered were height, symptom onset time, sex, age, race, individual subquestions of the ALSFRS, forced vital capacity (FVC, both liters and percent predicted of normal), respiratory rate, weight, Riluzole use (yes/no), and site of onset (bulbar/limb). Many lab measurements are included in PRO-ACT, yet due to their sparse nature, only lab features which were present in at least 90% of the subjects were considered. Albumin has been shown to be associated with ALS survival [24] and was included for analysis even though it was only present in 86% of subjects. The following lab features were considered in our analysis: chloride, serum aspartate aminotransferase (AST), glucose, sodium, blood urea nitrogen, potassium, bilirubin, alanine transaminase (ALT), creatinine, and albumin.
Many of these features were repeated measures. To use them as covariates, they were truncated to at most 3 months (for both the training and the testing set) and then collapsed to slope and intercept (baseline) measures. Specifically, we performed a linear regression on the feature with respect to time (truncated at 3 months), and extracted the ordinary least squares estimates for the slope and intercept. While true baseline data would be preferable over the ordinary least squares intercept estimator, baseline data was frequently not available. Therefore the ordinary least squares intercept estimator was chosen for homogeneity. Collapsing longitudinal predictors has been successfully employed in other ALS predictive models [25, 26], and greatly simplifies the modeling process. All features were normalized using their sample means and variances for ease of analysis and interpretability.
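The collapsing and normalization steps described above can be sketched as follows. The function names are hypothetical, and the 91-day cutoff is an assumed stand-in for "3 months"; the exact day count used in the analysis is not given here.

```python
import statistics


def collapse_feature(measurements, cutoff_day=91):
    """Collapse a repeated measure to an OLS (intercept, slope) pair.

    `measurements` is a list of (day_from_baseline, value) tuples. Only
    measurements with 0 <= day <= cutoff_day are used.
    """
    pts = [(d, v) for d, v in measurements if 0 <= d <= cutoff_day]
    if len({d for d, _ in pts}) < 2:
        raise ValueError("need measurements on at least two distinct days")
    n = len(pts)
    mean_d = sum(d for d, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in pts)
    slope = sum((d - mean_d) * (v - mean_v) for d, v in pts) / sxx
    return mean_v - slope * mean_d, slope


def standardize(values):
    """Center and scale values by their sample mean and standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical FVC readings in liters; the day-200 reading is past the cutoff.
fvc = [(0, 3.0), (30, 2.9), (60, 2.8), (200, 1.0)]
intercept, slope = collapse_feature(fvc)
print(intercept, slope)
print(standardize([1.0, 2.0, 3.0]))
```

Each subject then contributes two numbers per feature, the fitted intercept and slope, and these are standardized across subjects before entering the model.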
As we were more interested in predictive power, our criterion for feature selection was improvement to the average MSE resulting from predicting FRS_{365} in 100 replicates of cross-validation using repeated random subsamples (Monte Carlo cross-validation). This method was chosen rather than choosing covariates based on statistical significance as given by a small p-value. Deviance information criterion (DIC) was also considered in assessing whether features improved the model or not.
The specifics of our covariate random subsampling cross-validations are as follows: For the covariate of interest, a single replicate (of 100) first begins by randomly subsetting the overall data into 300 subjects with non-missing entries. While multiple imputation could be used here, we chose to only use complete cases to drastically reduce computation time as well as eliminate potential convergence problems. Of this subset, we randomly chose a validation set and a training set (30 subjects in validation, 270 in training), built onset-anchored models both using and not using the covariate, and compared the average difference in posterior MSE. This is a single replicate, and we repeat this 100 times for each covariate. We then analyzed the average effect of including the covariate over these 100 replicates (for each covariate). Specifically, we considered the two onset-anchored models (for the full model specification, see Additional file 1):
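Covariate onset-anchored model: ALSFRS_{ i }(t_{ i })~T_{3}(b_{0i} + b_{1i}t_{ i }, σ^{2}), with b_{0i}~N(p_{00} + p_{01}X_{ i }, σ_{0}^{2}) and b_{1i}~N(p_{10} + p_{11}X_{ i }, σ_{1}^{2})
Baseline (no covariate) onset-anchored model: ALSFRS_{ i }(t_{ i })~T_{3}(b_{0i} + b_{1i}t_{ i }, σ^{2}), with b_{0i}~N(p_{00}, σ_{0}^{2}) and b_{1i}~N(p_{10}, σ_{1}^{2})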
Where X_{ i } is the subjectspecific covariate, t_{ i } is time for subject i. The slope of subject i is b_{1i} which, in the covariate model, is a function depending on X_{ i }. Similarly, b_{0i} is the subjectspecific intercept. As per hierarchical modelling, we assume priors only for the hyperparameters p_{ jk } (j = 0, 1 and k = 0, 1). As per the linear model, the following weak priors were assumed:
Uninformative priors were assigned for the remaining parameters in both models: σ^{2} and \( {\sigma}_i^2 \) are given Γ^{−1}(0.001,1000); p_{01} and p_{11} are given N(0,100^{2}) (see Additional file 1).
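The repeated random subsampling scheme used for covariate selection can be sketched as a generator of training and validation splits. Only the splitting logic is shown; fitting the two onset-anchored models on each split is outside the scope of this sketch. The function name, the seed argument, and the use of integer subject identifiers are assumptions for illustration.

```python
import random


def monte_carlo_splits(subject_ids, n_replicates=100, subset_size=300,
                       n_validation=30, seed=0):
    """Yield (training_ids, validation_ids) for each replicate.

    Each replicate draws `subset_size` complete-case subjects without
    replacement and holds out `n_validation` of them for validation.
    """
    if subset_size > len(subject_ids):
        raise ValueError("not enough complete-case subjects for one replicate")
    rng = random.Random(seed)
    for _ in range(n_replicates):
        subset = rng.sample(subject_ids, subset_size)  # returned in random order
        yield subset[n_validation:], subset[:n_validation]


# Hypothetical identifiers for subjects with non-missing covariate values.
complete_cases = list(range(1300))
for training, validation in monte_carlo_splits(complete_cases, n_replicates=3):
    print(len(training), len(validation))
```

For each split, the covariate and baseline models would be fitted on the training subjects, and the difference in their posterior MSE on the validation subjects recorded; averaging that difference over the replicates gives the selection criterion.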
Results
We investigated the predictive power of three types of Bayesian hierarchical models: linear, mixture, and onset-anchored. In a Bayesian framework, when cross-validating a model, the resultant MSE has a posterior distribution which takes into account all of the sources of variation within the model. Specifically, these sources of variation include 1) variation within the model; 2) variation of the posterior parameters; and 3) the variation of the posterior predictive distribution. Therefore it is important to not only lower the MSE but to also decrease its variance. Of the three models, the onset-anchored model not only had the smallest MSE but also had the MSE with the smallest variance (Fig. 3). Note that the DIC between the onset-anchored model and the standard linear model cannot be compared, because the additional data point in the onset-anchored model results in a different likelihood.
The MSE for the onset-anchored model is not only smaller in terms of expectation (in Fig. 3 the means of the MSE for the onset-anchored, mixture, and linear models were 51.1, 68.5, and 73.7 respectively) but also has the smallest variance. We also considered a mixture model which utilized the additional data point given by the onset anchor. This complex model performed about as well as the more parsimonious onset-anchored model, which can be seen by their nearly overlapping MSE distributions in Fig. 3.
Since Fig. 3 shows the onset-anchored model to be the winner, we decided to check the robustness of this result by repeated random subsampling cross-validation (25 replications). We specifically compared the posterior mean MSE between the linear and onset-anchored models for randomly selected training/testing splits. On average, the onset-anchored model had a posterior mean MSE that was 20.7 (standard deviation 4.1) less than the linear model (a figure is provided in Additional file 1).
We next attempted to find which covariates or features could consistently improve the MSE of the onset-anchored model, or decrease the DIC. While many clinical and lab predictors had nonzero effects on the posterior slope and intercept (meaning p_{11} and/or p_{01} were nonzero), very few predictors consistently improved the MSE, and among those that did, the improvement to the MSE was very small (Table 2). Some variables, such as FVC: Subject Liters (slope) and FVC: Percent Normal (slope), reduced DIC (each reduced DIC by about 3.45); however, they did not contribute towards a meaningful improvement in predictive power. Of the 53 covariates tested, only “disease onset time” resulted in an improvement to the MSE which was on average greater than 1%. This is most likely due to disease onset time giving a slight bias correction to the model. The next best covariates were subject’s 3-month slope of FVC (in raw liters) and 3-month slope of the first question from the ALSFRS: Q1, Speech.
Recall that several of these clinical values have been found to be associated with survival, including Forced Vital Capacity (FVC), age of onset, and site of onset (bulbar or limb, which can help differentiate subtypes of ALS). However, none of these covariates have been consistently useful for modelling ALSFRS progression [9], and this is consistent with our findings. Riluzole use, in particular, worsened MSE by a median of 0.09% (see Additional file 1 for expanded Table 2). Again, this is not surprising as Riluzole has only a weak effect on survival and has not been shown to be consistently associated with decreased disease progression [10, 27].
To appropriately predict the ALSFRS for a given subject after 1 year from trial onset using data collected up to 3 months after trial onset, a measure of uncertainty must be reported as well. Since a Bayesian analysis was performed, we can obtain 95% credible intervals for each subject’s predicted FRS_{365} (equivalently the posterior predictive interval for \( {\overset{\sim }{FRS}}_{365} \)). Figs. 4 and 5 give a sample of posterior distributions from a cross-validation for nine randomly-selected subjects’ \( {\overset{\sim }{FRS}}_{365} \), as well as their 95% credible intervals and true FRS_{365} (the subject’s first score at, or after, 365 days). To further demonstrate the improved predictive power of the onset-anchored model, this is done for both the standard linear model (Fig. 4) as well as the onset-anchored model (Fig. 5). It can be noted that the credible intervals for the linear model are very wide, encompassing nearly the full range of the disease. As the time of data collection used to make the prediction increases from 3 months, this prediction becomes more accurate.
The performance of the onset-anchored model is vastly superior to that of the linear model when the length of time for data collection is short. Figure 6 shows that the onset-anchored model, using only baseline data, typically outperforms a linear model using many months of subject data. Figure 6 also shows that the onset-anchored model performs well even when the window for data capture is restricted to less than 3 months, including when only a baseline measurement is available for each subject. Finally, it also shows that as more data are used to build the prediction, the benefit of including an anchor decreases. These models do not include any longitudinal covariates that were tested previously, so we are not relying upon measurements that do not exist yet.
Recall that, while MSE of prediction is drastically reduced when using the onset-anchored model, it is in fact a biased model. The additional data point causes the model to typically underestimate the rate of disease progression, resulting in a higher predicted FRS_{365} than observed. Using the onset-anchored model resulted in a prediction bias of, on average, about 2 (on the ALSFRS scale). For comparison, the linear model was typically unbiased.
Finally, one way to measure progression of ALS is by the slope of the ALSFRS. An advantage to using the Bayesian hierarchical framework is that the ALSFRS slope for subject i, defined previously as b_{1i}, is specified in the model likelihood and therefore has a posterior distribution. Thus, one can then obtain a posterior estimate and credible interval for subject i’s slope from this distribution. In other words, when using this model one can easily predict slope for a given subject in addition to FRS_{365}. Examples of the posterior predictive distributions for the ALSFRS slope using the onset-anchored model and 3 months of data, with 90% credible intervals, are provided in Fig. 7 for the same nine subjects used in Figs. 4 and 5. Also included in Fig. 7 are the posterior predictive slopes from the linear model. It should be noted that, on average, the MSE of predicting slope is smaller when using the onset-anchored model versus the linear model (using 3 months of data to predict slope at 1 year). As the onset-anchored model performs well even when using only baseline data, subject slopes could be predicted using this model as soon as a baseline ALSFRS score has been established.
Discussion
We explored three different Bayesian hierarchical predictive models with the goal of modelling ALS disease progression. These models were linear, mixture, and onset-anchored. The onset-anchored model, which uses an additional data point by assuming the maximum ALSFRS score at time of disease onset (i.e. 40), is the best model in terms of predictive accuracy via cross-validation. This is especially noticeable when the window for data capture is very small, such as only using a baseline ALSFRS score.
While linear over the course of a typical clinical trial, progression of the ALSFRS could become curvilinear over long periods of time. This is further reinforced by the fact that it is bounded between 0 and 40, and is typically non-increasing. Predictive models that attempt to account for this nonlinear progression suffer from a disparity between the number of subject-specific data points and the necessary number of model parameters. We hypothesize that using the onset anchor helps to “balance” this prediction (see Fig. 2), while also enabling shrinkage on the slope estimator. The result is a model that has reduced variability of parameter estimates (at the cost of a small increase in bias), which enables a large reduction in overall prediction MSE.
Using 3 months of subjects’ data, we found that very few clinical features improved prediction as measured by the MSE of repeated cross-validation analysis. Among those features that did consistently improve the MSE, the improvement was rarely more than a 1% reduction. This corroborates findings by Creemers et al. [9] who found the quality of evidence among disease progression prognostic factors to be low at best. The covariate which offered the largest and most consistent improvement to the model’s prediction was disease onset time. As disease onset time is also a key part of the onset-anchored model, this stresses its importance, supporting other studies which have shown that onset time is strongly associated with disease progression as well as survival [5, 6, 25].
We found that the onset-anchored model performs well when predicting ALSFRS scores after 1 year even when less than 3 months of data is available. This is still true when the prediction is to be made for 6 months rather than 1 year; thus ALS trials which have endpoints earlier than 1 year can still benefit from the onset-anchored model. The predictive performance for 6-month prediction was proportional to that given in Fig. 6; when less than 2 months of data is available the onset-anchored model outperforms the linear model.
From a practical point of view, a model which only requires time of disease onset and at most 3 months of progression data eases both patient and clinician burden by requiring fewer overall measurements. The Bayesian modeling approach proposed here can help inform the design of adaptive studies, and be used as an imputation scheme to conduct trials more quickly [28,29,30]. Finally, patients with ALS are routinely interested in charting their own progression, as well as trying interventions which might include treatments for spasticity or pain, or supplements geared towards slowing disease progression. In conjunction with a self-administered ALSFRS, the onset-anchored model then becomes a predictive tool that an ALS patient can use to aid them in tracking their disease and assessing the utility of self-administered interventions.
While the idea of using an additional data point as used by the onset-anchored model is simple, it is surprisingly novel. Assuming minimal disease progression at disease onset time (utilizing the upper bound of the ALSFRS) is a sort of intelligent imputation, but differs from traditional imputation in that we are not filling in missing data “gaps”. This is because none of the patients actually have an observed ALSFRS score at disease onset time.
Creating biased models to improve predictive MSE is not uncommon, and is used in ideas like fixed-point regression or ridge regression. However, using an artificially created data point and treating it as observed data is something that, to the best of our knowledge, has never been done before. We have found no literature where it is theoretically discussed or practically used. This methodology could be applied to any longitudinal data where the onset time of the process being modelled is known. Other diseases which have bounded rating scales which measure progression, including Parkinson’s disease or Huntington’s disease, might benefit tremendously from predictions that utilize an onset anchor.
One limitation of the current study is that subjects who died before the clinical trial had progressed a full year were not candidates for cross-validation, and hence did not directly contribute to the MSE. However, the Bayesian framework allows these subjects to be included in building the model, where their often increased rates of disease progression contribute to the variability of the model. Specifically, subjects who died prior to 1 year still contributed towards key model variables, including the distributions of rate of progression, effects of covariates, and variability measures throughout the model. Subjects who died prior to 1 year also had, on average, a lower predicted FRS_{365} than subjects who survived past 1 year. This is expected, since faster progression is associated with shorter survival.
Another limitation is the width of the posterior predictive distributions of individual subjects’ FRS_{365}. These distributions express a combination of variation within the model, variation of the posterior parameters, and variation of the posterior predictive distribution. Due to the heterogeneity of ALS, it is not unexpected that FRS_{365} can range widely at the individual-patient level. This will remain a limitation of any predictive model until factors that are more strongly associated with disease progression (rather than survival) are discovered.
The onset-anchored model’s inherent bias is another limitation. This is the typical concern with any biased linear model, but in this case we can see that the reduction in the onset-anchored model’s MSE is worth the tradeoff. Problems associated with the bias include the interpretation of the 95% posterior predictive intervals of FRS_{365} (which are correct only 73% of the time), as well as the underestimation of slope parameters. A possible solution might be to investigate a bias-correction term which would utilize disease-onset time as well as the number of days after the start of the trial that is associated with FRS_{365} (such as including an overall error term on disease onset time).
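If “correct 73% of the time” is read as empirical coverage, the check can be sketched as follows. The sketch assumes posterior predictive draws are stored as a subjects-by-draws array. The function name `interval_coverage`, the array layout, and the simulated numbers are illustrative and are not taken from the analysis code used in this study.

```python
import numpy as np


def interval_coverage(pred_draws, observed, level=0.95):
    """Fraction of subjects whose observed score falls inside the central
    posterior predictive interval of the given level.

    pred_draws: array of shape (n_subjects, n_draws), predictive samples.
    observed:   array of shape (n_subjects,), observed scores.
    """
    pred_draws = np.asarray(pred_draws, dtype=float)
    observed = np.asarray(observed, dtype=float)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(pred_draws, tail, axis=1)        # per-subject lower bound
    upper = np.quantile(pred_draws, 1.0 - tail, axis=1)  # per-subject upper bound
    inside = (observed >= lower) & (observed <= upper)
    return inside.mean()


# Simulated illustration: predictive draws centred 3 points too high,
# mimicking a biased model whose intervals under-cover.
rng = np.random.default_rng(0)
truth = rng.normal(25.0, 5.0, size=200)
draws = truth[:, None] + 3.0 + rng.normal(0.0, 2.0, size=(200, 1000))
print(f"Empirical coverage of nominal 95% intervals: {interval_coverage(draws, truth):.2f}")
```

A well-calibrated model would give coverage close to the nominal level, so a markedly lower value points to bias or to intervals that are too narrow.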
One final limitation worth pointing out is that disease onset time, a critical feature of the onset-anchored model, is a problematic variable. This variable typically comes from patient memory, and as a result is subject to recall bias. Proudfoot et al. point out that while this bias exists, patient-recalled onset time is still a useful predictor of disease progression [23], and this is corroborated by our model.
Continuing this last point, we attempted to incorporate the recall bias of disease onset time in the onset-anchored model by including a normally distributed random error term associated with onset time (with vague priors on the parameters). We compared the models with and without this random error (similar to our covariate analysis), and found that including this random error typically improved the MSE by 5.8% (with interquartile range of 4.2%). This improvement is especially impressive when one recalls that the “best” covariate (disease onset time) only improved the model MSE by 1.7%. However, we only observe this improvement when the error term is shared across all subjects; if each subject is given their own individual error term, then the models perform exactly the same. This leads us to believe that this error term is serving as a bias-correction term in the model.
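One way such a shared error term might enter the model is sketched below. The notation is assumed purely for illustration and is not the specification used in this study (see Additional file 1 for the full model descriptions). Here y_{ij} is the j-th ALSFRS score of subject i at trial time t_{ij}, \alpha_i and \beta_i are the subject-level intercept and slope, t_i^{onset} is the reported (negative) onset time relative to baseline, and \delta is a single shift shared by all subjects.

```latex
\begin{align*}
y_{ij} &\sim \mathcal{N}\!\left(\alpha_i + \beta_i\, t_{ij},\ \sigma^2\right), \qquad j = 1, \dots, n_i \\
40 &\sim \mathcal{N}\!\left(\alpha_i + \beta_i \left(t_i^{\text{onset}} + \delta\right),\ \sigma^2\right) \\
\delta &\sim \mathcal{N}\!\left(\mu_\delta,\ \sigma_\delta^2\right)
\end{align*}
```

Under this assumed formulation, the per-subject variant replaces \delta with \delta_i. A shared \delta moves every anchor by the same amount and so can absorb a systematic bias, whereas individual \delta_i are each informed by only one artificial point.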
Conclusions
In this paper we considered the problem of predicting an ALS patient’s ALSFRS score at 1 year, given up to 3 months of data. Three different Bayesian hierarchical predictive models were considered: linear, mixture, and onset-anchored. The onset-anchored model, which leverages an additional artificial data point that assumes the maximum ALSFRS score of 40 at the patient’s time of disease onset, is the best model with respect to predictive accuracy under cross-validation. The onset-anchored model is simple to implement, and is potentially applicable to various other diseases which measure progression by bounded rating scales.
The effect of many covariates (lab values, demographic information, etc.) on these predictions was assessed via repeated cross-validation. Time of disease onset was the only covariate which provided a consistent improvement to predictions, and this improvement was very small. This highlights the urgent need to develop a better understanding of the mechanism behind ALS progression.
The onset-anchored model has an added benefit over the other models in that it allows predictions as early as directly after the baseline measure. In other words, as soon as the first ALSFRS measure is taken in a clinical trial, the model can be utilized for endpoint prediction of the ALSFRS. We hope this model can be used by clinicians and statisticians to improve the efficacy of clinical trials and aid in finding treatments for ALS.
Abbreviations
ALS: Amyotrophic lateral sclerosis
ALSFRS: ALS Functional Rating Scale
ALSFRS-R: ALS Functional Rating Scale – Revised
MSE: Mean squared error
PRO-ACT: Pooled Resource Open-Access ALS Clinical Trials Database
References
1. Armon C, Graves MC, Moses D, Forte DK, Sepulveda L, Darby SM, et al. Linear estimates of disease progression predict survival in patients with amyotrophic lateral sclerosis. Muscle Nerve. 2000;23(6):874–82.
2. Magnus T, Beck M, Giess R, Puls I, Naumann M, Toyka KV. Disease progression in amyotrophic lateral sclerosis: predictors of survival. Muscle Nerve. 2002;25(5):709–14.
3. Gordon PH, Cheng B, Salachas F, Pradat PF, Bruneteau G, Corcia P, et al. Progression in ALS is not linear but is curvilinear. J Neurol. 2010;257(10):1713–7.
4. Ikeda K, Hirayama T, Takazawa T, Kawabe K, Iwasaki Y. Relationships between disease progression and serum levels of lipid, urate, creatinine and ferritin in Japanese patients with amyotrophic lateral sclerosis: a cross-sectional study. Intern Med. 2012;51(12):1501–8.
5. Kimura F, Fujimura C, Ishida S, Nakajima H, Furutama D, Uehara H, et al. Progression rate of ALSFRS-R at time of diagnosis predicts survival time in ALS. Neurology. 2006;66(2):265–7.
6. Kollewe K, Mauss U, Krampfl K, Petri S, Dengler R, Mohammadi B. ALSFRS-R score and its ratio: a useful predictor for ALS-progression. J Neurol Sci. 2008;275(1-2):69–73.
7. Pastula DM, Coffman CJ, Allen KD, Oddone EZ, Kasarskis EJ, Lindquist JH, et al. Factors associated with survival in the National Registry of veterans with ALS. Amyotroph Lateral Scler. 2009;10(5-6):332–8.
8. Zach N, Ennist DL, Taylor AA, Alon H, Sherman A, Kueffner R, et al. Being PRO-ACTive: what can a clinical trial database reveal about ALS? Neurotherapeutics. 2015;12(2):417–23.
9. Creemers H, Grupstra H, Nollet F, van den Berg LH, Beelen A. Prognostic factors for the course of functional status of patients with ALS: a systematic review. J Neurol. 2015;262(6):1407–23.
10. Mandrioli J, Biguzzi S, Guidi C, Sette E, Terlizzi E, Ravasio A, et al. Heterogeneity in ALSFRS-R decline and survival: a population-based study in Italy. Neurol Sci. 2015;36(12):2243–52.
11. Watanabe H, Atsuta N, Nakamura R, Hirakawa A, Watanabe H, Ito M, et al. Factors affecting longitudinal functional decline and survival in amyotrophic lateral sclerosis patients. Amyotroph Lateral Scler Frontotemporal Degener. 2015;16(3-4):230–6.
12. Atassi N, Berry J, Shui A, Zach N, Sherman A, Sinani E, et al. The PRO-ACT database: design, initial analyses, and predictive features. Neurology. 2014;83(19):1719–25.
13. Mitsumoto H, Brooks BR, Silani V. Clinical trials in amyotrophic lateral sclerosis: why so many negative trials and how can trials be improved? Lancet Neurol. 2014;13(11):1127–38.
14. Zach N, Kueffner R, Atassi N, Chio A, Cudkowicz M, Hardiman O, et al. The ALS Stratification Prize-Using the Power of Big Data and Crowdsourcing for Catalyzing Breakthroughs in Amyotrophic Lateral Sclerosis (ALS) (P5.102). Neurology. 2016;86(16 Supplement).
15. Gelman A. Bayesian data analysis. 3rd ed. Boca Raton: CRC Press; 2014.
16. R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016.
17. Lunn DJ, Thomas A, Best N, Spiegelhalter D. WinBUGS - a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput. 2000;10(4):325–37.
18. Sturtz S, Ligges U, Gelman A. R2WinBUGS: A Package for Running WinBUGS from R. J Stat Softw. 2005;12(3):16.
19. Castrillo-Viguera C, Grasso DL, Simpson E, Shefner J, Cudkowicz ME. Clinical significance in the change of decline in ALSFRS-R. Amyotroph Lateral Scler. 2010;11(1-2):178–80.
20. Gomeni R, Fava M, Pooled Resource Open-Access ALS Clinical Trials Consortium. Amyotrophic lateral sclerosis disease progression model. Amyotroph Lateral Scler Frontotemporal Degener. 2014;15(1-2):119–29.
21. Morris CN, Lysy M. Shrinkage Estimation in Multilevel Normal Models; 2012. p. 115–34.
22. Stein C. Inadmissibility of the Usual Estimator for the Mean of a Multivariate Normal Distribution. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. Berkeley, Calif.: University of California Press; 1956.
23. Proudfoot M, Jones A, Talbot K, Al-Chalabi A, Turner MR. The ALSFRS as an outcome measure in therapeutic trials and its relationship to symptom onset. Amyotroph Lateral Scler Frontotemporal Degener. 2016;17(5-6):414–25.
24. Chio A, Calvo A, Bovio G, Canosa A, Bertuzzo D, Galmozzi F, et al. Amyotrophic lateral sclerosis outcome measures and the role of albumin and creatinine: a population-based study. JAMA Neurol. 2014;71(9):1134–42.
25. Hothorn T, Jung HH. RandomForest4Life: a random forest for predicting ALS disease progression. Amyotroph Lateral Scler Frontotemporal Degener. 2014;15(5-6):444–52.
26. Kuffner R, Zach N, Norel R, Hawe J, Schoenfeld D, Wang L, et al. Crowdsourced analysis of clinical trial data to predict amyotrophic lateral sclerosis progression. Nat Biotechnol. 2015;33(1):51–7.
27. Shamshiri H, Fatehi F, Davoudi F, Mir E, Pourmirza B, Abolfazli R, et al. Amyotrophic lateral sclerosis progression: Iran-ALS clinical registry, a multicentre study. Amyotroph Lateral Scler Frontotemporal Degener. 2015;16(7-8):506–11.
28. Gajewski BJ, Berry SM, Quintana M, Pasnoor M, Dimachkie M, Herbelin L, et al. Building efficient comparative effectiveness trials through adaptive designs, utility functions, and accrual rate optimization: finding the sweet spot. Stat Med. 2015;34(7):1134–49.
29. Rosenblum M, Luber B, Thompson RE, Hanley D. Group sequential designs with prospectively planned rules for subpopulation enrichment. Stat Med. 2016;35(21):3776–91.
30. Shan G, Wilding GE, Hutson AD, Gerstenberger S. Optimal adaptive two-stage designs for early phase II clinical trials. Stat Med. 2016;35(8):1257–66.
Acknowledgements
We would like to thank all the patients with ALS and their family members who were the impetus for this study.
Funding
Alex Karanevich was supported by the Institute for Neurological Discoveries, the University of Kansas Medical Center Department of Biostatistics, and the Mabel A. Woodyard Fellowship in Neurodegenerative Disorders, a gift to the University of Kansas Endowment from the estate of the late Mabel Woodyard, who succumbed to progressive supranuclear palsy in 2008. Jeffrey Statland’s work on the project was supported by an NCATS grant awarded to the University of Kansas Medical Center for Frontiers: The Heartland Institute for Clinical and Translational Research # KL2TR000119, and by a Clinical Research in ALS and Related Disorders for Therapeutic Development (CReATe) Consortium Research Fellowship.
Availability of data and materials
The datasets generated and/or analyzed during the current study are available in the Pooled Resource Open-Access ALS Clinical Trials database (PRO-ACT) (https://nctu.partners.org/ProACT/) [12].
Contributors
The Pooled Resource Open-Access ALS Clinical Trials Consortium: Data used in the preparation of this article were obtained from the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) Database. As such, the following organizations and individuals within the PRO-ACT Consortium contributed to the design and implementation of the PRO-ACT Database and/or provided data, but did not participate in the analysis of the data or the writing of this report: Neurological Clinical Research Institute, MGH; Northeast ALS Consortium; Novartis; Prize4Life; Regeneron Pharmaceuticals, Inc.; Sanofi; Teva Pharmaceutical Industries, Ltd.
Author information
Contributions
AK cleaned the data, explored various modelling approaches, and compiled all figures and tables. JS offered clinical insight into ALS, helped inform the priors in our models, and contributed several major revisions to the manuscript. BG offered advice on all statistical approaches and modelling methodologies, and contributed revisions to the manuscript. JH verified the validity of the statistical approaches, offered advice on model selection, and contributed revisions to the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Alex G. Karanevich.
Ethics declarations
Ethics approval and consent to participate
In all of the trials that generated the data included in this database, study protocols were approved by the participating medical centers and all participating patients gave informed consent. De-identified data from these trials were donated to the PRO-ACT database for research purposes only and under the explicit conditions that Prize4Life and all users of the data would maintain the anonymity of subjects and not attempt to discover the identity of any subject. In the rare cases where donated data were not already completely anonymized, donated data were further anonymized following the HIPAA de-identification conventions for personal health information: any potential patient initials and/or dates of birth were removed, new randomized subject numbers were created, and wherever possible, trial-specific information was removed in the merging of datasets, including trial center identity and location, trial dates, or other identifying information. This project was deemed not to constitute human subjects research and the need for ethical approval for this work was waived by the University of Kansas Medical Center’s Human Research Protection Program.
Consent for publication
Not applicable
Competing interests
Jeffrey M. Statland is a consultant for aTyr, Acceleron, Fulcrum, Regeneron, and Strongbridge. The remaining authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file
Additional file 1:
Full model descriptions, pseudocode, and full covariate table. (DOCX 18 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Keywords
 ALS
 Hierarchical modelling
 ALSFRS
 Prediction
 PRO-ACT
 Onset-anchor