
Disease progression model anchored around clinical diagnosis in longitudinal cohorts: example of Alzheimer’s disease and related dementia



Alzheimer’s disease and related dementia (ADRD) are characterized by multiple and progressive anatomo-clinical changes including accumulation of abnormal proteins in the brain, brain atrophy and severe cognitive impairment. Understanding the sequence and timing of these changes is of primary importance to gain insight into the disease natural history and ultimately allow earlier diagnosis. Yet, modeling changes over disease course from cohort data is challenging as the usual timescales (time since inclusion, chronological age) are inappropriate and time-to-clinical diagnosis is available on small subsamples of participants with short follow-up durations prior to diagnosis. One solution to circumvent this challenge is to define the disease time as a latent variable.


We developed a multivariate mixed model approach that realigns individual trajectories into the latent disease time to describe disease progression. In contrast with the existing literature, our methodology exploits the clinical diagnosis information as a partially observed and approximate reference to guide the estimation of the latent disease time. The model estimation was carried out in the Bayesian framework using Stan. We applied the methodology to the MEMENTO study, a French multicentric clinic-based cohort of 2186 participants with 5-year intensive follow-up. Repeated measures of 12 ADRD markers derived from cerebrospinal fluid (CSF), brain imaging and cognitive tests were analyzed.


The estimated latent disease time spanned over twenty years before the clinical diagnosis. Considering the profile of a woman aged 70 with a high level of education who is an APOE4 carrier (APOE4 being the main genetic risk factor for ADRD), CSF markers of tau protein accumulation preceded markers of brain atrophy by 5 years and cognitive decline by 10 years. However, we observed that individual characteristics could substantially modify the sequence and timing of these changes, in particular for CSF level of A\(\beta _{42}\).


By leveraging the available clinical diagnosis timing information, our disease progression model does more than realign trajectories in the most homogeneous way: it accounts for the inherent residual inter-individual variability in dementia progression to describe the long-term anatomo-clinical degradations according to the years preceding clinical diagnosis, and provides clinically meaningful information on the sequence of events.

Trial registration, NCT01926249. Registered on 16 August 2013.



Alzheimer’s disease and related dementia (ADRD) are characterized by progressive changes in multiple anatomo-clinical domains including decline in one or several cognitive functions (such as memory, language and executive function) leading to clinical dementia, functional dependency and death [1]. Alzheimer’s disease (AD) neuropathology is identified by the abnormal accumulation of proteins that form amyloid plaques and tau neurofilaments in the brain [2]. It is well established that brain vascular pathology contributes to cognitive impairment and dementia [3]. In particular, small vessel disease (causing white matter lesions and silent brain infarcts) could double the risk of clinical dementia [4, 5]. Progressive atrophy of some brain regions, more specifically the hippocampus [6] or the medial temporal lobe [7], due to neuronal death, has also been shown to contribute to higher dementia risk. A decade ago, a hypothetical model of disease progression was proposed [8] to temporally order these progressive changes. It postulates that amyloid and tau are involved in cellular mechanisms of protein deposition that would later induce neuronal dysfunction and atrophy of brain structures. Decline in cognitive functions then appears as a result of the loss of neural tissues. Several studies found evidence supporting the initiating role of the amyloid protein in these pathological changes [9,10,11], but these results were obtained among study participants at distinct clinical stages and no study provided a clear understanding of the anatomo-clinical changes over the entire natural history of the disease. In addition, the initial model completely ignored the vascular contribution to cognitive impairment [8].

Modeling disease progression from cohort studies is statistically straightforward in many diseases using the mixed model theory for instance [12, 13], but it faces a fundamental statistical challenge in ADRD. Indeed, since neuropathological changes likely occur 15 to 25 years before any clinical diagnosis can be reached [10, 14], confirming the sequence and timing of the associated neuropathological changes would require a follow-up of more than 20 years before diagnosis of clinical dementia, which is only possible in population-based cohorts recruiting persons before middle age. Yet markers of neuropathological changes are mainly collected in clinical cohorts in which repeated measures of the most recent brain magnetic resonance imaging (MRI), brain positron emission tomography scanner (PET scan) and cerebrospinal fluid (CSF) derived biomarkers of ADRD before clinical diagnosis can be set up. This is the case with the MEMENTO cohort, a French nationwide clinic-based study with 2323 participants followed up for 5 years, that gathers clinical examination, amyloid and tau biomarkers from cerebrospinal fluid, multiple brain images from MRI and PET scans (amyloid and glucose), and a neuropsychological test battery. As illustrated in Fig. 1 (C) from the MEMENTO cohort data, modeling trajectories of the different markers according to the time to diagnosis in clinical ADRD cohorts usually both limits the analysis to the ultimate stages of the disease and reduces the sample size since most participants are not followed up for more than 5 years (e.g. [15, 16]). Alternative timescales, such as time since inclusion or chronological age (see individual marker trajectories in Fig. 1 (A) and (B), respectively), do not solve this temporal challenge for describing disease progression. Indeed, time since inclusion does not have any biological meaning, covers only a short period and is very heterogeneous because participants are included at different clinical stages. Although much more relevant in research on ADRD and age-related disorders, chronological age still induces too much inter-individual heterogeneity as people do not age similarly and ADRD onset may arise at various ages.

Fig. 1

Individual trajectories of 12 markers in the MEMENTO Cohort, France, 2011-2019 (2186 participants, 286 incident cases of dementia) according to 4 different timescales: (A) time in the study (or follow-up time), (B) age, (C) time to diagnosis (available only for the incident cases) and (D) the latent disease time estimated using the disease progression anchored model

In the absence of a relevant completely-observed timescale, latent disease progression models have been developed with the aim of directly retrieving the unobserved disease time from the data. These data-driven methods usually consist of re-aligning the participants' trajectories according to the unobserved disease time by assuming that participants experience overall the same disease progression. After a first methodology proposed by Jedynak et al. [17] to estimate a continuous disease time and describe the long-term progression of the biomarkers, many approaches have been developed and improved. Re-alignment of the individual trajectories into the disease progression time scale is managed using time-warping functions, through either an individual-specific time-shift [18,19,20,21,22,23], or an individual time-shift combined with an individual rate of progression [17, 24,25,26,27], or the definition of an exponential progression score [28]. Initially estimated individual by individual [17], the time-warping functions are now mostly estimated using random effects, thus entering the framework of nonlinear mixed models [19,20,21,22,23, 27]. Beyond the use of time-warping functions, the different models also varied according to the type of information on participants considered (biological samples, brain images and/or neuropsychological evaluations) and according to the specification of the trajectories, with sigmoid or exponential functions applied to Gaussian markers in their natural scales or after percentile transformations combined with linear mixed models.

Despite the rise of disease progression models based on a latent disease time, none of the techniques directly considered the partially-observed information provided by the clinical dementia diagnosis. Yet, when based on a clinical expertise, diagnosis represents a landmark in the natural history of the disease that could help anchor the definition of the latent disease time along the actual ADRD process. We thus propose in this work a latent disease time model that directly incorporates the partially-observed diagnosis information to re-align the individual trajectories along the ADRD disease time. This Disease Progression Anchored Model (DPAM) is an extension of the latent time joint mixed effect model (LTJMM) developed by Li et al. (2017) [19] and estimated in the Bayesian framework using Stan. The methodology was applied to describe the progression of 12 biomarkers including markers of AD neuropathology, small vessel disease, brain atrophy and cognitive functioning in the French clinic-based MEMENTO study.


The MEMENTO cohort

The MEMENTO cohort is a clinic-based study that consecutively recruited 2323 participants between April 2011 and June 2014 within the French national network of university-based memory clinics (Centres de Mémoires de Ressources et de Recherche [CMRR]). Participants were followed up every 6 to 12 months for 5 years. Inclusion criteria required that participants were not demented, had a clinical dementia rating (CDR) \(\le\) 0.5, and performed 1 standard deviation worse than the subject’s own age, sex and educational-level group mean in one or more cognitive functions (from neuropsychological tests performed within 6 months preceding the screening phase). Participants with isolated subjective complaints were also eligible if aged 60 years or older. This study was performed in accordance with the Declaration of Helsinki. All participants provided written informed consent. The MEMENTO cohort protocol was approved by an ethics committee (“Comité de Protection des Personnes Sud-Ouest et Outre Mer III”; approval number 2010-A01394-35) and was registered in ClinicalTrials.gov (Identifier: NCT01926249).

The study protocol is described in detail in [16]. Every year, participants underwent a clinical evaluation that included an extensive battery of neuropsychological tests. Suspected cases of dementia during follow-up were reviewed by an independent expert committee and final clinical dementia diagnoses were established. At inclusion and at 24 months follow-up, all the patients were invited to undergo cerebral MRI, an 18F-fluorodeoxyglucose PET (FDG-PET) brain scan, and to have a lumbar puncture. The analytical sample consisted of the 2186 participants with at least one measure for one biomarker during the follow-up and without missing information on the risk factors of interest in this work, i.e. age, sex, education years and APOE status (apolipoprotein E gene).

Markers of ADRD

Repeated measures of 12 markers of AD neuropathology, or small vessel disease, or brain atrophy or cognitive functioning were analyzed.

Biomarkers of AD neuropathology

The three markers of AD neuropathology were the amyloid-\(\beta\)42 peptide (A\(\beta\)42), total tau (t-tau), and phosphorylated tau (p-tau181) measured from CSF using the standardized commercially available INNOTEST sandwich enzyme-linked immunosorbent assay (Fujirebio, Ghent, Belgium).

Biomarkers of brain atrophy

The markers of brain atrophy were cortical thickness in three regions associated with ADRD progression (middle temporal, entorhinal, fusiform) [29] and hippocampal volume, measured from T1-weighted MRI with FreeSurfer [30] and SACHA [31], respectively. Hippocampal volume was expressed relative to the total intracranial volume. Glucose metabolism, a marker of neuronal loss, was measured by the mean FDG-PET uptake in AD-specific regions expressed as standard uptake value ratios (SUVr) [32, 33].

Biomarker of small vessel disease

We used white matter lesion volume as a marker of small vessel disease. White matter hyperintensities (WMH) volume was assessed from MRI 2D-T2 FLAIR sequences using an automated and validated method [34].

Cognitive assessment

Assessment results of three cognitive functions commonly impaired with the disease progression [35] were included:

  • episodic memory with the sum score of the 3 free recalls from the free and cued selective reminding test (FCSRT), a French adaptation of the Grober and Buschke test [36].

  • semantic verbal fluency with the number of animals cited in 120 seconds [37].

  • executive functions with the number of correct moves per second on the trail making test A (TMT-A) [38].

Disease progression anchored model

Consider K markers measured repeatedly over time. They are denoted \(Y_{ijk}\) for the value of marker k (\(k = 1, ..., K\)) for subject i (\(i = 1, ..., N\)) at time \(t_{ijk}\), with j the occasion (\(j = 1, ..., n_{ik}\)). Time t is the fully observed timescale: age or time since entry in the study in our case. The DPAM is defined in three steps: (i) define the latent disease time from the observed timescale, (ii) define a comparable scale for all the markers, (iii) define the multivariate mixed model for the marker trajectories according to the latent disease time. We describe each step below.

Latent disease time definition

A latent disease time can be generically defined as an individual-specific monotonic function \(s_i(t)\) of the observed time t. In our framework, the latent disease time corresponds to the actual time since clinical diagnosis had it been made in continuous time. By denoting \(T^*_i\) the actual unobserved time of clinical dementia, the latent disease time is defined as:

$$\begin{aligned} s_i(t) = t - T^*_i \end{aligned}$$

This definition assumes there is no distortion of time between t and s. The time in the disease s is a shift of the observed time t so that it is anchored to the actual time of clinical dementia: \(s_i(T^*_i) = 0\). We assume that the actual time of clinical dementia is a latent variable with generic distribution \(\mathcal {D}\). In the main analysis, we considered for instance a lognormal distribution: \(\text {ln}(T^*_i) \sim {\mathcal {N}}(\mu _T ,\sigma _T ^{2})\), which handles the positivity of \(T^*_i\) and allows for a potential long tail in clinical dementia timings [39]. Without additional properties, the definition of this latent time shift is very standard and unrelated to the prior knowledge we may have about the time to clinical dementia.
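As a minimal illustration of this time shift, the Python sketch below draws hypothetical dementia times from a lognormal distribution and converts observed visit times into latent disease times. The values of `mu_T`, `sigma_T` and the visit times are illustrative assumptions, not estimates from the model.

```python
import numpy as np

def latent_disease_time(t, t_star):
    """Shift the observed time t by the latent dementia time T*: s = t - T*."""
    return np.asarray(t, dtype=float) - t_star

rng = np.random.default_rng(0)
mu_T, sigma_T = 2.0, 0.5                      # hypothetical lognormal parameters
t_star = rng.lognormal(mean=mu_T, sigma=sigma_T, size=3)   # one T*_i > 0 per participant

visits = np.array([0.0, 1.0, 2.0])            # hypothetical visit times (years since entry)
# Rows are participants, columns are visits; negative values are years before dementia.
s = latent_disease_time(visits[None, :], t_star[:, None])
```

By construction, the latent disease time is zero when the observed time equals \(T^*_i\), and the first column of `s` is simply \(-T^*_i\).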

In cohorts that focus on ADRD risk factors and natural history, only a part of the participants is diagnosed with clinical dementia during the follow-up, either because some participants dropped out the study or died free of clinical dementia diagnosis or were free of clinical dementia diagnosis by the end of the planned follow-up. However, whether a participant was diagnosed with clinical dementia or not, valuable information can be leveraged to anchor the latent time \(T^* _ i\) to the actual time of clinical dementia.

Prior knowledge on the diagnosis

The clinical stage of the participants at the time of diagnosis is relatively homogeneous since clinical dementia is diagnosed by an independent committee of experts in the context of clinic based cohorts with intensive follow-up. Thus, time of clinical dementia diagnosis provides a reliable anchor time.

Let \(D_i\) be the indicator that the participant had a confirmed diagnosis of clinical dementia during follow-up. For participants diagnosed with clinical dementia (\(D_i = 1\)) the observed diagnosis time \(T^{\text {diag}}_i\) is likely in the neighbourhood of the actual unobserved time \(T^* _ i\). For participants who dropped out free of clinical dementia (\(D_i=0\)), the time at the last clinical evaluation \(T^{\text {last}}_i\) is likely to be smaller than the actual unobserved disease time \(T^* _ i\). This was translated into the following constraints:

$$\begin{aligned} & T^{*}_{i} > T^{\text {last}}_{i} - \epsilon _{L}\ \text {for}\ D_{i} = 0\\ & T^{\text {diag}}_{i} - \epsilon _{L}< T^{*}_{i} < T^{\text {diag}}_{i} + \epsilon _{U}\ \text {for}\ D_{i} = 1 \end{aligned}$$

where \(\epsilon _L\) and \(\epsilon _U\) are fixed scalars translating the lack of accuracy around the clinical evaluation and the diagnosis of clinical dementia. They have to be determined according to the study protocol and frequency of clinical evaluations.
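The constraints can be read as an admissible interval for \(T^*_i\) for each participant. The sketch below is only an illustration of that reading: the function name `t_star_bounds` and its arguments are hypothetical, and the default tolerances of 1.5 years are the values used in the main analysis of the application.

```python
import math

def t_star_bounds(diagnosed, t_last=None, t_diag=None, eps_l=1.5, eps_u=1.5):
    """Return the (lower, upper) bounds on T* implied by the anchoring constraints.

    diagnosed : True if D_i = 1 (dementia diagnosed during follow-up)
    t_last    : time of last clinical evaluation (used when D_i = 0)
    t_diag    : observed diagnosis time (used when D_i = 1)
    """
    if diagnosed:
        # T_diag - eps_L < T* < T_diag + eps_U
        return (t_diag - eps_l, t_diag + eps_u)
    # T* > T_last - eps_L, with no upper bound
    return (t_last - eps_l, math.inf)

bounds_case = t_star_bounds(True, t_diag=3.0)       # diagnosed 3 years after entry
bounds_censored = t_star_bounds(False, t_last=5.0)  # dementia-free at last visit, year 5
```

For a participant diagnosed 3 years after entry the interval is (1.5, 4.5) years; for a participant still free of dementia at year 5 it is (3.5, +∞).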

Severity scale and comparison of markers

Independently from timescale definition, describing and comparing the sequence and speed of degradation across markers induces an additional challenge. Each marker has its own scale, and some, such as psychometric tests, are not necessarily Gaussian. Following previous works [18,19,20], we used a 2-step data-driven approach to transform the raw markers data \(Y_{ijk}\) into:

  1. percentiles \(P_{ijk}=F_{Y_k}(Y_{ijk})\) (\(P_{ijk} \in [0,1]\)) using the empirical cumulative distribution function \(F_{Y_k}\) to define a common severity scale from 0 (minimum value) to 1 (maximum value) on which the sequence of markers’ impairments could be compared. Note that markers were flipped when necessary, so that higher values systematically indicated higher impairment (0 = best condition observed and 1 = worst condition observed).

  2. normalized values \(\widetilde{Y}_{ijk}= \Phi ^{-1}(P_{ijk})\) using the inverse of the Gaussian cumulative distribution function \(\Phi ^{-1}\) to apply multivariate linear mixed models for normal dependent variables \(\widetilde{Y}\).

Ideally, the severity scale should translate equi-distributed levels of impairments with 0.5 corresponding to a medium impairment. Yet, percentiles obtained with the empirical cumulative distribution function \(F_{Y_k}\) are sample-dependent. In the case of the MEMENTO cohort for instance, most patients remain at a very early clinical stage so that 0.5 severity would still correspond to an early stage, and a medium impairment would likely be at the highest percentiles of the distribution. The severity scale with equi-distributed levels of impairments was retrieved by reweighting marker measures according to the participant's clinical stage at entry. The function \(F_{Y_k}\) was replaced by the weighted cumulative distribution function (implemented in the Hmisc R package [40]) with each weight computed from the inverse proportion of observations in the clinical stage. Predictions were computed in the percentile scale using the back-transformation \(P_{ijk} =\Phi (\widetilde{Y}_{ijk})\).
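The two-step transformation with a weighted distribution function can be sketched as follows in Python. This is not the Hmisc implementation: the function names, the toy data and the clipping of percentiles away from 0 and 1 (needed to keep \(\Phi ^{-1}\) finite) are assumptions made for illustration, and the marker is assumed to be already oriented so that higher values mean higher impairment.

```python
import numpy as np
from statistics import NormalDist

def weighted_ecdf(values, weights):
    """Weighted empirical CDF evaluated at each observation:
    F(y) = sum of weights of observations <= y, divided by the total weight."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    return np.array([weights[values <= v].sum() for v in values]) / total

def to_normal_scale(p, eps=1e-3):
    """Probit transform of percentiles; p is clipped to [eps, 1 - eps]
    (an assumption of this sketch) so that the result stays finite."""
    nd = NormalDist()
    return np.array([nd.inv_cdf(float(q)) for q in np.clip(p, eps, 1 - eps)])

y = [3.0, 1.0, 2.0, 2.0]          # toy marker values
w = [0.5, 1.5, 1.0, 1.0]          # toy observation weights
p = weighted_ecdf(y, w)           # severity scale in [0, 1]
z = to_normal_scale(p)            # normalized scale
```

Tied observations receive the same percentile, and the back-transformation to the severity scale is simply \(\Phi\) applied to the normalized value.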

Multivariate linear mixed effects model

We described the marker trajectories in the normalized scale \(\widetilde{Y}_{ijk}\) according to the latent disease time \(s_i(t_{ijk})\) using the following multivariate linear mixed model:

$$\begin{aligned} \widetilde{Y}_{ijk} = \varvec{F}(s_i(t_{ijk}))^\top \varvec{\beta }_k + \varvec{X}_{i}(t_{ijk}) \varvec{\gamma }_k + \varvec{F}(s_i(t_{ijk}))^\top \varvec{u_{ik}} + \varepsilon _{ijk} \end{aligned}$$

where \(\varvec{F}\) is a basis of time functions defining the shape of the trajectory according to the disease time. Associated with \(\varvec{\beta }_{k}\), it gives the mean trajectory of normalized marker k (for the reference profile of covariates). \(\varvec{X}_{i}(t_{ijk})\) are adjustment covariates associated with fixed effects \(\varvec{\gamma }_k\) and \(\varepsilon _{ijk}\) are the independent Gaussian error of measurement with marker-specific variance \(\sigma _{\varepsilon _k}^2\). Finally \(\varvec{u}_{ik}\) are the individual-and-marker-specific random effects defining the individual departure from the marker-mean trajectory. We assumed \(\varvec{u}_{ik} \sim \mathcal {N}(0,\varvec{B}_k)\) with \(\varvec{B}_k\) an unstructured variance covariance matrix. Random effects and errors are assumed independent. In addition, we assumed that the markers-specific random deviations were independent across markers so that the latent time-shift captured the inter-markers correlation.

DPAM specification for the MEMENTO cohort

Four clinical stages were defined in the MEMENTO cohort from the CDR-SB (CDR sum of the boxes) score at entry in the study. Each individual weight was then computed as a quarter of the inverse proportion of the clinical stage, thus ensuring that the sum of the weights equals N: CDR-SB = 0, N = 784, \(w_i\) = 0.697; CDR-SB = 0.5, N = 794, \(w_i\) = 0.688; CDR-SB = 1, N = 323, \(w_i\) = 1.692; and CDR-SB > 1, N = 285, \(w_i\) = 1.918.
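The weights can be reproduced from the stage counts alone. The short Python check below recomputes them as \(N / (4 \times N_{\text {stage}})\); the variable names are illustrative.

```python
# Number of participants per clinical stage at entry (from the text above).
counts = {"CDR-SB = 0": 784, "CDR-SB = 0.5": 794, "CDR-SB = 1": 323, "CDR-SB > 1": 285}

n_total = sum(counts.values())    # total sample size N
n_stages = len(counts)            # four clinical stages

# Weight = a quarter of the inverse proportion of the stage = N / (4 * N_stage).
weights = {stage: n_total / (n_stages * n) for stage, n in counts.items()}

# Each stage contributes N / 4 in total weight, so the weights sum to N.
total_weight = sum(weights[stage] * counts[stage] for stage in counts)

for stage, w in weights.items():
    print(f"{stage}: w = {w:.3f}")
```

Each stage therefore carries the same total weight, which is what makes the levels of the severity scale equi-distributed across clinical stages.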

We considered in the application a linear marker-specific trajectory (\(\varvec{F}(s) = (1, s)^\top\)) and constrained \(\varvec{\beta }_k \ge 0\) to impose a mean degradation over time for all the biomarkers. We also included a random slope only for the neuropsychological tests. This was to prevent any numerical non-identifiability issues for MRI and CSF markers, for which a maximum of two measures was collected. In addition, we considered as adjustment covariates: age, sex, years of education and APOE4 status. Finally, we added an indicator of first visit for the neuropsychological tests to correct for the first-practice effect [41].
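To make the linear specification explicit, the model can be expanded as follows for a marker with a random intercept and a random slope. The component names \(\beta _{0k}, \beta _{1k}\) (elements of \(\varvec{\beta }_k\)) and \(u_{0ik}, u_{1ik}\) (elements of \(\varvec{u}_{ik}\)) are introduced here for illustration only:

$$\begin{aligned} \widetilde{Y}_{ijk} = \beta _{0k} + \beta _{1k}\, s_i(t_{ijk}) + \varvec{X}_{i}(t_{ijk}) \varvec{\gamma }_k + u_{0ik} + u_{1ik}\, s_i(t_{ijk}) + \varepsilon _{ijk} \end{aligned}$$

For the markers without a random slope (here the MRI, FDG-PET and CSF markers), the term \(u_{1ik}\, s_i(t_{ijk})\) is dropped.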

Given clinical dementia diagnoses were performed every 6 months in the cohort, we set the lack of accuracy around clinical evaluation to \(\epsilon _L = \epsilon _U = 1.5\) years in the main analysis.

Estimation procedure

The estimation of our disease progression model was done in the Bayesian framework using Hamiltonian Monte Carlo No-U-turn sampling algorithm (HMC-NUTS) [42] to approximate the posterior distribution of the parameters with Markov Chain Monte Carlo (MCMC). We used Stan software (version 2.20.0) [43, 44] through the CmdStan interface with parallel computations on both the chains and the individuals. A commented version of our program, freely adapted from LTJMM [19], is available at

Prior distributions

We considered standard weakly-informative priors for the multivariate mixed model parameters in equation (2) with for all \(k=1,...,K\): each element of \(\varvec{\beta }_k\) and \(\varvec{\gamma }_k\) following \(\mathcal {N}(0,10^2)\) (with \(\varvec{\beta }_k\) imposed to be positive), and \(\sigma _{\varepsilon _k}\) and the variances of the random-effects \(\varvec{u}_{ik}\) following \(\text {half-Cauchy}(0,2.5)\). For the latent disease time, we assumed the following distribution to incorporate the \(\epsilon _L\) constraint and allow for negative \(T^*_i\): \(\text {ln}(T^*_i + \epsilon _L) \sim \mathcal {N}(\mu _{T\epsilon },\sigma _{T\epsilon })\) with \(\mu _{T\epsilon } \sim \mathcal {N}(10,10^2)\), and \(\sigma _{T\epsilon } \sim \text {half-Cauchy}(0,2.5)\).

Posterior summaries

We ran 4 chains with 6000 burn-in iterations and 2000 sampling iterations, and we retained 1 iteration out of every 4 to limit auto-correlation between consecutive samples. Thus we approximated the posterior distribution with D=2000 iterations (500 per chain) and reported posterior means and 95% credible intervals (95%CI) of the parameters.

Diagnostic checks

Diagnostic tools of Stan were used to evaluate the estimation procedure: convergence of the MCMC was assessed with the Gelman and Rubin [45] potential scale reduction statistic \(\hat{R}\), which compares variances between and within chains, and with the effective sample size (ESS) ratio [46], where the ESS estimates the sample size without any auto-correlation. These indicators were considered as satisfied if \(\hat{R} <1.05\) and \(ESS/D \ge 0.1\) for all parameters of the model.
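For intuition about what \(\hat{R}\) measures, the sketch below computes the classic Gelman and Rubin statistic from a matrix of draws in Python. It is a simplified illustration on simulated draws, not the routine used by Stan, whose diagnostic is a refinement of this quantity and may give slightly different values.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic potential scale reduction factor.

    chains : array of shape (n_chains, n_draws) for a single parameter.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()         # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)      # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n      # pooled variance estimate
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 500))                  # 4 chains exploring the same distribution
stuck = mixed + 3.0 * np.arange(4)[:, None]        # 4 chains centred at different values

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
```

Chains that explore the same distribution give a value close to 1, whereas chains stuck in different regions inflate the between-chain variance and push \(\hat{R}\) well above the 1.05 threshold.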

Sensitivity analyses

We assessed the influence of our definition of the latent disease time and the associated constraints in sensitivity analyses. Specifically, we compared our DPAM, in which the actual dementia time follows a lognormal distribution and prior information on the observed diagnosis times is used, with the non-anchored disease progression model specification in which \(T_i^*\) followed a Gaussian distribution without any constraint. We also evaluated the stability of the results when considering weaker constraints (\(\epsilon _L =\epsilon _U=3\) years) to guide the estimation of the disease time. The comparison was based on the residual root mean squared error (RMSE).
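For reference, the RMSE criterion is the square root of the mean squared difference between observed and predicted values. The Python helper below is a generic sketch; the scale on which residuals are computed and the toy numbers are assumptions of the example.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

obs = [0.20, 0.50, 0.90]      # toy observed values
pred = [0.25, 0.45, 0.80]     # toy model predictions
value = rmse(obs, pred)
```

A lower value indicates predictions that are closer to the observations on average.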

Predictions of the biomarkers’ mean trajectories in the severity scale

A central output of this methodology is the description of the biomarkers' mean trajectories according to latent disease time s. Let us define \(P_{ik} (s_i(t_{ijk})) = P_{ijk}\) and \(\widetilde{Y}_{ik} (s_i(t_{ijk})) = \widetilde{Y}_{ijk}\). The mean trajectory of biomarker k for a covariate profile \(\varvec{x}\) (independent of time for simplicity) according to latent disease time s in the severity scale is:

$$\begin{aligned} \mathbb {E}(P_{ik} (s) |_{X_i(t)=\varvec{x}}) = \int \Phi (\widetilde{y}) f_{\widetilde{Y}(s)|_{X_i(t)=\varvec{x}}} (\widetilde{y})d\widetilde{y} \end{aligned}$$

where \(f_{\widetilde{Y}(s)|_{X_i(t)=\varvec{x}}}\) is the density function of \(\widetilde{Y}(s)\) given \(X_i(t)=\varvec{x}\). At each iteration d of the MCMC, this integral can be approximated by the Monte Carlo technique as \(\mathbb {E}(P_{ik} (s) |_{X_i(t)=\varvec{x}}) \approx \hat{P}_{ik}(s,\varvec{x}) = \dfrac{1}{M} \sum _{m=1}^M \Phi (\widetilde{y}_m)\) where \(\widetilde{y}_m\) is randomly drawn from \(\mathcal {N} \left( \varvec{F}(s)^\top \varvec{\beta }_k^{(d)} + \varvec{x} \varvec{\gamma }_k^{(d)}, \varvec{F}(s)^\top \varvec{B}_k^{(d)}\varvec{F}(s) + \left( \sigma _{\varepsilon _k}^{(d)}\right) ^2 \right)\) and \(^{(d)}\) indicates the value of the parameters at the MCMC iteration d. We considered \(M=1000\).

The mean and its 95%CI over the iterations are retained to describe the mean biomarker trajectory \(\hat{P}_{ik}(s,\varvec{x})\) over latent disease time for covariate profile \(\varvec{x}\), from \(s=-30\) (30 years before dementia) to \(s=5\) (5 years after).
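The Monte Carlo approximation for one MCMC iteration can be sketched in Python as follows. All parameter values are hypothetical and chosen only to make the example reproducible. As a sanity check, the sketch also uses the standard identity \(\mathbb {E}[\Phi (Z)] = \Phi (\mu / \sqrt{1+v})\) for \(Z \sim \mathcal {N}(\mu , v)\), which is not part of the procedure described above.

```python
import numpy as np
from statistics import NormalDist

def mean_severity_mc(mu, var, m=1000, rng=None):
    """Monte Carlo estimate of E[Phi(Y)] for Y ~ N(mu, var), using m draws."""
    rng = np.random.default_rng() if rng is None else rng
    draws = rng.normal(loc=mu, scale=np.sqrt(var), size=m)
    phi = NormalDist().cdf
    return float(np.mean([phi(v) for v in draws]))

def mean_severity_exact(mu, var):
    """Closed form of the same expectation, used here only as a check."""
    return NormalDist().cdf(mu / (1.0 + var) ** 0.5)

# Hypothetical parameter values for one marker at one MCMC iteration.
beta = np.array([0.8, 0.1])                  # intercept and slope on latent time
x_gamma = 0.2                                # covariate contribution x * gamma_k
B = np.array([[0.3, 0.01], [0.01, 0.002]])   # random-effect covariance matrix
sigma_eps = 0.5                              # residual standard deviation

s = -5.0                                     # 5 years before dementia
F = np.array([1.0, s])                       # linear basis F(s) = (1, s)
mu = float(F @ beta + x_gamma)               # mean of the normalized marker
var = float(F @ B @ F + sigma_eps**2)        # marginal variance at time s

p_mc = mean_severity_mc(mu, var, m=1000, rng=np.random.default_rng(42))
p_exact = mean_severity_exact(mu, var)
```

Repeating this computation over a grid of values of s and over the retained MCMC iterations gives the mean trajectory and its uncertainty in the severity scale.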


Participants in the analytical sample (N=2186) were aged 70.9 years (SD=8.7) on average, 61.7% were women, 39.2% had more than 12 years education and 29.9% carried at least 1 allele \(\epsilon\)4 of APOE gene (APOE4, a major genetic risk factor for ADRD). Additional description of the analytical sample at inclusion is reported in Table 1. During the 5-year follow-up, 284 participants developed dementia.

Table 1 Descriptive characteristics of the analysis sample at inclusion and over follow-up, MEMENTO cohort, France, 2011-2019 (N=2186)

A description of the distribution of the markers at baseline is reported in Table 1 and individual trajectories are displayed in Fig. 1 (A,B,C). Almost all participants had at least one cognitive measure at baseline, 2047 participants had volumetric MRI, 1236 had FDG-PET and 342 had CSF biomarkers. The number of repeated measures varied between 0 and 2 for CSF, MRI and FDG-PET, and between 0 and 6 for cognitive measures.

Estimated latent disease time

Individual estimated times of actual clinical dementia \(T_i^*\) were used to display the estimated delay to actual clinical dementia from entry in the cohort \(s_i(0)\) (Fig. 2). For instance, \(s_i(0) = -3\) corresponds to an estimated actual time of clinical dementia of 3 years after entering the cohort.

Fig. 2

Posterior distribution of the estimated individual time to actual dementia at entry in the MEMENTO Cohort (France, 2011-2019, N=2186) according to the last dementia diagnosis status

Figure 1 (D) displays the individual observed markers’ trajectories according to the estimated clinical dementia time. Participants entered the cohort on average 10.3 years before the estimated actual clinical dementia onset, with a range from 0.74 to 30.8 years. Among incident dementia cases, time to dementia onset varied between 0.74 and 6.15 years. These times to dementia were very close to the observed clinical dementia diagnoses, with an inaccuracy ranging from 1.01 years prior to the estimated time to 0.59 years after the estimated time (while the constraints allowed up to +/- 1.5 years).

Covariates association with the biomarker levels

Figure 3 displays the mean and corresponding 95%CI of the covariates associations with each biomarker. All coefficients \(\gamma _k\) are reported in standard deviation (SD) of the considered marker and adjusted for the other covariates. Age was significantly associated with worse levels for all markers.

Fig. 3

Estimated association of age, sex, education, APOE4 status, and first practice effect with each of the 12 biomarkers in the normalized scale, the MEMENTO Cohort, France, 2011-2019 (N=2186)

The association with sex showed a substantially greater severity for men in memory domain (mean difference (MD)=0.33, 95%CI=[0.26, 0.40]), hippocampal volume (MD=0.37, 95%CI=[0.29, 0.44]) and FDG-PET (MD = 0.39, 95%CI = [0.30, 0.48]), and to a lesser extent on amyloid level (CSF A\(\beta _{42}\)) (MD=0.28, 95%CI=[0.12, 0.45]), WMH volume (MD=0.14, 95%CI=[0.05, 0.22]) and cortical thicknesses of fusiform (MD=0.14, 95%CI=[0.06, 0.21]) and middle temporal (MD=0.14, 95%CI = [0.07, 0.22]). Men tended to have higher level of p-tau (MD=-0.18, 95%CI=[-0.36, 0.00]).

High education (>12 years) was related to substantially better scores on cognitive tests: memory (MD=-0.40, 95%CI=[-0.46, -0.34]), language (MD=-0.48, 95%CI=[-0.55, -0.42]) and executive function (MD=-0.29, 95%CI=[-0.36, -0.23]); high education was also slightly associated with lower degradation in FDG-PET (MD=-0.13, 95%CI=[-0.22, -0.05]) and hippocampal volume (MD=-0.11, 95%CI=[-0.19, -0.04]) but it was not related to cortical thicknesses, WMH volume or CSF markers.

APOE4 carriers displayed on average worse results on ADRD biomarkers with larger differences for A\(\beta _{42}\) (MD=0.60, 95%CI=[0.45, 0.74]), p-tau (MD=0.39, 95%CI=[0.25, 0.54]) and t-tau (MD=0.44, 95%CI=[0.29, 0.59]) in CSF.

Trajectories of the markers in the latent disease time

Figure 4 displays the averaged trajectories of the markers’ progression, from 30 years prior to clinical dementia to 5 years after, for a typical participant: a 70-year-old woman with more than 12 years of education who carries the APOE4 allele. For better clarity, 95%CI are not shown.

Fig. 4

Mean trajectories of the 12 biomarkers of progression in the percentile scale according to latent disease time for a 70-year-old woman with more than 12 years of education who is an APOE4 carrier, the MEMENTO Cohort, France, 2011-2019 (N=2186)

Marker severities

Thirty years prior to clinical dementia, all the markers were on average at low levels of severity (below 25%) with the highest levels for total tau and p-tau (23% (95%CI=[0.14,0.32]) and 30% (95%CI=[0.20,0.40]), respectively). In comparison, the average 30% severity level was reached for A\(\beta _{42}\), WMH volume, volumetric neuroimaging (cortical thicknesses, hippocampal volume, FDG-PET), memory and executive functioning about 10 to 12 years later (that is 18-20 years prior to clinical dementia) and about 15 years later for verbal fluency. Total tau and p-tau remained the most impaired markers at all times although they degraded more slowly. WMH volume and cortical thicknesses showed the fastest degradation.

Order of marker changes

Figure 5 summarizes the order in which the markers reach 50% severity with the corresponding uncertainty (95%CI) in the time scale of the disease (years before clinical dementia) for the same typical covariate profile as previously. According to the weighted severity scale, the 50th percentile may give an indication of the entry into moderate severity. This level was first reached by p-tau and total-tau, 15.4 (95%CI=[10.6, 20.3]) and 13.4 (95%CI=[9.5,17.3]) years before clinical dementia, respectively. About 5 years later the moderate severity was reached by A\(\beta\)42 along with WMH volume, and cortical thicknesses of the middle temporal and entorhinal regions: respectively 9.1 (95%CI=[5.8, 12.3]), 9.3 (95%CI=[7.7, 11.0]), and 9.4 (95%CI=[7.8, 11.1]) years before clinical dementia. Then fusiform cortical thickness, hippocampus atrophy and glucose metabolism followed 1.5 to 3 years later, with moderate severity reached 7.9 (95%CI=[6.3,9.5]), 7.8 (95%CI=[5.9,9.5]) and 6.2 (95%CI=[4.2,8.2]) years before clinical dementia, respectively. Finally, cognitive tests reached the moderate severity about 10 years after p-tau in CSF, that is 4.2 (95%CI=[2.5,6.0]), 4.8 (95%CI=[3.0,6.7]) and 4.9 (95%CI=[2.9,6.9]) years before clinical dementia for language, memory and executive function, respectively.
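Under a linear trajectory \(\varvec{F}(s) = (1, s)^\top\), the time at which the mean severity crosses 50% has a simple form for a given value of the parameters (for example, one MCMC iteration). The following is a sketch derived for illustration rather than a formula quoted from the analysis; \(\beta _{0k}\) and \(\beta _{1k} > 0\) denote the intercept and slope in \(\varvec{\beta }_k\), and \(v_k(s)\) the variance of \(\widetilde{Y}_{ik}(s)\). Since \(\mathbb {E}[\Phi (Z)] = \Phi (\mu /\sqrt{1+v})\) for \(Z \sim \mathcal {N}(\mu , v)\):

$$\begin{aligned} \mathbb {E}(P_{ik} (s) |_{X_i(t)=\varvec{x}}) = \Phi \left( \frac{\beta _{0k} + \beta _{1k} s + \varvec{x} \varvec{\gamma }_k}{\sqrt{1 + v_k(s)}} \right) = 0.5 \iff s = - \frac{\beta _{0k} + \varvec{x} \varvec{\gamma }_k}{\beta _{1k}} \end{aligned}$$

The crossing time therefore depends on the covariate profile only through \(\varvec{x} \varvec{\gamma }_k\), and not on the variance terms.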

Fig. 5

Ordering sequence and uncertainty (95%CI) of the biomarkers reaching moderate severity for a 70-year-old woman with more than 12 years of education who is an APOE4 carrier, the MEMENTO Cohort, France, 2011-2019 (N=2186)

Covariate profiles

Because of the differential effect of the covariates on the markers, the degradation sequence differed according to the participants' profiles. To give a better sense of the heterogeneity of the sequence, we display in Fig. 6 the average trajectories (along with 95%CI) of 4 landmark biomarkers (p-tau level, A\(\beta _{42}\) level, hippocampal volume and memory test score) according to years of education and APOE4 status.

Fig. 6

Average progression trajectories of 4 markers (A\(\beta _{42}\), p-tau, hippocampal volume and FCSRT) according to latent disease time, in the percentile scale, for the 4 covariate profiles (education and APOE4), the MEMENTO Cohort, France, 2011-2019 (N=2186)

The anteriority of p-tau degradation was found mainly among the high education groups. Memory impairment progressed years after (among highly educated profiles) or contemporaneously with (among less educated profiles) p-tau level and hippocampal atrophy. A\(\beta _{42}\) level was the most variable marker in the sequence. It reached moderate severity years later than hippocampal atrophy, and even after memory impairment, in the profile of APOE4 non-carriers with low education, whereas for APOE4 carriers A\(\beta _{42}\) degraded at about the same time as hippocampal atrophy.

Sensitivity analyses

The fit to the data was unchanged when considering a larger uncertainty around the observed clinical diagnoses with \(\epsilon _L\)=\(\epsilon _U\)=3 years (RMSE=0.0858 for both models, see Fig. S1 in supplementary materials for RMSE per marker), and the results remained virtually the same. We also compared our DPAM, which assumed a log-normal distribution for the latent disease time and anchored the latent disease time along the clinical dementia diagnosis, with a non-anchored disease progression model (similar to the LTJMM methodology [19]) in which the latent disease time definition was completely data-driven and the latent time shift distribution was assumed to be normal. In this non-anchored model, the latent disease time is centered on the average stage of the analytical sample at entry in the cohort. The non-anchored approach performed very similarly to our DPAM, with RMSE = 0.0825 and RMSE = 0.0858, respectively. The slight gain in RMSE of the non-anchored model was due to a slightly better fit of the neuropsychological data (see Fig. S1 in the supplementary materials for the RMSE separated by marker). In this model, the estimated latent disease times of individuals diagnosed with clinical dementia were very far from the actual time to clinical diagnosis, with a span over 15 years (Fig. S2 in the supplementary materials). Indeed, being completely data-driven, the latent time shift was determined as the one that most homogenized the data, and it was more influenced by the neuropsychological markers than by the MRI and CSF markers, as the former brought much more information with more repeated measures. This underlines the importance of anchoring the model to realign the trajectories in line with patient staging rather than with the inter-marker correlation only.
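To make the fit criterion concrete, the sketch below computes an RMSE per marker and an overall RMSE from pairs of observed and predicted values. It is an illustration only: the function name, the record layout and the toy values are hypothetical, and we assume here that errors are pooled over all observations on a common 0-1 severity scale, which the text does not specify.

```python
import math
from collections import defaultdict

def rmse_by_marker(records):
    """Return (per-marker RMSE, overall RMSE).

    records: iterable of (marker, observed, predicted) tuples, with values
    assumed to lie on a common severity scale between 0 and 1.
    """
    squared_errors = defaultdict(list)
    for marker, observed, predicted in records:
        squared_errors[marker].append((observed - predicted) ** 2)

    per_marker = {
        marker: math.sqrt(sum(errors) / len(errors))
        for marker, errors in squared_errors.items()
    }
    pooled = [e for errors in squared_errors.values() for e in errors]
    overall = math.sqrt(sum(pooled) / len(pooled))
    return per_marker, overall

# Toy records (hypothetical values, not MEMENTO data).
toy_records = [
    ("memory", 0.50, 0.40),
    ("memory", 0.60, 0.70),
    ("p-tau", 0.30, 0.30),
]
per_marker_rmse, overall_rmse = rmse_by_marker(toy_records)
print(per_marker_rmse, overall_rmse)
```

Because the overall value pools all observations, markers with many repeated measures (here the neuropsychological tests) weigh more heavily in it, which is why a per-marker breakdown is informative when comparing models.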


Discussion

We developed a disease progression model to describe the trajectories of markers of the anatomo-clinical dimensions identified in ADRD progression towards clinical dementia. Using the intensive follow-up data of the French MEMENTO Cohort, we identified a large variability in the patients' staging at study entry, with an estimated time to actual clinical dementia spanning over 30 years. The sequence of marker progression varied substantially according to education and APOE4 status. However, we consistently identified p-tau as the first marker showing a pathological progression, years before the onset of structural damage visible on brain imaging. Moreover, white matter hyperintensities, occurring concomitantly with regional brain atrophies, seemed to progress faster than other markers.

Compared to the rich literature on disease progression modelling using latent disease times [17, 19,20,21, 28], our approach goes one step further. As in previous works, we defined the latent disease time as an individual latent time shift shared by the disease markers and estimated it from the data. However, we also leveraged prior information on the clinical diagnoses to guide the estimation of the latent time shift towards an actual clinical dementia diagnosis. As shown in the sensitivity analyses, without this prior information, the latent time shift may over-homogenize the trajectories of the markers. In contrast, anchoring the definition of the latent time shift around the observed clinical diagnosis made it possible to realign marker trajectories around the actual time of clinical dementia (which is an important step in the clinical management of patients) while preserving the inherent heterogeneity in disease progression across individuals. In autosomal-dominant Alzheimer’s disease, Wang et al. [47] also proposed to anchor a uni-dimensional disease progression model onto a pre-determined age of onset, estimated through systematic review and meta-analysis according to the person-specific matching mutation, in the DIAN observational study. Although initially motivated by the challenges of sporadic ADRD, our approach could also be applied to the DIAN study to realign multi-dimensional biomarker trajectories while accounting for the uncertainty surrounding the age of onset previously estimated on external data. Another difference with the disease progression modelling literature is that we assumed a lognormal distribution for the latent time shift rather than the more common normal distribution. Temporal shifts are timings and, as such, they likely have an asymmetric distribution with an expected long tail for potentially very distant clinical dementia timings. In addition, by anchoring the latent disease time around the time of clinical dementia, the latent disease time is positive. We thus followed previous works in parametric time-to-event analyses [39, 48] and assumed a lognormal distribution for the prior distribution of the latent time shift.
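The two properties invoked for the lognormal choice, positivity and a long right tail, can be checked numerically. The following sketch uses hypothetical parameters (a median time to dementia of 10 years and a log-scale standard deviation of 0.8, neither taken from the fitted model) and contrasts the lognormal with a normal distribution matched on mean and standard deviation:

```python
import math
from statistics import NormalDist

# Hypothetical parameters of the log time shift (illustration only).
MU = math.log(10.0)   # median time to dementia of 10 years
SIGMA = 0.8

std_normal = NormalDist()

def lognormal_cdf(t, mu=MU, sigma=SIGMA):
    """P(T <= t) for a lognormal time T; no mass at or below t = 0."""
    if t <= 0:
        return 0.0
    return std_normal.cdf((math.log(t) - mu) / sigma)

# Closed-form moments of the lognormal distribution.
median = math.exp(MU)
mean = math.exp(MU + SIGMA**2 / 2)
sd = math.sqrt((math.exp(SIGMA**2) - 1) * math.exp(2 * MU + SIGMA**2))

# Normal distribution with the same mean and standard deviation, for contrast.
matched_normal = NormalDist(mu=mean, sigma=sd)

print(f"lognormal median {median:.1f} years, mean {mean:.1f} years")
print(f"P(T > 30 years), lognormal: {1 - lognormal_cdf(30):.3f}")
print(f"P(T < 0), lognormal: {lognormal_cdf(0):.3f}")
print(f"P(T < 0), matched normal: {matched_normal.cdf(0):.3f}")
```

The mean exceeding the median reflects the right skew, and the matched normal places a non-negligible probability on negative times, which would be incoherent for a time to diagnosis defined as positive.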

In the MEMENTO cohort, estimates of individual disease times to clinical dementia at study entry extended over decades, a result consistent with previous studies [14]. The sequence and timing of markers along the natural history of the disease were partially consistent with the theoretical model of Jack et al. [8], and we found major differences in the sequence and timing of the markers according to individual characteristics. Our findings supported that CSF p-tau showed increasing severity years before the degradation of glucose metabolism and brain atrophy on neuroimaging. Structural brain changes also preceded worsening of cognitive function. While amyloid deposition is widely considered the initial cause of Alzheimer’s disease [49], the timing of CSF A\(\beta _{42}\) was unclear as moderate severity was reached later than for CSF p-tau, a result consistent with a previous disease progression model [50]. Indeed, the timing of amyloid degradation varied substantially according to APOE4 status, thus contributing to the discussion challenging the central role of amyloid peptide in the natural history of the disease [51]. Our results also reinforce the hypothesis of a small vessel disease contribution to cognitive impairment and dementia [3], as the volume of white matter hyperintensities was the most rapidly deteriorating marker and was contemporaneous with the degradation of cortical thicknesses, years before cognitive impairment.

As with any disease progression model, our approach relies on parametric assumptions. First, we restricted the present application to a linear trajectory in the normalized marker scales, which translated into a sigmoid trajectory in the percentile scale. This was in line with previous disease progression models in Alzheimer’s disease [17, 19] and it was a requirement for the biomarker data as we had at most two repetitions. For the psychometric tests, the linear assumption could have been avoided by considering a higher-order polynomial trajectory on the normalized scale. However, this was not further investigated as the model with the linear assumption already showed good individual fits (Figs. S3-S7 in the supplementary materials). Second, to distinguish the inter-marker correlation due to the disease staging from the intra-marker correlation, we assumed that the latent time shift captured all the correlation shared across markers and considered that marker-specific random effects were independent between markers. This assumption could be relaxed by allowing some correlation between subsets of markers, for instance MRI-derived markers or neuropsychological tests. Third, we accounted for differences across covariate profiles through a global effect on each marker’s severity. Although of interest, considering interactions between individual characteristics and rate of marker degradation would substantially complicate the model and the estimation procedure due to the higher number of additional parameters to estimate. A few progression models considered a covariate effect on the disease time [21, 23] rather than on each marker measure separately. This may be interesting for exploring covariates that may delay the progression towards dementia. However, as found in our application, some covariates may differentially modulate markers’ trajectories and the sequence of markers’ degradation. This was the case for years of education, which showed large differences only in the neuropsychological tests, as an illustration of the concept of cognitive reserve [52]. Finally, in cohorts on Alzheimer’s disease, dropouts and deaths occur and are likely to be linked to the disease progression. In the MEMENTO sample, 786 (36.0%) patients dropped out or died. By using the mixed model theory, our analyses are robust to missing data under the missing at random (MAR) mechanism, which stipulates that the probability of missing data can be fully predicted by the observations. Given the large number of repeated markers we included, the MAR assumption for dropout and death is highly plausible (although not checkable) and our results should not be impacted. Nevertheless, accounting for informative dropout and death would be possible by incorporating the DPAM into a joint modeling approach of the risk of dropout and death [53]. We leave this to future work.
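The first assumption discussed above, a linear trajectory on the normalized scale that becomes a sigmoid on the percentile scale, can be illustrated with a short sketch. We assume here that the normalized scale is the standard normal quantile of the severity percentile, so that severity is the normal cumulative distribution function of a linear predictor; the intercept, the slope and the sign convention for time (negative values are years before clinical dementia) are hypothetical choices for illustration, not estimates from the model.

```python
from statistics import NormalDist

std_normal = NormalDist()

def severity(t, intercept, slope):
    """Severity percentile (0-1) at disease time t for a linear normalized trajectory."""
    return std_normal.cdf(intercept + slope * t)

def time_to_level(level, intercept, slope):
    """Disease time at which the mean trajectory reaches a given severity level."""
    return (std_normal.inv_cdf(level) - intercept) / slope

# Hypothetical marker; t is in years relative to clinical dementia (negative = before).
INTERCEPT, SLOPE = 0.8, 0.1

for t in (-30, -20, -10, 0):
    print(f"t = {t:>4} years: severity = {severity(t, INTERCEPT, SLOPE):.2f}")

t50 = time_to_level(0.5, INTERCEPT, SLOPE)
print(f"50% severity reached {-t50:.1f} years before clinical dementia")
```

Under this assumption, the time at which a marker crosses 50% severity depends only on the ratio of intercept to slope, so a covariate acting on the intercept shifts the whole sigmoid in time without changing its shape.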


Conclusions

Disease progression models allow for the characterization of the complete natural history of a disease when observed time is not relevant, as is the case for ADRD. Applying our approach to the MEMENTO clinic-based cohort, we showed that the shift of individual trajectories into a latent disease time scale extended over 30 years prior to the clinical diagnosis of dementia. This original work brings new insights into the natural history of ADRD biomarkers, both because we used information from the actual diagnosis time of clinical dementia to estimate the latent time underlying the long-term progression of the markers and because we based our work on a large cohort whereas most published work relies on ADNI data. Replicating the analyses on a population-based representative sample would however be highly valuable to assess the generalizability of the findings.

Availability of data and materials

MEMENTO data access request is available via the Dementia Platform UK Data Access application form ( or via the MEMENTO Secretariat ( The Stan script used for the model specification are available at


Abbreviations

A\(\beta _{42}\): amyloid beta 42
ADRD: Alzheimer’s disease and related dementia
APOE4: allele \(\epsilon\)4 of the apolipoprotein E gene
CDR: clinical dementia rating
CDR-SB: clinical dementia rating sum of boxes
CI: confidence interval
CSF: cerebrospinal fluid
DPAM: disease progression anchored model
FCSRT: free and cued selective reminding test
LTJMM: latent time joint mixed model
MD: mean difference
MRI: magnetic resonance imaging
PET: positron emission tomography
p-tau: phosphorylated tau
RMSE: root mean squared error
SUVR: standard uptake value ratio
TMT-A: trail making test A
total tau
WMH: white matter hyperintensities


  1. McKhann GM, Knopman DS, Chertkow H, Hyman BT, Jack CR, Kawas CH, et al. The diagnosis of dementia due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7(3):263–9.

  2. Braak H, Braak E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol. 1991;82(4):239–59.

  3. Zlokovic BV, Gottesman RF, Bernstein KE, Seshadri S, McKee A, Snyder H, et al. Vascular contributions to cognitive impairment and dementia (VCID): A report from the 2018 National Heart, Lung, and Blood Institute and National Institute of Neurological Disorders and Stroke Workshop. Alzheimers Dement. 2020;16:1714–33.

  4. Azarpazhooh MR, Avan A, Cipriano LE, Munoz DG, Sposato LA, Hachinski V. Concomitant vascular and neurodegenerative pathologies double the risk of dementia. Alzheimers Dement. 2018;14(2):148–56.

  5. Bos D, Wolters FJ, Darweesh SKL, Vernooij MW, de Wolf F, Ikram MA, et al. Cerebral small vessel disease and the risk of dementia: A systematic review and meta-analysis of population-based evidence. Alzheimers Dement. 2018;14(11):1482–92.

  6. Schröder J, Pantel J. Neuroimaging of hippocampal atrophy in early recognition of Alzheimer’s disease - a critical appraisal after two decades of research. Psychiatry Res Neuroimaging. 2016;247:71–8.

  7. Berron D, van Westen D, Ossenkoppele R, Strandberg O, Hansson O. Medial temporal lobe connectivity and its associations with cognition in early Alzheimer’s disease. Brain. 2020;143:1233–48.

  8. Jack CR, Knopman DS, Jagust WJ, Shaw LM, Aisen PS, Weiner MW, et al. Hypothetical model of dynamic biomarkers of the Alzheimer’s pathological cascade. Lancet Neurol. 2010;9(1):119–28.

  9. Jack CR, Knopman DS, Jagust WJ, Petersen RC, Weiner MW, Aisen PS, et al. Tracking pathophysiological processes in Alzheimer’s disease: An updated hypothetical model of dynamic biomarkers. Lancet Neurol. 2013;12(2):207–16.

  10. Bateman RJ, Xiong C, Benzinger TLS, Fagan AM, Goate A, Fox NC, et al. Clinical and Biomarker Changes in Dominantly Inherited Alzheimer’s Disease. N Engl J Med. 2012;367(9):795–804.

  11. Villemagne VL, Burnham S, Bourgeat P, Brown B, Ellis KA, Salvado O, et al. Amyloid β deposition, neurodegeneration, and cognitive decline in sporadic Alzheimer’s disease: A prospective cohort study. Lancet Neurol. 2013;12(4):357–67.

  12. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982;38(4):963–74.

  13. Lindstrom MJ, Bates DM. Nonlinear Mixed Effects Models for Repeated Measures Data. Biometrics. 1990;46(3):673–87.

  14. Vermunt L, Sikkes SAM, van den Hout A, Handels R, Bos I, van der Flier WM, et al. Duration of preclinical, prodromal, and dementia stages of Alzheimer’s disease in relation to age, sex, and APOE genotype. Alzheimers Dement. 2019;15(7):888–98.

  15. Petersen RC, Aisen PS, Beckett LA, Donohue MC, Gamst AC, Harvey DJ, et al. Alzheimer’s Disease Neuroimaging Initiative (ADNI) Clinical characterization. Neurology. 2010;74(3):201–9.

  16. Dufouil C, Dubois B, Vellas B, Pasquier F, Blanc F, Hugon J, et al. Cognitive and imaging markers in non-demented subjects attending a memory clinic: Study design and baseline findings of the MEMENTO cohort. Alzheimers Res Ther. 2017;9(1):1–13.

  17. Jedynak BM, Lang A, Liu B, Katz E, Zhang Y, Wyman BT, et al. A computational neurodegenerative disease progression score: Method and results with the Alzheimer’s disease neuroimaging initiative cohort. NeuroImage. 2012;63(3):1478–86.

  18. Donohue MC, Jacqmin-Gadda H, Le Goff M, Thomas RG, Raman R, Gamst AC, et al. Estimating long-term multivariate progression from short-term data. Alzheimers Dement. 2014;10(5):S400–10.

  19. Li D, Iddi S, Thompson WK, Donohue MC. Bayesian latent time joint mixed effect models for multicohort longitudinal data. Stat Methods Med Res. 2017;28(3):835–45.

  20. Lorenzi M, Filippone M, Frisoni GB, Alexander DC, Ourselin S. Probabilistic disease progression modeling to characterize diagnostic uncertainty: Application to staging and prediction in Alzheimer’s disease. NeuroImage. 2017;2017(190):56–68.

  21. Raket LL. Statistical Disease Progression Modeling in Alzheimer Disease. Front Big Data. 2020;3(August):1–18.

  22. Garbarino S, Lorenzi M. Modeling and Inference of Spatio-Temporal Protein Dynamics Across Brain Networks. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2019;11492 LNCS:57–69.

  23. Kühnel L, Berger AK, Markussen B, Raket LL. Simultaneous modeling of Alzheimer’s disease progression via multiple cognitive scales. Stat Med. 2021;40(14):3251–66.

  24. Bilgel M, Prince JL, Wong DF, Resnick SM, Jedynak BM. A multivariate nonlinear mixed effects model for longitudinal image analysis: Application to amyloid imaging. NeuroImage. 2016;134:658–70.

  25. Marinescu RV, Eshaghi A, Lorenzi M, Young AL, Oxtoby NP, Garbarino S, et al. A vertex clustering model for disease progression: Application to cortical thickness images. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2017;10265 LNCS:134–145.

  26. Schiratti JB, Allassonnière S, Colliot O, Durrleman S. A Bayesian mixed-effects model to learn trajectories of changes from repeated manifold-valued observations. J Mach Learn Res. 2017;18:1–33.

  27. Koval I, Bône A, Louis M, Lartigue T, Bottani S, Marcoux A, et al. AD Course Map charts Alzheimer’s disease progression. Sci Rep. 2021;11(1):1–16.

  28. Bilgel M, Jedynak BM. Predicting time to dementia using a quantitative template of disease progression. Alzheimers and Dement Diagn Assess Dis Monit. 2019;11:205–15.

  29. Desikan RS, Cabral HJ, Fischl B, Guttmann CRG, Blacker D, Hyman BT, et al. Temporoparietal MR Imaging Measures of Atrophy in Subjects with Mild Cognitive Impairment That Predict Subsequent Diagnosis of Alzheimer Disease. Am J Neuroradiol. 2009;30:532.

  30. Fischl B, Kouwe AVD, Destrieux C, Halgren E, Ségonne F, Salat DH, et al. Automatically Parcellating the Human Cerebral Cortex. Cereb Cortex. 2004;14:11–22.

  31. Chupin M, Hammers A, Liu RS, Colliot O, Burdett J, Bardinet E, et al. Automatic segmentation of the hippocampus and the amygdala driven by hybrid constraints: method and validation. Neuroimage. 2009;46:749–61.

  32. Buchert R, Wilke F, Chakrabarti B, Martin B, Brenner W, Mester J, et al. Adjusted scaling of FDG positron emission tomography images for statistical evaluation in patients with suspected Alzheimer’s disease. J Neuroimaging. 2005;15(4):348–55.

  33. Habert MO, Marie S, Bertin H, Reynal M, Martini JB, Diallo M, et al. Optimization of brain PET imaging for a multicentre trial: the French CATI experience. EJNMMI Phys. 2016;3:6.

  34. Samaille T, Fillon L, Cuingnet R, Jouvent E, Chabriat H, Dormont D, et al. Contrast-based fully automatic segmentation of white matter hyperintensities: method and validation. PLoS One. 2012;7(11):e48953.

  35. Weintraub S, Wicklund AH, Salmon DP. The neuropsychological profile of Alzheimer disease. Cold Spring Harbor Perspectives in Medicine. 2012;2.

  36. Grober E, Buschke H, Crystal H, Bang S, Dresner R. Screening for dementia by memory testing. Neurology. 1988;38(6):900–3.

  37. Thurstone LL. Psychophysical analysis. Am J Psychol. 1987;100:587.

  38. Tombaugh TN. Trail Making Test A and B: Normative data stratified by age and education. Arch Clin Neuropsychol. 2004;19(2):203–14.

  39. Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. Hoboken: Wiley; 2002.

  40. Harrell Jr FE. Hmisc: Harrell Miscellaneous. 2021. R package version 4.6-0.

  41. Vivot A, Power MC, Glymour MM, Mayeda ER, Benitez A, Spiro A, et al. Jump, Hop, or Skip: Modeling Practice Effects in Studies of Determinants of Cognitive Change in Older Adults. Am J Epidemiol. 2016;183(4):302–14.

  42. Hoffman MD, Gelman A. The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. J Mach Learn Res. 2014;15:1593–623.

  43. Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, et al. Stan: A probabilistic programming language. J Stat Softw. 2017;76(1):1–32.

  44. Stan Development Team. Stan Modeling Language Users Guide and Reference Manual. 2022. Version 2.30.

  45. Gelman A, Rubin DB. Inference from Iterative Simulation Using Multiple Sequences. Stat Sci. 1992;7(4):457–72.

  46. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. 3rd ed. Boca Raton: Chapman & Hall/CRC Texts in Statistical Science; 2013.

  47. Wang G, Berry S, Xiong C, Hassenstab J, Quintana M, McDade EM, et al. A novel cognitive disease progression model for clinical trials in autosomal-dominant Alzheimer’s disease. Stat Med. 2018;37(21):3047–55.

  48. Jacqmin-Gadda H, Commenges D, Dartigues JF. Random Changepoint Model for Joint Modeling of Cognitive Decline and Dementia. Biometrics. 2006;62:254–60.

  49. Jack CR, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. NIA-AA Research Framework: Toward a biological definition of Alzheimer’s disease. Alzheimers Dement. 2018;14(4):535–62.

  50. Li D, Iddi S, Thompson WK, Rafii MS, Aisen PS, Donohue MC. Bayesian latent time joint mixed-effects model of progression in the Alzheimer’s Disease Neuroimaging Initiative. Alzheimers Dement Diagn Assess Dis Monit. 2018;10:657–68.

  51. Frisoni GB, Altomare D, Thal DR, Ribaldi F, van der Kant R, Ossenkoppele R, et al. The probabilistic model of Alzheimer disease: the amyloid hypothesis revised. Nat Rev Neurosci. 2022;23(1):53–66.

  52. Arenaza-Urquijo EM, Vemuri P. Resistance vs resilience to Alzheimer disease. Neurology. 2018;90:695–703.

  53. Saulnier T, Philipps V, Meissner WG, Rascol O, Pavy-Le Traon A, Foubert-Samier A, et al. Joint models for the longitudinal analysis of measurement scales in the presence of informative dropout. Methods (San Diego, Calif). 2022;203:142–51.



The authors thank the members of the MEMENTO study group listed in supplementary Table S1.


The MEMENTO cohort is funded by the Fondation Plan Alzheimer (Alzheimer Plan 2008-2012), and the French Ministry of Research (MESRI, DGRI) through the Plan Maladies Neurodégénératives (2014-2019). This work was also supported by CIC 1401-EC, Bordeaux University Hospital (CHU Bordeaux, sponsor of the cohort), Inserm, and the University of Bordeaux. This work received funding from the French National Research Agency (ANR) as part of the Investment for the Future Programme ANR-18-RHUS-0002. The MEMENTO cohort has received funding support from AVID, GE Healthcare, and FUJIREBIO through private-public partnerships. The Insight-PreAD substudy was promoted by INSERM in collaboration with the Institut du Cerveau et de la Moelle Epinière, Institut Hospitalo-Universitaire, and Pfizer and has received support within the “Investissement d’Avenir” (ANR-10-AIHU-06) program. This work was also supported by the Fondation Vaincre Alzheimer (FR-20022 project ID3M 2021-2023). The funders had no role in study design, in data collection, analysis, and interpretation, or in writing of report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Author information


JL, CD, and CPL conceived and designed the study; CD collected the data; JL and CPL developed the statistical methodology; JL, CD and CPL analyzed the data and wrote the paper. JL had full access to all of the data in the study. All authors take responsibility for the integrity of the data and the accuracy of the data analysis. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Cécile Proust-Lima.

Ethics declarations

Ethics approval and consent to participate

This study was performed in accordance with the guidelines of the Declaration of Helsinki. The MEMENTO study protocol has been approved by the local ethics committee (“Comité de Protection des Personnes Sud-Ouest et Outre Mer III”; approval number 2010-A01394-35). All participants provided written informed consent.

Consent for publication

Not applicable

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Lespinasse, J., Dufouil, C. & Proust-Lima, C. Disease progression model anchored around clinical diagnosis in longitudinal cohorts: example of Alzheimer’s disease and related dementia. BMC Med Res Methodol 23, 199 (2023).
