Modelling of longitudinal data to predict cardiovascular disease risk: a methodological review

Stevens, David; Lane, Deirdre A.; Harrison, Stephanie L.; Lip, Gregory Y. H.; Kolamunnage-Dona, Ruwanthi

doi:10.1186/s12874-021-01472-x

BMC Medical Research Methodology

Table 3 Summary of single-stage models used to incorporate longitudinal data in survival models

Method, N (%) [refs]	Longitudinal outcome type	Disease outcome type	How the longitudinal data were used in the analysis, N (%) [refs]	Reason for the use of method	Assumptions	Pros	Cons
Single-stage approaches (n = 40)
Cox model, N = 25 (62.5) [18, 19, 21, 25, 28, 29, 32, 34, 35, 36, 38, 39, 41, 42, 43, 45, 47, 49, 50, 51, 53, 54, 55, 56, 57]	Continuous, Categorical	Time to event	Baseline only, N = 7 (17.5) [18, 21, 24, 43, 50, 53, 54]: Continuous, N = 6 (15.0) [18, 21, 43, 50, 53, 54]; Categorised, N = 2 (5.0) [24, 54]	Allows clinically relevant time point to be used for prediction	PH	Simple method	Dependence between measurement times is ignored
	Continuous	Time to event	Change from baseline, N = 3 (7.5) [28, 35, 38]	To incorporate change over time	PH; Change is linear	Incorporates more than one time point	Only looks at pairs of time points
	Continuous	Time to event	Slope calculated manually, N = 3 (7.5) [25, 29, 32]	To incorporate constant change in the survival model	PH; Change is linear	Incorporates more than one time point	Only looks at pairs of time points
	Continuous	Time to event	Average (categorized before use),^a N = 1 (2.5) [36]	To incorporate the average change over time	PH; Constant between time points; Change is linear	Incorporates the average impact over time	Interpretation unclear
	Continuous, Categorical	Time to event	Time-dependent covariate, N = 6 (15.0) [39, 42, 45, 47, 49, 51, 55]	To incorporate change in exposure variable over time	PH; Change is constant between two consecutive time points; Longitudinal data are measured without error	Incorporates time-varying measures over the follow-up period	Computationally slower than time-fixed covariates; Computationally infeasible if the longitudinal outcome is measured at different time points for different individuals; Interpretation is difficult; Can lead to substantial overfitting of the data, so must be used with caution
	Continuous	Time to event	Summary measures (standard deviation, number of drops between observations), N = 1 (2.5) [19]	To incorporate variability summaries of the longitudinal data	PH	Incorporates variability of measures into the model	Summary measures fairly specific to dataset
	Continuous, Categorical	Time to event	Change in category between first and last time-point categorized, N = 2 (5.0) [34, 41]; Change in continuous variable between time points categorized with manually defined cut-offs, N = 1 (2.5) [56]	To summarise trajectories in an interpretable way	PH	Results interpretable	Groups manually selected based on data which could lead to bias
Hierarchical Cox model to adjust for multiple studies, N = 1 (2.5) [20]	Continuous	Time to event	Continuous measurements categorized. Multiple time points also categorised as consistent/non-consistent, N = 1 (2.5) [20]	To summarise trajectories in an interpretable way adjusting for combining multiple studies	PH	Results interpretable; Adjusts for use of multiple studies	Groups manually selected based on data which could lead to bias
Logistic Regression, N = 3 (7.5) [30, 31, 48]	Continuous	Binary	Baseline only, N = 1 (2.5) [31]	Allows clinically relevant time point to be used for prediction	Not applicable	Simple method	Dependence between measurement times is ignored
	Categorical	Binary	Separate time points, N = 1 (2.5) [30]	To include all predictive values in model	Not applicable	Simple method	Caution needed for multicollinearity
	Continuous	Binary	Summaries of repeated measures (standard deviation; mean; mean change from baseline; average daily risk range^b; range), N = 1 (2.5) [48]	Includes different measures of variation	Not applicable	Simple method	Interpretation of different summary measures non-trivial
GEE – logit link, N = 2 (5.0) [17, 27]	Continuous	Binary	Non-linear relationships considered through piecewise models or splines, N = 1 (2.5) [17]	To attempt to include a variety of shapes of relationships in the model using data from all time points	Not applicable	Includes all measured values of longitudinal variable with various relationships with risk	Splines harder to interpret; Produces population averages not individual predictions
	Continuous	Binary	Multiple time points, N = 1 (2.5) [27]	To include values and change at all time points	Not applicable	Includes all measured values of longitudinal variable	Produces population averages not individual predictions
GEE – log link, N = 2 (5.0) [22, 37]	Continuous	Rates	Multiple time points, N = 1 (2.5) [37]; Multiple time points categorized as stable, increasing (in the second or third time point), decreasing, unstable, N = 1 (2.5) [22]	To include all time points in predicting rates	Not applicable	Includes all measured values of longitudinal variable	Produces population averages not individual predictions
Poisson regression, N = 2 (5.0) [23, 26]	Continuous	Rates	Baseline only, N = 2 (5.0) [23, 26]	To enable modelling of baseline rate	Not applicable	Enables modelling of baseline rate in a parametric manner	Dependence between measurement times is ignored
Linear Mixed Effects model, N = 4 (10.0) [33, 44, 46, 96]	Continuous, categorical	Continuous	Repeated measures, N = 4 (10.0) [33, 44, 46, 96]	To predict changes over time	Random effects are independent of covariates	Includes all measured values of longitudinal variable	None
Fixed effects linear regression, N = 1 (2.5) [52]	Continuous, categorical	Continuous	The variable is transformed by subtracting the patient-level mean to remove between-patient variation, N = 1 (2.5) [52]	To predict changes over time	Not applicable	Includes all measured values of longitudinal variable; Relaxes assumption of independence of random effects from covariates; Computationally very easy to fit compared with mixed effects models	Lower statistical efficiency than mixed effects models
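
Several rows of the table reduce a patient's repeated measurements to a single covariate before the outcome model is fitted (change from baseline, a manually calculated slope, a standard deviation, or values centred on the patient-level mean). The Python sketch below illustrates these summaries for one patient. It is not taken from any of the reviewed studies: the function names, the use of a least-squares slope, and the blood pressure readings are all assumptions made for illustration.

```python
from statistics import mean, stdev


def change_from_baseline(values):
    """Last measurement minus the first (baseline) measurement."""
    return values[-1] - values[0]


def ols_slope(times, values):
    """Least-squares slope of values on measurement times.

    ASSUMPTION: the reviewed studies may have computed slopes differently,
    for example from a single pair of time points.
    """
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    return sxy / sxx


def centre_within_patient(values):
    """Subtract the patient-level mean to remove between-patient variation."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical systolic blood pressure readings (mmHg) at years 0, 1 and 2.
times = [0.0, 1.0, 2.0]
sbp = [130.0, 134.0, 138.0]

summary = {
    "baseline": sbp[0],
    "change": change_from_baseline(sbp),
    "slope": ols_slope(times, sbp),
    "sd": stdev(sbp),
    "centred": centre_within_patient(sbp),
}
```

Each entry of `summary` could then enter a Cox or logistic model as an ordinary time-fixed covariate, which is why these approaches are simple to fit but, as the Cons column notes, discard much of the information in the trajectory.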

PH - Proportional Hazards
^a Average BMI_total = ((BMI-67 × time_I-II) + (BMI-85 × time_II-III) + (BMI-96 × time_III-))/time_total.
Total weight change = (((BMI-67 − BMI-85) × time_I-II) + ((BMI-85 − BMI-96) × time_II-III))/time_I-III.
BMI deviation = absolute value of (BMI-85 − (BMI-67 + BMI-96)/2).
^b Calculated as the average daily risk of either hypoglycemia or hyperglycemia
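
The three summaries in footnote a can be written out as a short Python sketch. The footnote does not define time_total or time_I-III, so the code assumes that time_total is the sum of the three intervals and that time_I-III is the sum of the first two. The argument names and the example values are hypothetical.

```python
def average_bmi_total(bmi_67, bmi_85, bmi_96, t_1_2, t_2_3, t_3_on):
    """Time-weighted average BMI across three examinations.

    ASSUMPTION: time_total is the sum of the three intervals, and t_3_on
    stands for the interval written as time_III- in the footnote.
    """
    time_total = t_1_2 + t_2_3 + t_3_on
    return (bmi_67 * t_1_2 + bmi_85 * t_2_3 + bmi_96 * t_3_on) / time_total


def total_weight_change(bmi_67, bmi_85, bmi_96, t_1_2, t_2_3):
    """Time-weighted BMI change, with the sign convention of the footnote.

    ASSUMPTION: time_I-III equals time_I-II plus time_II-III.
    """
    t_1_3 = t_1_2 + t_2_3
    return ((bmi_67 - bmi_85) * t_1_2 + (bmi_85 - bmi_96) * t_2_3) / t_1_3


def bmi_deviation(bmi_67, bmi_85, bmi_96):
    """Absolute gap between the middle BMI and the mean of the outer two."""
    return abs(bmi_85 - (bmi_67 + bmi_96) / 2)


# Hypothetical patient: BMI 24, 26 and 27 at the three examinations,
# with intervals of 18, 11 and 10 years.
avg = average_bmi_total(24.0, 26.0, 27.0, 18.0, 11.0, 10.0)
chg = total_weight_change(24.0, 26.0, 27.0, 18.0, 11.0)
dev = bmi_deviation(24.0, 26.0, 27.0)
```

With the sign convention as written in the footnote, a patient whose BMI rises over time has a negative total weight change.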

ISSN: 1471-2288
