 Research
 Open Access
 Published:
Modelling multiple timescales with flexible parametric survival models
BMC Medical Research Methodology volume 22, Article number: 290 (2022)
Abstract
Background
There are situations when we need to model multiple timescales in survival analysis. A usual approach in this setting would involve fitting Cox or Poisson models to a timesplit dataset. However, this leads to large datasets and can be computationally intensive when model fitting, especially if interest lies in displaying how the estimated hazard rate or survival change along multiple timescales continuously.
Methods
We propose to use flexible parametric survival models on the log hazard scale as an alternative method when modelling data with multiple timescales. By choosing one of the timescales as reference, and rewriting other timescales as a function of this reference timescale, users can avoid timesplitting of the data.
Result
Through casestudies we demonstrate the usefulness of this method and provide examples of graphical representations of estimated hazard rates and survival proportions. The model gives nearly identical results to using a Poisson model, without requiring timesplitting.
Conclusion
Flexible parametric survival models are a powerful tool for modelling multiple timescales. This method does not require splitting the data into small timeintervals, and therefore saves time, helps avoid technological limitations and reduces room for error.
Introduction
In survival analysis, the event rates may depend on multiple timescales simultaneously, such as timeonstudy, attained age, time since disease onset, etc, all with different timeorigins, such as start date of the study, birth, onset of disease, etc. To fit survival models with multiple timescales, it is standard practice to split the data into short timeintervals along the relevant timescales, and fit a Cox model or a Poisson generalised linear model with categories for the timeintervals [1]. Using the Cox model may not be the best approach if we are interested in modelling the rates over the multiple timescales, since the baseline hazard function is not estimated in the Cox model, and one of the timescales has to be selected as the baseline [2]. Furthermore, fitting either the Cox or Poisson model to the timesplit data is based on the assumption of piecewise constant hazard rates within the timeintervals. Fractional polynomials [3] or splines [4] can be used instead to obtain smooth estimates of the baseline hazard function in the Poisson model. Even so, this requires splitting the data into short intervals which often leads to very large datasets, thus adding to the challenges of fitting computationally burdensome models.
In this article, we propose using flexible parametric survival models (FPMs) [5, 6] as an alternative approach when fitting hazard models with multiple timescales. This method allows for modelling multiple timescales as continuous functions and does not require timesplitting. FPMs with two timescales are presented through two scenarios. In the first scenario, the second timescale is introduced when individuals experience an intermediate event, which is considered as a timevarying covariate. In the second scenario, both timescales are present from the start of the study but the second timescale is relevant only for a subset of individuals in the matched cohort. We also compare the results to those from fitting Poisson models. The choice of the optimal single timescale or how to combine multiple timescales into one timescale has been widely discussed [7,8,9,10,11,12] and is not the focus of this study.
Notation and background
Proportional hazards models with multiple timescales
Modelling hazard rates in terms of multiple timescales has been described in different settings, for example in age, period and cohort models [13] and in the Lexis model with two timeaxes [14], and the twoway proportional hazards (PH) model [15]. A thorough description of the general approach to modelling multiple timescales within multistate models using Poisson framework has been provided by Iacobelli and Carstensen (2013) [16].
In a model with two timescales and proportional hazards, i.e. a model where timescales are modelled as main effects, the log hazard function can be expressed as:
where \(p_0\) and \(s_0\) are the baseline hazard functions (which can be any functions) for timescales \(t_1\) and \(t_2\), respectively, with \(\boldsymbol{\gamma }_{p}\) and \(\boldsymbol{\gamma }_{s}\) as the corresponding parameter vectors. An intercept is included in one of the two functions. The covariates of interest are expressed as \(\boldsymbol{x}\) with associated log hazard ratios \(\boldsymbol{\beta }\).
Provided chosen timescales are measured in the same units of time, and their timeorigins are known then the timescales can be expressed as functions of one reference timescale and corresponding offset terms. For example, if \(t_1\) is time since diagnosis and \(t_2\) is attained age then the timeorigin for \(t_1\) is diagnosis and for \(t_2\) it is birth. Then using age at diagnosis, \(a_0\), as an offset, we can write \(t_2 = t_1 + a_0\), or symmetrically \(t_1 = t_2  a_0\). Thus, with \(t_1\) as the reference timescale, model (1) can be written as,
Additionally, if a third timescale of interest is the calendar time, \(t_3\) (origin 0 CE), then we can write it as a function of \(t_1\), as \(t_3 = t_1 + c_0\), or as a function of \(t_2\) as \(t_3 = t_2  a_0 + c_0\), where \(c_0\) is the calendar date of diagnosis and acts as another offset term. To demonstrate further the symmetry of model (2) with additional timescale, we can express the model with three timescales \(t_1, t_2, t_3\) as a function of the reference timescale \(t_2\) and offset terms \(a_0, c_0\),
Nonproportional hazards models with multiple timescales
Models (1)  (3) can be extended further to include nonproportional hazards, i.e. interactions between covariates and one or more of the timescales. For example, model (1) with interaction between covariates \(x_l\) (\(l=1,\dots , L\)) and timescale \(t_1\) is written as:
The covariates included in \(\varvec{x}\) can be of any functional form and therefore, in model (4) it is not necessarily a linear form that is used in the interaction term with the timescale. Additionally, interactions between timescales can also be included, which means that the effect of the first timescale on the outcome can differ along the second timescale. In further discussions and examples, we will focus on fitting PH models with two timescales using flexible parametric survival models and compare them to Poisson models.
Flexible parametric survival models with multiple timescales
Flexible parametric survival models (FPMs) were introduced by Royston and Parmar (2002) [5] and have been further developed by others [6, 17, 18]. FPMs on the loghazard scale use a smoothing function for the baseline hazard in a form of restricted cubic splines, which are piecewise cubic polynomial functions that are joined at prespecified positions (knots). The restricted cubic splines are often created on the logscale of time. Users are required to choose a number of knots for the splines, as well as the position of the knots. Sensitivity analyses have shown that estimates produced by FPMs are robust to the number and placement of knots [19].
As described in Section 2.1, if the timescales of interest can be expressed in terms of one timescale and the offset terms then, in its simplest form, the PH FPM on the log hazard scale with two timescales, \(t_1\) and \(t_2\) is written as
where the baseline hazard functions on the \(t_1\) and \(t_2\) timescales are represented by the restricted cubic spline functions, \(p_0(t_1; \boldsymbol{\gamma }_{p}, \boldsymbol{k}_{p})\) and \(s_0(t_2; \boldsymbol{\gamma }_{s}, \boldsymbol{k}_{s})\), respectively. The knot location vectors, \(\boldsymbol{k}_{p}\) and \(\boldsymbol{k}_{s}\) are either chosen by default at equally spaced centiles over the distribution of the eventtimes or determined by the user. Also this model can be easily extended to include interactions between the timescales as well as nonproportional hazards for the covariates of interest.
Poisson model with multiple timescales
We can also use the Poisson approach to fit the same underlying rate models described in Section 2.1 since the likelihood of the rate model is equivalent to the likelihood of a Poisson model [15, 16, 20]. To do this, users are required to split the data along the relevant timescales into intervals short enough to better support the assumption of constant hazard rates within the intervals, which can come at the cost of high computational burden. Depending on the research area, the data can be split along one timescale, and the other timescale is obtained using the offset between timescales. However, in some situations splitting is performed along two or more timescales. The baseline hazard functions for the timescales can be either piecewise constant, or smooth functions based on the intervals (for example, the start or the midpoint of each interval). The log of persontime within each interval is included in the model as an offset.
For consistency with the FPM approach, in this study we use restricted cubic splines when using the Poisson approach to estimate the smooth effect of the timescales on the baseline hazard.
Nonproportional hazards as shown by model (4) as well as interactions between timescales can also be modelled within the Poisson framework.
Application scenarios
To illustrate models with multiple timescales we analysed two different casestudies, using both FPMs and Poisson models. The models displayed are essentially the same, however they illustrate different scenarios where multiple timescales can arise. The first casestudy includes a timevarying variable that introduces a second timescale upon changing its value, and the second is a matched cohort study where the second timescale is relevant only for the exposed subjects. In both casestudies our objective was to model mortality rates with two timescales, and graphically represent the estimated rates and the survival proportions over different timepoints on both timescales.
Timevarying covariate
Similar to the illnessdeath example discussed by Iacobelli and Carstensen (2013) [16], for modelling multiple timescales, we analysed cohort data of 2,982 individuals diagnosed during 19781993 with primary breast cancer in Rotterdam, who had undergone primary surgery [21, 22]. Patients were followed from the initial state surgery until the event of interest death or censoring at 10 years post surgery. Throughout the followup it is also known whether and when they experienced the intermediate state relapse or metastasis (RM). RM can, therefore, be treated as a timevarying covariate, and time since RM as a secondary timescale.
In total 1,139 patients died during the total followup of 19,937 personyears (overall mortality rate 57 per 1000 personyears), and 1,004 of these deaths occurred after RM. During followup, 1,477 patients experienced RM, and the median time to RM among those with RM was 2.43 years (min = 0.1, max = 9.99).
Using the FPM and the Poisson methods, we fitted the following hazard model with two timescales, \(t_1\) as time since surgery (reference timescale) and \(t_1  r\) as time since RM with r as the time of RM,
where \(I_{RM}\) was used as a timevarying indicator for patients with RM, so that the effect of time since RM on the baseline hazard can only be estimated after patients have transitioned to RM. In both methods, for the restricted cubic spline functions, \(p_0, s_0\), we chose four knots equally placed over the distribution of eventtimes, and we used the same knots in both methods, however, in \(p_0\) we created splines for \(\log (t_1)\). An intercept of the model is included in function \(p_0\) as part of the parameters for the reference timescale \(t_1\). As in the model above \(s_0\) does not include an intercept. There is a choice to be made whether to include a parameter for an immediate change in the hazard at the time of RM, and setting \(\beta _{RM}\) to 0 would remove this change. We also included in the model a restricted cubic spline function of age (age at surgery) with six knots and an indicator \(I_{horm}\) for hormonal therapy, assuming proportional hazards for these covariates over both timescales by not including interactions with either of the timescales.
To fit this model, it was necessary to create two rows per individuals for those that experienced RM, where indicator \(I_{RM}\) changed from 0 to 1 at the time of RM. Furthermore, for the Poisson approach, we split the reference timescale, time since surgery, into twodayintervals, and used midpoints of the intervals for creating the splines. For this cohort it may be unnecessary to have such short intervals, especially in later part of the followup, but for other diseases it may be essential.
For the purpose of this study, we chose to keep the model simple in terms of the number of covariates, and we did not focus on the question of modelspecification. Therefore, the results may not be clinically representative.
The implementation of analyses for the FPM method for this cohort using Stata code is provided in the Supplementary material.
Graphical representations of results for model with a timevarying covariate
The FPM and Poisson approaches yielded almost identical results; coefficients and 95% confidence intervals (CI) are displayed in Table 1. The spline coefficients are not interpretable on their own but they are used to predict the shape of the hazard surface at different covariate values. For example, Fig. 1 displays two panels for viewing the estimated mortality rates from the FPM. The left panel shows the estimated mortality rates per 1000 personyears along time since surgery (timescale \(t_1\)) for patients who did not have RM and for patients who experienced RM at 0.5, 1, 2, 3, 4, 5 years post surgery, respectively. The jumps from “No RM” at these timepoints postsurgery into the RM state are represented by the vertical lines. The right panel of Fig. 1 provides an alternative view, along the second timescale, time since RM (timescale \(t_2 = t_1r\), with r as offset as time of RM post surgery). From both panels, as from Supplementary Fig. 1, we see that the mortality rates decrease with both time since surgery and with time since RM. Patients treated without the hormonal therapy also have lower mortality rates than patients who received hormonal therapy. The purpose of this article is to demonstrate the methods, therefore it is not recommended to interpret the results in the clinical sense. Nevertheless, the high mortality among patients treated with hormonal therapy can be due to a more severe disease for this group of patients. As in Fig. 1, in subsequent figures, we displayed results from the FPM only, as there was a complete overlap between the two methods of estimated mortality rates and survival proportions. Furthermore, for demonstration purposes, all the predicted values were calculated for patients who had primary surgery at age 50.
Supplementary Fig. 1 shows the estimated mortality rates per 1000 personyears over both timescales in threedimensional view for patients who had primary surgery at age 50, received hormonal therapy and experienced RM. It can be seen that the peak of the surface with grey and orange belts (mortality rates higher than 600) is concentrated over the quadrant of one to three years since surgery and one to three years since RM. The mortality rates surface becomes more “shallow” with progression of time on both timescales similar to what is observed in clinical practice [23].
We can also assess the timevarying effect of experiencing RM on mortality rates compared to not having RM given the same time since surgery, same age at surgery and same treatment with or without hormonal therapy. This is represented by mortality rate ratio with 95% CI over time since RM in Fig. 2. From Fig. 2 we observe that the relative effect of having RM increases rapidly with time since RM: from rate ratio 24.27 (95% CI: 18.75, 31.4) at time 0 since RM to rate ratio of 52.51 (95% CI: 41.2, 66.93) at 3.65 years since RM, followed by a rapid decline reaching 19.91 (95% CI: 8.02, 49.41) at 10 years since RM. To clarify further this comparison, the mortality rate for individuals at 10 years since RM is 19.91 times the mortality rate for individuals who never experience RM given these individuals have the same time since surgery, the same age and therapy. The model assumes that the mortality rate ratio at a certain time since RM (i.e. at certain values of \(t_2\)=\(t_1r\)) is the same irrespective of time since surgery (i.e. \(t_1\)) and time of RM (i.e. r) as long as the comparison is made between individuals having the same time since surgery. This means that we get the same mortality rate ratio for the effect of RM in comparison to no RM when, for example, \(t_1=2\), \(r=1\), and \(t_1=3\), \(r=2\), as both comparisons are at one year after RM. This comes from the fact that the model does not include an interaction term between RM and \(t_1\), nor does it include an interaction between \(t_1\) and \(t_2\) timescales. However, with more complex models that include interactions between timescales as well as timedependent effects of the covariates, the graphical representation of the mortality rate ratio would have to be displayed for different values of both timescales.
In addition to the mortality rates, we can also obtain survival proportions for different subgroups along both timescales. The 95% CI (not shown) were computed using the biascorrected percentile bootstrap method with 1000 samples. Figure 3 provides examples of predicted survival probabilities on either of the timescales. For example, panel (A) shows smooth predicted survival curves over time since surgery for patients without RM and for patients with RM at 1, 2, 3, 4, 5 years post surgery. Panel (B), on the other hand, depicts the survival proportions over time since RM. Both panels supplement each other and aid in comparison of the subgroups. In addition to the conventional representations of predicted survival in panels (A) and (B), it can be of interest to assess the survival proportions chosen for specific timepoints since RM and timepoints since surgery. Panel (C) displays the probability of surviving 1, 3 and 5 years after RM across different time points since surgery. It can be seen that the survival proportions are higher for patients with longer time since surgery given the same time since RM. For example, the probability to survive three years after RM for patients on hormonal therapy is 0.206 (95% CI: 0.142, 0.276) and 0.465 (95% CI: 0.394, 0.544) at four and eight years since surgery, respectively.
Matched cohort
In a matched cohort, when individuals with a certain disease or characteristic are matched to population controls, or comparators without the characteristics, using matching variables such as age, sex and calendar year, then attained age is often the most relevant timescale. However, time since diagnosis can also be of importance for the exposed individuals, but is not relevant for the matched comparators in the cohort. We explore this type of scenario by analysing matched cohort data of Swedish patients with myeloproliferative neoplasms (MPN). Our objective was to model mortality rates and estimate survival proportions for the MPN cases compared to population comparators over the timescales attained age and time since diagnosis.
A detailed description of the disease and patient characteristics of the MPN matched cohort are provided elsewhere [24]. Briefly, each MPN patient at age 18 years or older was matched to four individuals from the Swedish general population based on sex, age and calendar year of diagnosis during 1987  2009. Furthermore, the first 30 days of followup were excluded to avoid surveillance bias. In total the cohort consisted of 9,164 MPN cases and 35,763 matched controls with followup until death or censoring on December 31, 2010 or at maximum of 10 years of followup. In total 5,108 deaths were observed among MPN subjects, and 11,677 among nonMPN subjects (overall mortality rates were 91 and 39 per 1000 personyears, respectively).
Similar to the previous casestudy, using the FPM and Poisson methods we fitted the following hazard model,
where \(t_1\) is attained age (reference timescale), \(t_1a_0\) is time since diagnosis (second timescale) with \(a_0\) as age at MPN diagnosis, \(I_{MPN}\) is the indicator for the MPN cases, and \(I_{sex}\) is the indicator for sex (0 = Men, 1 = Women). Even though this model might not be the most relevant, in terms of covariates included, it is useful for demonstration purposes. Furthermore, it has been shown that there is no need to include all matching variables in the model in the analysis of matched cohort data if there is no additional confounding [25]. In both FPM and Poisson methods, we chose five knots for both spline functions \(g_0, v_0\), equally placed over the distribution of eventtimes, where the first knot and the last knot are placed at the first and last event times, respectively. Additionally, for the Poisson method, we split the persontime by twoday intervals, and used midpoints of the intervals when creating the splines.
Graphical representations of results for matched cohort
As in the previous casestudy, the coefficients and 95% CI displayed in Table 2 from the FPM and Poisson approaches are almost identical. The predicted mortality rates and survival proportions were the same from both methods, hence the figures that follow, represent results for attained ages 6095 from the FPM method only.
From both panels of Fig. 4, it can be seen that for the same attained age, the mortality for male MPN patients is much higher right after diagnosis than after 10 years of having the disease. For example, among men at age 80 right after the diagnosis, the mortality rate is 260 per 1000 personyears, whereas for 80 yearolds that were diagnosed 10 years ago the rate is 149 per 1000 personyears. However, this mortality rate is still higher than the mortality rate for 80 yearolds who were diagnosed five years ago (133 per 1000 personyears). Additionally, the mortality rates surface per 1000 personyears in Supplementary Fig. 2 shows a characteristic steep increase in the mortality rates with progression of time on both timescales, attained age and time since diagnosis.
After an initial decline in the first two years since diagnosis, given the same attained age and sex, the mortality rate ratio is shown to be averaging at 2.51 (95% CI: 2.33, 2.7) for MPN cases relative to nonMPN matched controls over time since diagnosis (Fig. 5). A steep initial decline in mortality after diagnosis has been observed also for other diseases, for example, diabetes [26].
Different graphical views of survival proportions are shown in Fig. 6. Panels (A) and (B) provide two alternative displays of the same survival proportions for patients with ages at diagnosis 70, 75, 80, 85 over timescales time since diagnosis and attained age, respectively. As expected, younger patients have better survival as well as women in all age groups. In panel (C), we have a crosssectional view of panel (B), where 1, 3, 5, 10year survival proportions are plotted with respect to age at diagnosis. The dramatic decrease in survival for older age groups given the same lengths of time since diagnosis is more apparent in this figure. For example, the 5year overall survival is 81.07% (95% CI: 80.09, 81.99) and 58.49% (95% CI: 57.09, 59.8) for a male patient diagnosed at age 64 and 74, respectively.
Discussion
There are situations when it is necessary to include multiple timescales in a hazard model. For example, both woman’s age and time since first childbirth have a simultaneous impact on breast cancer incidence [27] and mortality among individuals with dementia is dependent on time since onset and attained age [28], and for chronic diseases, such as diabetes mellitus, rates of complications are dependent on person’s age and duration of the disease [26]. The study design may dictate which timescales to include in the hazard model. For example, in the illnessdeath framework by allowing nonMarkov assumption, the baseline hazard can take into account both time from the initial state as well as time from the intermediate state. And, in the matched cohort design, it may be required that the hazard model includes additional timescales specific to the subgroups of the cohort. A recent study has also demonstrated how multiple timescales can be incorporated within the relative survival framework in a multistate setting [29].
An established method, such as the Poisson GLM, is a powerful tool when it comes to modeling the hazard rates with multiple timescales. However, users are required to split the dataset along the chosen timescales and make an unrealistic assumption of piecewise constant hazard rates within the timeintervals. Furthermore, with longer followup times, and larger volumes of data, it can be computationally challenging to model with a large dataset containing finely split persontime. For certain studies it might be sufficient to split along one timescale and keep track of the timeintervals on other timescales, but there can be situations when it is necessary to split along two or more timescales simultaneously, and therefore, increasing the computational burden, as well as running a higher risk of making errors when it comes to splitting the persontime along multiple timescales.
In this study we aimed to introduce and demonstrate the flexible parametric survival models that can capture complex shapes of the baseline hazard function with multiple timescales given other timescale(s) is(are) expressed in terms of the reference timescale and offsets or times of origin. With the FPM, users avoid timesplitting of the dataset as well as making an assumption of piecewise constant hazard rates. This assumption can be relaxed in a Poisson model by the use of splines or other smoothing techniques, but that requires splitting the data in more intervals for the timescales to be treated as continuous variables. We compared the FPM and the Poisson models with timesplitting in twoday intervals in both casestudies and using splines for modelling the timescales, showing that the estimates from the fitted models were nearly identical. However, in the figures we displayed the predicted hazard rates and survival proportions from the FPM only, as there was a complete overlap with the Poisson method. We also compared the results from the Poisson models with timesplitting in onemonth intervals (results not shown). The estimated coefficients were close to the coefficients from the FPM method but not as close as from the model with data split in twoday intervals. Also the estimated mortality rates did not always overlap within given timeframes. However, the survival proportions were seen to overlap.
We also demonstrated different ways of presenting the results graphically, that can help users understand the disease better. Plotting the hazard rates and survival probabilities with respect to each timescale separately for different timepoints from the other timescale can be also helpful to answer different research questions in regards to the disease of interest. The implementation of all the analyses in Stata for the model with timevarying covariate is shown in the Supplementary material.
There are limitations in this study. The confidence intervals for survival proportions were obtained using the biascorrected percentile bootstrap as estimation of survival with confidence intervals has not been implemented as part of the software package. Future work should implement easier estimation of confidence intervals with the use of the delta method [30]. Another limitation that is inherent to the FPM, is choosing knots for the spline functions. However, according to Syriopoulou et al. (2019) [31], as long as there are not too few knots, the results are robust to different choices of knots. We also did not address the question of model fitting in this study. This is the most challenging part of fitting models with multiple timescales as there are so many aspects to consider in addition to choosing covariates, such as determining which timedependent effects on which timescales to include, or whether there are interactions between the timescales. With each timescale comes an additional dimension of all the challenges that users face with onetimescale models.
In conclusion, by using the FPM, users can avoid splitting the survival data into intervals and avoid making an assumption of piecewise constant hazard rates within the timeintervals, while obtaining all the necessary estimates for inference and the graphical representation of the estimated hazard rates and probabilities. The examples shown were used to highlight the importance of fitting models with multiple timescales, and how results from these complex models can be graphically presented.
Availability of data and materials
Analysis data as well as Stata code for the model with a timevarying covariate are provided in the Supplementary material.
Abbreviations
 FPM:

Flexible parametric survival model
 H:

Proportional hazards
 RM:

Relapse or metastasis
 CI:

Confidence intervals
 MPN:

Myeloproliferative neoplasm
 GLM:

Generalised linear model
References
Efron B. Logistic Regression, Survival Analysis, and the KaplanMeier Curve. J Am Stat Assoc. 1988;83(402):414–25. https://doi.org/10.2307/2288857.
Cox DR. Regression Models and Life Tables. J R Stat Soc. 1972;B(34):187–202.
Royston P, Altman DG. Regression Using Fractional Polynomials of Continuous Covariates: Parsimonious Parametric Modelling. J R Stat Soc Ser C (Appl Stat). 1994;43(3):429–67.
Herndon JE II, Harrell FE Jr. The restricted cubic spline hazard model. Commun Stat Theory Methods. 1990;19(2):639–63. https://doi.org/10.1080/03610929008830224.
Royston P, Parmar MKB. Flexible Parametric ProportionalHazards and ProportionalOdds Models for Censored Survival Data, with Application to Prognostic Modelling and Estimation of Treatment Effects. Statistics in Medicine. 2002;21(15):2175–97. https://doi.org/10.1002/sim.1203.
Lambert PC, Royston P. Further Development of Flexible Parametric Models for Survival Analysis. Stata J. 2009;9(2):265–90. https://doi.org/10.1177/1536867X0900900206.
Farewell VT, Cox DR. A Note on Multiple Time Scales in Life Testing. J R Stat Soc Ser C (Appl Stat). 1979;28(1):73–5. https://doi.org/10.2307/2346815.
Oakes D. Multiple Time Scales in Survival Analysis. Lifetime Data Anal. 1995;1(1):7–18. https://doi.org/10.1007/BF00985253.
Korn EL, Graubard BI, Midthune D. TimetoEvent Analysis of Longitudinal Followup of a Survey: Choice of the TimeScale. Am J Epidemiol. 1997;145(1):72–80. https://doi.org/10.1093/oxfordjournals.aje.a009034.
Thiébaut ACM, Bénichou J. Choice of TimeScale in Cox’s Model Analysis of Epidemiologic Cohort Data: A Simulation Study. Stat Med. 2004;23(24):3803–20. https://doi.org/10.1002/sim.2098.
Pencina MJ, Larson MG, D’Agostino RB. Choice of Time Scale and Its Effect on Significance of Predictors in Longitudinal Studies. Stat Med. 2007;26(6):1343–59. https://doi.org/10.1002/sim.2699.
Chalise P, Chicken E, McGee D. Performance and Prediction for Varying Survival Time Scales. Commun Stat Simul Comput. 2013;42(3):636–49. https://doi.org/10.1080/03610918.2011.650259.
Holford TR. The Estimation of Age, Period and Cohort Effects for Vital Rates. Biometrics. 1983;39(2):311–24. https://doi.org/10.2307/2531004.
Keiding N. Statistical Inference in the Lexis Diagram. Philos Trans R Soc Lond Ser A Phys Eng Sci. 1990. https://doi.org/10.1098/rsta.1990.0128.
Efron B. The TwoWay Proportional Hazards Model. J R Stat Soc Ser B (Stat Methodol). 2002;64(4):899–909. https://doi.org/10.1111/14679868.00368.
Iacobelli S, Carstensen B. Multiple Time Scales in MultiState Models. Stat Med. 2013;32(30):5315–27. https://doi.org/10.1002/sim.5976.
Crowther MJ, Lambert PC. A General Framework for Parametric Survival Analysis. Stat Med. 2014;33(30):5280–97. https://doi.org/10.1002/sim.6300.
Bower H, Crowther MJ, Lambert PC. Strcs: A Command for Fitting Flexible Parametric Survival Models on the Loghazard Scale. The Stata Journal: Promoting communications on statistics and Stata. 2016;16(4):989–1012. https://doi.org/10.1177/1536867X1601600410.
Bower H, Crowther MJ, Rutherford MJ, Andersson TML, Clements M, Liu XR, et al. Capturing simple and complex timedependent effects using flexible parametric survival models: A simulation study. Commun Stat Simul Comput. 2019;1–17. https://doi.org/10.1080/03610918.2019.1634201.
Carstensen B. AgePeriodCohort Models for the Lexis Diagram. Stat Med. 2007;26(15):3018–45. https://doi.org/10.1002/sim.2764.
Foekens JA, Peters HA, Look MP, Portengen H, Schmitt M, Kramer MD, et al. The Urokinase System of Plasminogen Activation and Prognosis in 2780 Breast Cancer Patients. Cancer Res. 2000;60(3):636–43.
Sauerbrei W, Royston P, Look M. A New Proposal for Multivariable Modelling of TimeVarying Effects in Survival Data Based on Fractional Polynomial TimeTransformation. Biom J. 2007;49(3):453–73. https://doi.org/10.1002/bimj.200610328.
Dent R, Valentini A, Hanna W, Rawlinson E, Rakovitch E, Sun P, et al. Factors Associated with Breast Cancer Mortality after Local Recurrence. Curr Oncol. 2014;21(3):418–25. https://doi.org/10.3747/co.21.1563.
Hultcrantz M, Björkholm M, Dickman PW, Landgren O, Derolf ÅR, Kristinsson SY, et al. Risk for Arterial and Venous Thrombosis in Patients With Myeloproliferative Neoplasms. Ann Intern Med. 2018;168(5):317–25. https://doi.org/10.7326/M170028.
Sjölander A, Greenland S. Ignoring the Matching Variables in Cohort Studies – When Is It Valid and Why? Stat Med. 2013;32(27):4696–708. https://doi.org/10.1002/sim.5879.
Huo L, Magliano DJ, Rancière F, Harding JL, Nanayakkara N, Shaw JE, et al. Impact of Age at Diagnosis and Duration of Type 2 Diabetes on Mortality in Australia 1997–2011. Diabetologia. 2018;61(5):1055–63. https://doi.org/10.1007/s001250184544z.
Albrektsen G, Heuch I, Hansen S, Kvåle G. Breast Cancer Risk by Age at Birth, Time since Birth and Time Intervals between Births: Exploring Interaction Effects. Br J Cancer. 2005;92(1):167–75. https://doi.org/10.1038/sj.bjc.6602302.
Commenges D, Joly P, Letenneur L, Dartigues J. Incidence and Mortality of Alzheimer’s Disease or Dementia Using an IllnessDeath Model. Stat Med. 2004;23(2):199–210. https://doi.org/10.1002/sim.1709.
Weibull CE, Lambert PC, Eloranta S, Andersson TML, Dickman PW, Crowther MJ. A Multistate Model Incorporating Estimation of Excess Hazards and Multiple Time Scales. Stat Med. 2021;40(9):2139–54. https://doi.org/10.1002/sim.8894.
Hosmer DW, Lemeshow S, May S. Appendix 1: The Delta Method. In: Applied Survival Analysis. Wiley; 2008. p. 355–358. https://doi.org/10.1002/9780470258019.app1.
Syriopoulou E, Mozumder SI, Rutherford MJ, Lambert PC. Robustness of Individual and Marginal ModelBased Estimates: A Sensitivity Analysis of Flexible Parametric Models. Cancer Epidemiol. 2019;58:17–24. https://doi.org/10.1016/j.canep.2018.10.017.
Acknowledgements
Not applicable.
Funding
Open access funding provided by Karolinska Institute. This work was funded via the Swedish Cancer Society (grant number: 19 0102) and the Swedish Research Council (grant numbers: 201901965, 201900227).
Author information
Authors and Affiliations
Contributions
TA and NB designed the study. NB performed the data analysis and drafted the manuscript. HB, TA, PCL, PD, RS, ARL and MH supported the analysis, and provided revisions and improvements to the manuscript. All authors have read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
The study was approved by the Swedish Ethical Review Authority (Dnr 2017/7331; Dnr 202005339; Dnr 2005/20631/3 with amendments 2011/80932, 2013/135332 and 2014/161032). Informed consent was not required as we had no contact with the study individuals nor did our analyses contain any personal identifiers. The waived informed consent was obtained from the Swedish Ethical Review Authority for this study. All methods were performed in accordance with relevant guidelines and regulations.
Consent for publication
Not applicable.
Competing interests
NB and RS are employed by SDS Life Science and their time within this project was funded by SDS Life Science, however, SDS Life Science had no involvement in the study. TA, HB, PCL, PD, ARL, MH declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1.
Stata .do files containing Stata code for the analysis of the model with timevarying covariates.
12874_2022_1773_MOESM2_ESM.docx
Additional file 2: Supplementary Figure 1.
12874_2022_1773_MOESM3_ESM.docx
Additional file 3: Supplementary Figure 2.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Batyrbekova, N., Bower, H., Dickman, P.W. et al. Modelling multiple timescales with flexible parametric survival models. BMC Med Res Methodol 22, 290 (2022). https://doi.org/10.1186/s12874022017739
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s12874022017739
Keywords
 Multiple timescales
 Flexible parametric survival models
 Timevarying covariate
 Matched cohort
 Cohort studies
 Epidemiological methods