 Research article
 Open Access
 Published:
Relaxing the assumption of constant transition rates in a multistate model in hospital epidemiology
BMC Medical Research Methodology volume 21, Article number: 16 (2021)
Abstract
Background
Multistate models are being increasingly used to capture complex disease pathways. The convenient formula of the exponential multistate model can facilitate a quick and accessible understanding of the data. However, assuming time constant transition rates is not always plausible. On the other hand, obtaining predictions from a fitted model with timedependent transitions can be challenging. One proposed solution is to utilise a general simulation algorithm to calculate predictions from a fitted multistate model.
Methods
Predictions obtained from an exponential multistate model were compared to those obtained from two different parametric models and to nonparametric AalenJohansen estimates. The first comparative approach fitted a multistate model with transitionspecific distributions, chosen separately based on the Akaike Information Criterion. The second approach was a RoystonParmar multistate model with 4 degrees of freedom, which was chosen as a reference model flexible enough to capture complex hazard shapes. All quantities were obtained analytically for the exponential and AalenJohansen approaches. The transition rates for the two comparative approaches were also obtained analytically, while all other quantities were obtained from the fitted models via a general simulation algorithm. Metrics investigated were: transition probabilities, attributable mortality (AM), population attributable fraction (PAF) and expected length of stay. This work was performed on previously analysed hospital acquired infection (HAI) data. By definition, a HAI takes three days to develop and therefore selected metrics were also predicted from time 3 (delayed entry).
Results
Despite clear deviations from the constant transition rates assumption, the empirical estimates of the transition probabilities were approximated reasonably well by the exponential model. However, functions of the transition probabilities, e.g. AM and PAF, were not well approximated and the comparative models offered considerable improvements for these metrics. They also provided consistent predictions with the empirical estimates in the case of delayed entry time, unlike the exponential model.
Conclusion
We conclude that methods and software are readily available for obtaining predictions from multistate models that do not assume constant transition rates. The multistate package in Stata facilitates a range of predictions with confidence intervals, which can provide a more comprehensive understanding of the data. Userfriendly code is provided.
Background
Multistate models are being increasingly used to investigate complex disease pathways, for example, when interest lies in subsequent and/or intermediate events as well as a primary event. This unified approach facilitates a better understanding of the whole disease profile and provides clinically relevant predictions, for example, transition probabilities and expected duration in each state. One example is in breast cancer, where the time to intermediate events, such as local recurrence and distant metastases, is of interest as well as overall survival [1]. Another example is repeated hospitalisations in patients with heart failure, where interest lies in the time spent in hospital (during each episode and in total) [2]. Further applications include other cancers (colorectal [3, 4], ovarian [5] and acute myeloid leukemia [6]), progression to diabetes [7], healthcare associated urinary tract infections [8] and pleural effusion following allogeneic hematopoietic stem cell transplantation [9].
Semi and nonparametric methods have been frequently used to analyse multistate models, however, interest is growing in parametric approaches. Although a variety of complex parametric models can relatively easily be fitted to each transition; the difficulty lies in obtaining the corresponding predictions from the full multistate model. Assuming an exponential Markov model allows for direct calculation of the transition probabilities, as the Kolmogorov forward equations can be solved analytically, however, assuming constant transition rates can be restrictive. Piecewise exponential models relax this assumption [10], however, discontinuous transition rates may not be biologically plausible. Another suggestion has been to model the transition rates of a Markov model with quadratic Bsplines and obtain predictions by numerically solving the Kolmogorov forward equations [11]. This paper focuses on a general simulation algorithm to obtain predictions from a range of fitted parametric models [12], including RoystonParmar models [13]. In terms of implementation, available software includes: mstate in R [14] for semi and nonparametric methods; msm in R [10] for exponential and piecewise exponential models; flexsurv in R [15] for fitting models and obtaining predictions by numerically solving the Kolmogorov forward equations; and flexsurv in R [15] or multistate in Stata [12] for the general simulation algorithm, the latter following model fitting by merlin [16, 17].
Von Cube et al. [18] recommended the exponential model as an accessible approach to obtain a quick, general understanding of the data. The authors demonstrated this method on hospital acquired infection (HAI) data: an extended illnessdeath model where a patient can have a HAI (intermediate event) and then/or be discharged or die (competing risks, the death/discharge with HAI are distinct from those without, resulting in 6 states). Von Cube et al. [18] acknowledged the potential implausibility of time constant transition rates (saying that this assumption is rarely met in practice) and recommended more sophisticated methods if the assumption was violated.
In this paper, we compare the method described in von Cube et al. [18] to the approach described in Crowther and Lambert [12]. We demonstrate that there is accessible software  namely the multistate package in Stata [12]  to obtain transition probabilities from multistate models with transitionspecific (and timedependent) distributions. We then extend the previous analysis [18] by presenting, and highlighting the importance of, uncertainties and by estimating length of stay.
This paper is organised as follows: the “Methods” section introduces the multistate process, illustrative example, metrics of interest and analysis approaches. The “Results” section displays the quantities of interest graphically, including comparisons between the approaches and confidence intervals for the predictions. The “Discussion” section provides recommendations for future analyses. Userfriendly code for the illustrative example is provided in Additional file 1.
Methods
Multistate models
Following Fiocco et al. [19], consider a stochastic process Y(t),t≥0 with a finite state space \( \mathcal {Z} = \{1,\ldots,Z\} \) and process history up to time \( s, \mathcal {H}_{s} = \{ Y(u); 0 \leq u \leq s \} \). The transition probabilities can then be defined as:
This is the probability that a patient in state a at time s moves to state b by time t, conditional on the process history up until the time just before \( s, \mathcal {H}_{s} \), where \( a, b \in \mathcal {Z} \). This can be simplified to a Markov model, which makes the assumption that the probability in Eq. 1 is only conditional on the state at time s and no other process history:
Henceforth, let P_{ab}(s,t) represent the transition probability given in Eq. 2. This paper focuses on Markov models.
The transition rate, or transition hazard, from state a to state b at time t is:
This represents the instantaneous failure rate of moving from state a to b and is analogous to the hazard function in the standard survival setting. The collection of transition rates governs the rate at which patients move between states and therefore the multistate model. For a tutorial in multistate models see Putter et al. [20].
Another useful measure is the restricted length of stay in a state. This is analogous to restricted mean survival in the standard survival setting [21]. The residual, restricted expected length of stay in state b given a patient is in (nonabsorbing) state a at time s is:
See Grand and Putter [22] for more details on expected length of stay.
The extended illnessdeath model for HAIs
This paper considers a multistate model in the context of hospital acquired infections (HAIs), as previously described by von Cube et al. [18]. When a patient is admitted to hospital, they are at risk of acquiring a HAI, which could lead to an increased hospital stay or increased risk of (hospital) death. An extended illnessdeath model with six states and five transitions, as illustrated in Fig. 1, has been used to investigate the risks and consequences of HAIs. The time scale is days since hospital admission. All patients begin in state 1 at time 0, where the patient has been admitted to hospital but does not have an infection. The patient will then either become infected (state 2), be discharged without an infection (state 3) or die without an infection (state 4). If the patient acquires an infection, they will then either be discharged (state 5) or die (state 6) with an infection. The i^{th} transition rate from state a_{i} to b_{i} has been denoted as h_{i}(t) where:
Metrics of interest
The first metric of interest was transition probabilities from state 1 at time 0, P_{1b}(0,t),b={1,2,3,4,5,6}. By definition, HAIs take at least three days to develop [18] and so there were no HAI events prior to time 3 (3 days after hospital admission). Therefore, transition probabilities from state 2 at time 3, P_{2b}(3,t),b={2,5,6}, were also estimated.
Following the formulas from Schumacher et al. [23] and von Cube et al. [18], let P_{1D}(s,t) denote the probability of dying (states 4 or 6) by time t, given a patient was in hospital without a HAI (state 1) at time s. Let P_{16+}(s,t) denote the probability of dying with a HAI (state 6) by time t, given a patient was in hospital without a HAI (state 1) at time s and had become infected (states 2, 5 or 6) by time t. Similarly, let P_{14−}(s,t) denote the probability of dying without a HAI (state 4) by time t, given a patient was in hospital without a HAI at time s and in a noninfected state (states 1, 3 or 4) at time t. The quantities can be calculated as follows:
The second set of metrics of interest were attributable mortality (AM) and population attributable fraction (PAF). AM and PAF can be used to investigate the excessive risk of dying due to HAIs, see references for discussion [18, 23].
AM and PAF were estimated from time 0 (i.e. s=0 in Eqs. 3 and 4).
The third metric of interest was expected length of stay. Given a patient started in state 1 at time 0, the following quantities were estimated: restricted length of stay in state 1 (e_{11}(0,t)), in state 2 (e_{12}(0,t)) and overall in hospital (e_{11}(0,t)+e_{12}(0,t)). Due to the three day delay in developing a HAI, the residual, restricted length of stay in state 2, conditional on having entered state 2 by time 3, was also of interest (e_{22}(3,t)). These were calculated up until the last event time (from any transition).
Analysis approaches
The three (set of) metrics were obtained from three models. The predictions from the different models were compared against each other and against nonparametric AalenJohansen estimates. Note that we use the terms “AalenJohansen” and “empirical” interchangeably to refer to the nonparametric estimates. The AalenJohansen estimator generalises the KaplanMeier estimator to Markov multistate processes. The metrics are obtained analytically, see Additional file 2 for brief details or the following references for more comprehensive details [24–26].
The first approach was to fit an exponential model to each of the transitions and obtain the metrics analytically, as was demonstrated by von Cube et al. [18]. This approach was referred to as the “Exp” model. The parametrisation of all three models is presented in Additional file 2.
The second approach was to select the best fitting distribution for each transition based on the Akaike Information Criterion (AIC), henceforth denoted the “AIC” model. Following Crowther and Lambert [12], to each transition the following parametric models were applied: exponential, Weibull, Gompertz, loglogistic, lognormal, generalised gamma and RoystonParmar models [13] with 2 to 5 degrees of freedom. The confidence intervals for the transition rates were obtained using the delta method. Once the multistate model was fitted analytically, with the best fitting distribution for each transition, the general simulation algorithm was applied to obtain the metrics, see the next section for more details.
The third approach was to fit a RoystonParmar model with 4 degrees of freedom to each of the transitions, henceforth denoted the “RP(4)” model. This was chosen as a reference parametric model for comparison purposes, as it should have sufficient flexibility to capture most complex hazard shapes. A recent sensitivity analysis of RoystonParmar models [27] suggested that 4 degrees of freedom can adequately capture the baseline hazard. The analysis was performed in a relative survival setting; however, the conclusions can be applied to standard survival [27]. Once the multistate model was fitted analytically, the general simulation algorithm was applied to obtain the metrics.
Simulation algorithm
The simulation algorithm works by projecting a patient through the multistate model in order to create their full event history. This is done a large number of times and the metrics of interest are calculated empirically from the large complete set of histories. The process will now be described in brief, for further details see Crowther and Lambert [12]. Let a be the starting state, entered at time t_{a}. If desired, specify a maximum followup time t_{max}. For each simulated patient, following the algorithm of Fiocco et al. [19] and Crowther and Lambert [12], repeat the following:

1
Let \( \mathcal {B} \) be the set of states that can be reached from state a and let N_{a} be the cardinality of set \( \mathcal {B} \). If N_{a}=0 (i.e. a is an absorbing state), stop. Otherwise, for each state \( b \in \mathcal {B} \), let h_{ab}(t) represent the transition rate from a→b.

2
For each state \( b \in \mathcal {B} \), use h_{ab}(t) to simulate event times \( t^{*}_{ab} \) conditional on entering state a at time t_{a}. Event times are simulated using the general inversion method described in Crowther and Lambert [28].

3
The observed event time is then \( t^{*} = min\left \{t^{*}_{a1},\ldots,t^{*}_{aN_{a}},t_{max}\right \} \). If t^{∗}=t_{max}, stop.

4
Set a=c where \( t^{*} = t^{*}_{ac}, c \in \mathcal {B} \) and set t_{a}=t^{∗}.
The algorithm above is repeated for a large N number of patients. The transition probabilities are then estimated by calculating the proportion of simulated patients in each state at each time point of interest. The full event history of the simulated patients is known and therefore extended predictions can easily be obtained. For example, expected length of stay can be calculated by averaging the time spent in each state (up to each time point of interest) over all patients.
Let \( \hat {\mathbf {b}} \) be the vector of parameter estimates and \( \hat {\mathbf {V}} \) be the variancecovariance matrix from the fitted multistate model (note that it is \( \hat {\mathbf {b}} \) that is used to obtain the transition rates h_{ab}(t) in the algorithm above). Confidence intervals can be obtained by drawing from a multivariate normal distribution with mean \( \hat {\mathbf {b}} \) and variance \( \hat {\mathbf {V}} M \) times [12, 19]. For each draw m, the simulation algorithm above is repeated using the sampled \({\hat {\mathbf {b}}_{m}}\) instead of \( \hat {\mathbf {b}} \) to calculate the transition rates (and therefore event times). The variance of the M sets of estimates is then calculated and used to produce confidence intervals via normal approximation.
When the general simulation approach was utilised, 1000000 simulated patients were used (N=1000000) for the point estimates. 100000 simulated patients (N=100000) were repeated 500 times (M=500) for the corresponding confidence intervals. These values were chosen as a balance to minimise the Monte Carlo error and computational time. To produce the confidence intervals for the transition probabilities from the “AIC” model, where probabilities were calculated at 165 equally spaced time points, took 48.5 minutes on a standard HP laptop with i5 processor and 8 GB of RAM.
Software
All analyses were performed in Stata version 15.1 and the code can be found in Additional file 1. The parametric transition rates were obtained using merlin, version 1.12.0 dated 20/09/2020. All predictions and confidence intervals obtained via the general simulation algorithm were achieved using predictms, version 4.0.0 dated 28/10/2020, from the multistate package. The AalenJohansen estimates were obtained using msaj, version 1.0.1 dated 11/09/2020, also part of the multistate package.
Results
Data
The analysis was performed on the publicly available los.data from the R package etm [24]. This is a sample from an observational cohort study conducted to analyse the burden of HAIs in intensive care, see Beyersmann et al. [29] for details. 756 patients were admitted to hospital (all patients started in state 1 at time 0). 632 patients did not acquire an infection during the study, of which 475 were discharged and 157 died. 124 patients did acquire an infection, of which 90 were discharged and 34 died. There was no censoring in this sample and the last event occurred 82 days after admission.
Transition rates
Table 1 gives the AIC for each distribution fitted to each transition separately. The AIC indicated that the following models gave the best fit for each transition and were therefore chosen for the “AIC” model:

1
Transition 1: RoystonParmer model with 4 degrees of freedom.

2
Transition 2: Generalised gamma model.

3
Transition 3: RoystonParmer model with 4 degrees of freedom.

4
Transition 4: Lognormal model.

5
Transition 5: Generalised gamma model.
Figure 2 illustrates the transition rates from the “AIC” model. The point estimates and confidence intervals are shown from the time of the first event until the last event for each transition by a solid line. The corresponding intervals were [3,40],[3,82],[3,69],[5,78] and [6,54] for transitions 15, respectively. The point estimates were extrapolated to cover the interval [0,82] with a dashed line. It was evident that the transition rates were not constant over time and transition 2 (admission to discharge without HAI) appeared to deviate most drastically from this assumption. Additional file 3: Figure S1 compares the transition rates from the “AIC” model to the “Exp” model, “RP(4)” model and to nonparametric estimates obtained using the Epanechnikov kernel smoother. The smoothed nonparametric estimates varied depending on the kernel type and bandwidth used, however, in all cases, the transition rates were clearly not constant.
Transition probabilities
Figure 3 compares the transition probability estimates from the three approaches with empirical estimates (starting in state 1 at time 0). The predictions from the “AIC” and “RP(4)” models had high concordance with the AalenJohansen estimates. As von Cube et al. noted [18], despite clear departures from the constant transition rates assumption, the “Exp” model performed well for states 4, 5 and 6 (death without HAI, discharge with HAI and death with HAI). There were some discrepancies with states 1, 2 and 3 (hospital admission without HAI, with HAI and discharge with HAI) up to 30 days after admission. Importantly, the predictions obtained from the “AIC” and “RP(4)” models captured the three day delay in acquiring a HAI (and in fact the delay in experiencing any event, as the minimum event time was 3 days since admission), which the “Exp” model could not capture. Where estimates are shown comparing approaches in the main body of the paper, Additional file 3: Figures S2, S5 and S6 show the corresponding graphs for the “AIC” model with confidence intervals.
Additional file 3: Figure S3 illustrates the transition probabilities conditional on being in state 2 (in hospital with a HAI) by time 3. Predictions from the “AIC” and “RP(4)” models were slightly more consistent with the empirical estimates than those from the “Exp” model (especially in the first 20 days). Additional file 3: Figure S4 is the corresponding graph for the “AIC” model with confidence intervals.
Attributable mortality and population attributable fraction
About 2025 days after hospital admission, AM was above 0 for the “AIC” and “RP(4)” models. This suggested that after 25 days, the probability of dying was greater for those with an infection than for those without. AM can be interpreted as, for example: an individual that acquired a HAI by time 10 had a 4.7 percentage point decreased probability of dying by time 10 compared to an individual who did not acquire a HAI. Alternatively, an individual that acquired a HAI by time 30 had a 1.4 percentage point increased probability of dying by time 30. These results were similar for PAF, suggesting that after 25 days the occurrence of a HAI increased the risk of death and therefore the overall probability of dying. PAF can be interpreted as, for example: the proportion of individuals dying by time 10 would have increased by 4.2% if there were no HAIs. Alternatively, the proportion of individuals dying by time 30 would have decreased by 0.9% if there were no HAIs (all predictions from the “AIC” model). Schumacher et al. [23] describes the phenomenon of the AM and PAF initially being lower than 0.
The relative differences in predictions between the models were greatest for AM and PAF, see Fig. 4 (although the absolute differences were small). The “Exp” model appeared to overestimate (until 15 days after admission) and then underestimate AM and PAF considerably; whereas both “AIC” and “RP(4)” predictions appeared to approximate the empirical estimates well. There was a slight inconsistency with the “AIC” model and the empirical estimates towards the end of the time window, however, this should not be overinterrupted due to the small number of events occurring past 50 days.
Length of stay
Figure 5 illustrates the restricted expected length of stay for the three models with empirical estimates (starting in state 1 at time 0). The graph can be interpreted as follows: 82 days since hospital admission, on average a patient spent 8.68 (95% CI 8.04,9.39) days in hospital without a HAI and 1.98 (95% CI 1.53,2.58) days in hospital with a HAI (estimates taken from the “AIC” model). The empirical estimates were slightly better approximated by the “AIC” and “RP(4)” models (especially in the first 20 days). The confidence intervals for total hospital stay for the “AIC” model are shown in Additional file 3: Figure S7.
Figure 6 illustrates the residual, restricted expected length of stay, conditional on being in state 2 (in hospital with a HAI) by time 3. Between the interval [3,82] days, a patient would have spent on average 13.61 (95% CI 11.30,16.26) days in hospital with a HAI, given they had a HAI and were in hospital by time 3 (estimates taken from the “AIC” model). While predictions obtained from the “AIC” and “RP(4)” models were consistent with empirical results, Fig. 6 shows a large discrepancy between the latter and the “Exp” model. In the context of health economics, such differences could be nontrivial when translated into costs.
Discussion
Assuming constant transition rates in a multistate model can facilitate a basic understanding of the data and this approach has been well demonstrated by von Cube et al. [18]. However, this assumption may not always be plausible and, as a result, predictions may be misspecified. In the example shown, despite the transition rates not being constant (Fig. 2), the transition probabilities from the “Exp” approach were similar to the AalenJohansen estimates. However, this was not the case for some functions of the transition probabilities, for example, AM and PAF (Fig. 4). In addition, predictions from the “Exp” model starting in state 2 at time 3 had poorer concordance with the nonparametric estimates (Additional file 3: Figure S3 and Fig. 6).
This paper compared the “Exp” model to two parametric approaches, where predictions from the latter were obtained from the fitted model via a general simulation algorithm. The “AIC” and “RP(4)” predictions were more consistent with the AalenJohansen estimates than the “Exp” model for all metrics. The greatest improvements were seen in AM and PAF and when considering delayed entry (predictions starting from state 2 at time 3).
As with any parametric approach, assumptions need to be made regarding the most appropriate distribution for each transition. A balance needs to be sought in terms of parsimony and sufficient parameters to appropriately capture the hazard shapes. This work has highlighted the challenges of model selection as the “AIC” model (i.e. the collection of the best fitting distributions for each transition in terms of AIC) did not always have better concordance with the nonparametric estimates compared to the reference “RP(4)” model. Regardless of approach, we would always recommend sensitivity analyses around the assumptions of the baseline hazard. It is important to note that both approaches still performed better than the “Exp” model. For this data example, a conditional parametric model would have been more appropriate for any transitions that could not have happened before day 3 (by definition or design of the study). We chose not to consider a conditional model to be consistent with, and allow easier comparison with, the motivating paper by von Cube et al. [18].
In addition to being able to model the transitions with a range of parametric distributions, the general simulation algorithm has other advantages. It easily lends itself to extended predictions, such as length of stay, the probability of ever visiting a state and disease specific quantities, such as AM and PAF. Uncertainties can easily be obtained, which can facilitate a more comprehensive understanding of the data. There is also great flexibility available when modelling covariate effects, including timedependent effects [12]. The approach generalises to more complex multistate models, i.e. models with a greater number of states, greater number of transitions and backward transitions. It can also be applied to nonMarkov models, unlike methods that rely on solving the Kolmogorov forward equations to obtain predictions.
A disadvantage of the general simulation algorithm is computational time. Although point estimates can be obtained relatively quickly, confidence intervals can require a considerable amount of time, especially in the case of more complicated userdefined functions. A balance between computational time and Monte Carlo error is therefore needed when choosing N (number of simulations) and M (number of repetitions for confidence intervals). One possible alternative would be to use a hybrid approach when calculating predictions, where the transition rates obtained through parametric methods are substituted into the nonparametric AalenJohansen estimator [15, 30]. This approach would greatly decrease computational time, however, is only applicable to Markov models.
Conclusion
We conclude that methods and software are readily available for obtaining predictions from multistate models that do not assume constant transition rates. The multistate package in Stata facilitates a range of predictions with confidence intervals obtained from a fitted multistate model via a general simulation algorithm. We agree that assuming constant transition rates can provide a quick, basic understanding of the data; however, we recommend a more sophisticated parametric approach for a comprehensive understanding that includes uncertainty.
Abbreviations
 AIC:

Akaike information criterion
 AM:

Attributable mortality
 HAI:

Hospital acquired infection
 PAF:

Population attributable fraction
References
 1
Putter H, van der Hage J, de Bock GH, Elgalta R, van de Velde CJ. Estimation and prediction in a multistate model for breast cancer. Biom J. 2006; 48(3):366–80.
 2
Ieva F, Jackson CH, Sharples LD. Multistate modelling of repeated hospitalisation and death in patients with heart failure: The use of large administrative databases in clinical epidemiology. Stat Methods Med Res. 2017; 26(3):1350–72.
 3
Manzini G, Ettrich TJ, Kremer M, Kornmann M, HenneBruns D, Eikema DA, Schlattmann P, de Wreede LC. Advantages of a multistate approach in surgical research: How intermediate events and risk factor profile affect the prognosis of a patient with locally advanced rectal cancer. BMC Med Res Methodol. 2018; 18:23.
 4
GilardPioc S, Abrahamowicz M, Mahboubi A, Bouvier AM, Dejardin O, Huszti E, Binquet C, Quantin C. Multistate relative survival modelling of colorectal cancer progression and mortality. Cancer Epidemiol. 2015; 39(3):447–55.
 5
Eulenburg C, Mahner S, Woelber L, Wegscheider K. A systematic model specification procedure for an illnessdeath model without recovery. PLOS ONE. 2015; 10(4):e0123489.
 6
LeRademacher JG, Peterson RA, Therneau TM, Sanford BL, Stone RM, Mandrekar SJ. Application of multistate models in cancer clinical trials. Clin Trials. 2018; 15(5):489–98.
 7
Nazari M, Nazari SH, Zayeri F, Dehaki MG, Baghban AA. Estimating transition probability of different states of type 2 diabetes and its associated factors using Markov model. Prim Care Diabetes. 2018; 12(3):245–53.
 8
Mitchell BG, Ferguson JK, Anderson M, Sear J, Barnett A. Length of stay and mortality associated with healthcareassociated urinary tract infections: A multistate model. J Hosp Infect. 2016; 93(1):92–99.
 9
Lee J, Modi D, Jang H, Uberti JP, Kim S. Multistate models on pleural effusion after allogeneic hematopoietic stem cell transplantation. Open Access Med Stat. 2017; 7:15–26.
 10
Jackson CH. Multistate models for panel data: The msm package for R. J Stat Softw. 2011; 38(8):1–29.
 11
Titman AC. Flexible nonhomogeneous Markov models for panel observed data. Biometrics. 2011; 67(3):780–7.
 12
Crowther MJ, Lambert PC. Parametric multistate survival models: Flexible modelling allowing transitionspecific distributions with application to estimating clinically useful measures of effect differences. Stat Med. 2017; 36(29):4719–42.
 13
Royston P, Parmar MK. Flexible parametric proportionalhazards and proportionalodds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat Med. 2002; 21(15):2175–97.
 14
de Wreede LC, Fiocco M, Putter H. mstate: An R package for the analysis of competing risks and multistate models. J Stat Softw. 2011; 38(7):1–30.
 15
Jackson CH. flexsurv: A platform for parametric survival modeling in R. J Stat Softw. 2016; 70(8):1–33.
 16
Crowther MJ. merlin  a unified modelling framework for data analysis and methods development in Stata. Stata J. 2020; 20(4):763–84.
 17
Crowther MJ. Extended multivariate generalised linear and nonlinear mixed effects models. 2017. Preprint at https://arxiv.org/abs/1710.02223. Accessed 11 Jan 2020.
 18
von Cube M, Schumacher M, Wolkewitz M. Basic parametric analysis for a multistate model in hospital epidemiology. BMC Med Res Methodol. 2017; 17:111.
 19
Fiocco M, Putter H, van Houwelingen HC. Reducedrank proportional hazards regression and simulationbased prediction for multistate models. Stat Med. 2008; 27(21):4340–58.
 20
Putter H, Fiocco M, Geskus RB. Tutorial in biostatistics: Competing risks and multistate models. Stat Med. 2007; 26(11):2389–430.
 21
Royston P, Parmar MKB. The use of restricted mean survival time to estimate the treatment effect in randomized clinical trials when the proportional hazards assumption is in doubt. Stat Med. 2011; 30(19):2409–21.
 22
Grand MK, Putter H. Regression models for expected length of stay. Stat Med. 2016; 35(7):1178–92.
 23
Schumacher M, Wangler M, Wolkewitz M, Beyersmann J. Attributable mortality due to nosocomial infections. A simple and useful application of multistate models. Methods Inf Med. 2007; 46(5):595–600.
 24
Allignol A, Schumacher M, Beyersmann J. Empirical transition matrix of multistate models: The etm package. J Stat Softw. 2011; 38(4):1–15.
 25
Aalen OO, Johansen S. An empirical transition matrix for nonhomogeneous Markov chains based on censored observations. Scand J Stat. 1978; 5:141–50.
 26
Andersen PK, Borgan O, Gill RD, Keiding N. Statistical models based on counting processes. New York: SpringerVerlag; 1993.
 27
Syriopoulou E, Mozumder SI, Rutherford MJ, Lambert PC. Robustness of individual and marginal modelbased estimates: A sensitivity analysis of flexible parametric models. Cancer Epidemiol. 2019; 58:17–24.
 28
Crowther MJ, Lambert PC. Simulating biologically plausible complex survival data. Stat Med. 2013; 32(23):4118–34.
 29
Beyersmann J, Gastmeier P, Grundmann H, Bärwolff S, Geffers C, Behnke M, Rüden H, Schumacher M. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. Infect Control Hosp Epidemiol. 2006; 27(5):493–9.
 30
Cook RJ, Lawless JF. Statistical issues in modeling chronic disease in cohort studies. Stat Biosci. 2014; 6:127–61.
Acknowledgements
Not applicable.
Funding
MJC was funded by a MRC New Investigator Research Grant, Grant/Award Number: MR/P015433/1. PCL was funded by the Swedish Research Council [201701591]. The funders had no role in data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Affiliations
Contributions
MH performed the data analysis, wrote the Stata code and drafted the manuscript. PCL and MJC supported the analysis and provided revisions and improvements to the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1
Stata code for predictions. A Word document (.docx) with Stata code to analyse the HAI data. It formats the los.data dataset (once imported from the R package etm), fits the parametric models and obtains the predictions from the different approaches (AalenJohansen estimates, “Exp” model, “AIC” model and “RP(4)” model). The predictions include all those calculated in this paper: transition probabilities, AM and PAF and expected length of stay.
Additional file 2
Further details of the analysis approaches. A PDF giving details about how the AalenJohansen estimates were calculated and showing the functional forms of the transition rates for the three parametric approaches.
Additional file 3
Additional figures. A PDF including supplementary figures from the HAI data analysis.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visithttp://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Hill, M., Lambert, P.C. & Crowther, M.J. Relaxing the assumption of constant transition rates in a multistate model in hospital epidemiology. BMC Med Res Methodol 21, 16 (2021). https://doi.org/10.1186/s12874020011928
Received:
Accepted:
Published:
Keywords
 Multistate models
 Competing risks
 Survival analysis
 Markov processes
 Transition probabilities
 Expected length of stay
 Software