- Technical advance
Identification of risk factors for hospital admission using multiple-failure survival models: a toolkit for researchers
© Westbury et al. 2016
- Received: 2 December 2015
- Accepted: 18 March 2016
- Published: 26 April 2016
The UK population is ageing; improved understanding of risk factors for hospital admission is required. Linkage of the Hertfordshire Cohort Study (HCS) with Hospital Episode Statistics (HES) data has created a multiple-failure survival dataset detailing the characteristics of 2,997 individuals at baseline (1998–2004, average age 66 years) and their hospital admissions (regarded as ‘failure events’) over a 10-year follow-up. Analysis of risk factors using logistic regression or time to first event Cox modelling wastes information because an individual’s admissions after their first are disregarded. More sophisticated techniques for examining risk factors for admission in such datasets are well established but are not commonly implemented.
We review analysis techniques for multiple-failure survival datasets (logistic regression; time to first event Cox modelling; and the Andersen and Gill [AG] and Prentice, Williams and Peterson Total Time [PWP-TT] multiple-failure models), outline their implementation in Stata, and compare their results in an analysis of housing tenure (a marker of socioeconomic position) as a risk factor for different types of hospital admission (any; emergency; elective; >7 days). The AG and PWP-TT models include full admissions histories in the analysis of risk factors for admission and account for within-subject correlation of failure times. The PWP-TT model is also stratified on the number of previous failure events, allowing an individual’s baseline risk of admission to increase with their number of previous admissions.
All models yielded broadly similar results: not owner-occupying one’s home was associated with increased risk of hospital admission. Estimated effect sizes were smaller from the PWP-TT model in comparison with other models owing to it having accounted for an increase in risk of admission with number of previous admissions. For example, hazard ratios [HR] from time to first event Cox models were 1.67 (95 % CI: 1.36, 2.04) and 1.63 (95 % CI: 1.36, 1.95) for not owner-occupying one’s home in relation to risk of emergency admission or death among women and men respectively; corresponding HRs from the PWP-TT model were 1.34 (95 % CI: 1.15, 1.56) for women and 1.23 (95 % CI: 1.07, 1.41) for men.
The PWP-TT model may be implemented using routine statistical software and is recommended for the analysis of multiple-failure survival datasets which detail repeated hospital admissions among older people.
- Medical statistics
- Epidemiological methods
- Cohort studies
- Hospital admissions
- Survival analysis
- Risk factor
- Older people
The UK population is ageing; improved understanding of lifecourse risk factors for hospital admission is required to identify subgroups of the population who are at increased risk of hospital admission, and to inform the development of intervention strategies to delay or prevent admissions to hospital.
Recent linkage between the Hertfordshire Cohort Study (HCS) database and routinely collected Hospital Episode Statistics (HES) data has yielded a complex dataset which comprises baseline information on socio-demographic, lifestyle and clinical characteristics of 2,997 community-dwelling men and women (average age 66 years at baseline 1998–2004) together with details of all inpatient hospital admissions over a 10-year follow-up period. HCS is the first UK birth cohort study to link with HES data but other well established UK cohorts [4, 5] have the potential to do so. Cohort study databases that have been linked with HES data are a rich resource for the investigation of risk factors for hospital admission among older men and women but require sophisticated statistical analysis techniques if they are to be fully explored.
A dataset which contains information about hospital admission histories for study participants may be referred to as a ‘multiple-failure survival dataset’. In this context, a hospital admission is regarded as a ‘failure event’; study participants may experience none, one, or many failure events during the study follow-up period. Statistical analysis techniques for multiple-failure survival datasets are well established [6, 7] but little applied in medical research owing to their complexity. We are not aware of any previous publications that have used multiple-failure survival analysis techniques to analyse risk factors for hospital admission among community-dwelling older people in the UK.
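The layout of such a dataset can be illustrated with a minimal Python sketch. It is not the authors' Stata syntax: the function names, the `Interval` record and the admission days are hypothetical and chosen only for illustration. Each participant contributes one row per at-risk interval; the `stratum` column (one plus the number of previous admissions) is the quantity on which the PWP-TT model is stratified. As a simplification, the sketch ignores time spent in hospital and treats same-day admissions as a single event.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    subject: str    # participant identifier
    start: float    # days since baseline at which the at-risk interval opens
    stop: float     # days since baseline at which the interval closes
    event: int      # 1 if the interval ends in an admission, 0 if censored
    stratum: int    # 1 + number of previous admissions (used by PWP-TT only)


def to_counting_process(subject: str,
                        admission_days: List[float],
                        followup_end: float) -> List[Interval]:
    """Split one participant's follow-up into (start, stop] at-risk intervals."""
    # Keep distinct admission days inside the follow-up window, in time order.
    days = sorted({d for d in admission_days if 0 < d <= followup_end})
    rows: List[Interval] = []
    start = 0.0
    for k, day in enumerate(days, start=1):
        rows.append(Interval(subject, start, day, 1, k))
        start = day
    # Remaining time after the last admission is censored at end of follow-up.
    if start < followup_end:
        rows.append(Interval(subject, start, followup_end, 0, len(days) + 1))
    return rows


def first_event_only(rows: List[Interval]) -> Interval:
    """A time to first event analysis uses only the first interval."""
    return rows[0]


# Hypothetical participants: "A" is admitted twice, "B" is never admitted.
for row in to_counting_process("A", [1500, 400], 3650):
    print(row)
for row in to_counting_process("B", [], 3650):
    print(row)
```

In this layout a participant with no admissions contributes a single censored row, while a participant with two admissions contributes three rows; a time to first event analysis would discard everything after the first row.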
The objectives of this paper are to provide researchers with a ‘toolkit’ for the analysis of multiple-failure survival datasets by: reviewing suitable statistical analysis techniques; outlining their implementation using the Stata statistical software package; and contrasting the application of these techniques to an analysis of the association between housing tenure, a marker of socioeconomic position, and different types of hospital admission in the linked HCS-HES dataset.
Application of techniques to an analysis of the relationship between housing tenure and risk of hospital admission in the HCS-HES dataset
Data were described using means and standard deviations (SD), medians and inter-quartile ranges (IQR) and frequency and percentage distributions. The association between housing tenure and risk of hospital admission or death was analysed using the following techniques: time to first event Cox regression; the Andersen and Gill (AG) model; and the Prentice, Williams and Peterson Total Time (PWP-TT) model. Analyses were conducted without and with adjustment for age, height, weight adjusted for height, smoking history, alcohol, and walking speed. We analysed different types of hospital admission: any admission; emergency admission; elective admission and long admission (greater than 7 days). All analyses were conducted for men and women separately using the syntax commands presented in Table 3 and using release 13.0 of the Stata statistical software package.
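For orientation, the three survival models named above can be written side by side. The notation is a sketch in common textbook form, not taken from the original article; all symbols are assumptions introduced for illustration. Here $x_i$ is the covariate vector for individual $i$, $\beta$ the vector of log hazard ratios, $t$ the time since baseline, and $k$ the order of the failure event.

```latex
\begin{align}
  % Time to first event Cox model: individual i leaves the risk set at the first event
  h_i(t) &= h_0(t)\,\exp\!\left(\beta^{\top} x_i\right) \\
  % Andersen and Gill (AG): one baseline hazard shared by all events;
  % Y_i(t) = 1 while individual i is under observation, including after earlier events
  \lambda_i(t) &= Y_i(t)\,\lambda_0(t)\,\exp\!\left(\beta^{\top} x_i\right) \\
  % PWP-TT: baseline hazard specific to the k-th event;
  % Y_{ik}(t) = 1 only after event k-1 has occurred and before event k
  \lambda_{ik}(t) &= Y_{ik}(t)\,\lambda_{0k}(t)\,\exp\!\left(\beta^{\top} x_i\right)
\end{align}
```

The only structural difference between the AG and PWP-TT forms is the subscript $k$ on the baseline hazard: stratifying on the number of previous events lets the baseline risk of admission change with each admission. In both multiple-failure models the within-subject correlation of failure times is typically handled through a robust variance estimator clustered on the individual.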
Characteristics of HCS participants
| Characteristic | Men (n = 1579) | Women (n = 1418) |
|---|---|---|
| Ever smoked regularly | 1059 (67.1 %) | 553 (39.0 %) |
| High alcohol intake (≥22 M; ≥15 F units per week) | 340 (21.5 %) | 68 (4.8 %) |
| Home ownership (Not owned or mortgaged) | 299 (18.9 %) | 313 (22.1 %) |
| Walking speed (self-reported)^a: | | |
| (category label missing) | 76 (4.8 %) | 97 (6.8 %) |
| (category label missing) | 375 (23.8 %) | 285 (20.1 %) |
| (category label missing) | 625 (39.6 %) | 638 (45.0 %) |
| (category label missing) | 432 (27.4 %) | 319 (22.5 %) |
| (category label missing) | 69 (4.4 %) | 79 (5.6 %) |
| Number of systems medicated^b | 1.0 (0.0, 2.0) | 1.0 (1.0, 2.0) |
| (row label missing) | 189 (12.0 %) | 86 (6.1 %) |
| Ever had an admission | 1185 (75.0 %) | 976 (68.8 %) |
| Ever had an admission/died | 1197 (75.8 %) | 985 (69.5 %) |
Associations between housing tenure and risk of hospital admission
[Table: Associations between housing tenure and risk of hospital admission for different types of survival models. Estimates are reported with 95 % CIs by admission type, including long admission (>7 days); the table body is not reproduced here.]
Although all survival analysis techniques led to broadly similar conclusions about the pattern of association between housing tenure and risk of hospital admission or death, the hazard ratios estimated by the PWP-TT model were smaller than those from the time to first event Cox model and the AG model. For example, the time to first event Cox model estimated hazard ratios of 1.67 (95 % CI: 1.36, 2.04) and 1.63 (95 % CI: 1.36, 1.95) for the association between not owner-occupying one’s home and the risk of emergency admission or death among women and men respectively; the corresponding hazard ratios as estimated by the PWP-TT model were only 1.34 (95 % CI: 1.15, 1.56) for women and 1.23 (95 % CI: 1.07, 1.41) for men.
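The size of this difference is easier to judge on the log hazard ratio scale, on which the models are estimated. The Python sketch below is illustrative only and not part of the original analysis: it takes the hazard ratios and confidence intervals quoted above, assumes Wald-type 95 % intervals that are symmetric on the log scale, and uses hypothetical helper names.

```python
import math


def log_hr_and_se(hr: float, lower: float, upper: float, z: float = 1.96):
    """Return (log HR, approximate SE), assuming a CI symmetric on the log scale."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * z)


# Hazard ratio, lower and upper 95 % confidence limits as quoted in the text
# (not owner-occupying one's home; emergency admission or death).
reported = {
    ("women", "Cox first event"): (1.67, 1.36, 2.04),
    ("women", "PWP-TT"): (1.34, 1.15, 1.56),
    ("men", "Cox first event"): (1.63, 1.36, 1.95),
    ("men", "PWP-TT"): (1.23, 1.07, 1.41),
}


def attenuation(sex: str) -> float:
    """Proportional reduction in the log HR from the Cox to the PWP-TT model."""
    b_cox, _ = log_hr_and_se(*reported[(sex, "Cox first event")])
    b_pwp, _ = log_hr_and_se(*reported[(sex, "PWP-TT")])
    return 1 - b_pwp / b_cox


for sex in ("women", "men"):
    print(f"{sex}: log-HR attenuation = {attenuation(sex):.2f}")
```

Under these assumptions the log hazard ratio from the PWP-TT model is a little over 40 % smaller than the time to first event estimate for women and a little under 60 % smaller for men. The PWP-TT intervals are also narrower on the log scale, which may reflect the use of all admissions rather than only the first.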
Linkage between the HCS database and HES data has created a rich but complex multiple-failure survival dataset for the investigation of risk factors for hospital admission among older people; other UK cohorts are well placed to link with HES. This paper serves as a ‘toolkit’ to assist researchers in the appropriate analyses of multiple-failure survival datasets by: reviewing suitable analysis techniques; outlining their implementation using Stata; and contrasting their application in an indicative analysis of housing tenure as a socioeconomic risk factor for hospital admission. We recommend the Prentice, Williams and Peterson Total Time (PWP-TT) model for the analysis of multiple-failure survival datasets which detail hospital admissions among older people.
Our observation that the PWP-TT model gives smaller estimated hazard ratios than the time to first event Cox or Andersen and Gill models is consistent with previous research which investigated risk factors for hospital readmission in Brazil. The PWP-TT model is likely to yield more conservative hazard ratios because it accounts for the underlying increase in risk of admission with the number of accumulated previous admissions. Failure to account for an increase in this underlying risk of admission may result in exaggerated estimates of the impact of a risk factor on hospital admission.
This paper has some limitations. First, we regarded hospital admission and death as equivalent failure events. This approach was necessary because death cannot simply be regarded as a non-informative censoring event, unlike emigration or the end of follow-up. Competing risk regression, as an extension of time to first event Cox modelling, could account for death as a competing event (an event which occurs instead of the failure event of interest), and would be important to consider in a time to first event analysis of nursing home admission among elderly people, where the risk of mortality is high. However, this approach is not extendable to multiple-failure survival datasets using routine statistical software. Second, our review of suitable techniques for multiple-failure survival datasets was focused on those that may be implemented using routine statistical software. Alternative techniques not discussed in this paper include: multi-state models, which investigate the relationship between individual risk factors and the transition probabilities between states representing different failure events; frailty models, which are similar to Cox’s proportional hazards model but include random effects to account for the within-subject correlation of failure times; and the Wei, Lin and Weissfeld (WLW) model, which has similarities to the AG and PWP-TT models but is poorly suited to the analysis of ordered failure events because the individual is regarded as at risk of all repeated events from the outset.
This paper also has many strengths. First, we provide researchers with a comprehensive ‘toolkit’ for the analysis of multiple-failure survival datasets arising from linkage between cohort study datasets and routinely collected data on hospital admissions. We describe all stages of statistical analysis from the appropriate organisation of the dataset, to an understanding of the key properties of available analysis techniques and their implementation in Stata, through to a comparison of results from an indicative analysis of risk factors for hospital admission. This paper is a valuable resource which will enable researchers to apply complex multiple-failure survival analysis techniques in their own research. Second, our indicative analysis of the association between housing tenure and hospital admission used data from a well characterised cohort study of community-dwelling older men and women; data were collected by trained research doctors and nurses according to strict measurement protocols. We therefore have confidence in the broad conclusion that not owner-occupying one’s home, an indicator of socioeconomic disadvantage, is associated with increased risk of hospital admission and this is consistent with the wide evidence base for a social gradient in health [28, 29].
We recommend the Prentice, Williams and Peterson Total Time model for the analysis of multiple-failure survival datasets which detail hospital admissions among older people. This article serves as a toolkit to assist researchers in the appropriate analysis of multiple-failure survival datasets arising from data linkage between a cohort study and routinely collected data on hospital admissions.
Availability of supporting data
We welcome opportunities for collaboration. Enquiries should be directed to Professor Cyrus Cooper, Director of the MRC Lifecourse Epidemiology Unit and Hertfordshire Cohort Study Principal Investigator, University of Southampton (email@example.com).
This work was supported by the Medical Research Council [MC_UP_A620_1015, MC_UU_12011/2] and the University of Southampton, UK.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Office for National Statistics. Population ageing in the United Kingdom, its constituent countries and the European Union. London: Office for National Statistics. 2012. www.ons.gov.uk/ons/dcp171776_258607.pdf. Accessed 25 Nov. 2015.
2. House of Lords Select Committee on Public Service and Demographic Change. Ready for ageing. HL paper 140, Stationery Office. 2013.
3. Simmonds SJ, Syddall HE, Walsh B, Evandrou M, Dennison EM, Cooper C, et al. Understanding NHS hospital admissions in England: linkage of Hospital episode statistics to the Hertfordshire cohort study. Age Ageing. 2014;43:653–60.
4. Wadsworth M, Kuh D, Richards M, Hardy R. Cohort Profile: The 1946 National Birth Cohort (MRC National Survey of Health and Development). Int J Epidemiol. 2006;35:49–54.
5. Steptoe A, Breeze E, Banks J, Nazroo J. Cohort profile: the English longitudinal study of ageing. Int J Epidemiol. 2013;42:1640–8.
6. Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model. New York: Springer; 2000.
7. Castaneda J, Gerritse B. Appraisal of several methods to model time to multiple events per subject: modelling time to hospitalizations and death. Rev Colomb Estad. 2010;33:43–61.
8. Syddall HE, Sayer AA, Dennison EM, Martin HJ, Barker DJ, Cooper C. Cohort profile: the Hertfordshire cohort study. Int J Epidemiol. 2005;34:1234–42.
9. Lyon D, Lancaster GA, Taylor S, Dowrick C, Chellaswamy H. Predicting the likelihood of emergency admission to hospital of older people: development and validation of the Emergency Admission Risk Likelihood Index (EARLI). Fam Pract. 2007;24:158–67.
10. Bottle A, Aylin P, Majeed A. Identifying patients at high risk of emergency hospital admissions: a logistic regression analysis. J R Soc Med. 2006;99:406–14.
11. Billings J, Dixon J, Mijanovich T, Wennberg D. Case finding for patients at risk of readmission to hospital: development of algorithm to identify high risk patients. BMJ. 2006;333:327.
12. NHS Services Scotland. Scottish Patients at Risk of Readmission and Admission (SPARRA). A Report on the Development of SPARRA Version 3. 2011. www.isdscotland.org/Health-Topics/Health-and-Social-Community-Care/SPARRA/2012-02-09-SPARRA-Version-3.pdf. Accessed 25 Nov. 2015.
13. Hosmer D, Lemeshow S, May S. Applied Survival Analysis, Regression modelling of Time-to-Event Data. New York: Wiley; 2008.
14. Villegas R, Julià O, Ocaña J. Empirical study of correlated survival times for recurrent events with proportional hazards margins and the effect of correlation and censoring. BMC Med Res Methodol. 2013;13:95.
15. Hippisley-Cox J, Coupland C. Predicting risk of emergency admission to hospital using primary care data: derivation and validation of QAdmissions score. BMJ Open. 2013;3:e003482.
16. Pandeya N, Purdie DM, Green A, Williams G. Repeated occurrence of basal cell carcinoma of the skin and multifailure survival analysis: follow-up data from the Nambour Skin Cancer Prevention Trial. Am J Epidemiol. 2005;161:748–54.
17. Guo Z, Gill TM, Allore HG. Modeling repeated time-to-event health conditions with discontinuous risk intervals: an example of a longitudinal study of functional disability among older persons. Methods Inf Med. 2009;47:107–16.
18. Kennedy BS, Kasl SV, Vaccarino V. Repeated hospitalizations and self-rated health among the elderly: a multivariate failure time analysis. Am J Epidemiol. 2001;153:232–41.
19. Andersen PK, Gill RD. Cox’s Regression Model for Counting Processes: A Large Sample Study. Ann Stat. 1982;10:1100–20.
20. Prentice RL, Williams BJ, Peterson AV. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–9.
21. Cleves M. Analysis of multiple failure-time survival data. 2009. http://www.stata.com/support/faqs/statistics/multiple-failure-time-data/. Accessed 25 Nov. 2015.
22. StataCorp. Stata Statistical Software: Release 13. College Station: StataCorp LP; 2013.
23. Castro M, Carvalho M, Travassos C. Factors associated with readmission to a general hospital in Brazil. Cad Saúde Pública. 2005;21:1186–200.
24. Murphy TE, Han L, Allore HG, Peduzzi PN, Gill TM, Lin H. Treatment of death in the analysis of longitudinal studies of gerontological outcomes. J Gerontol A Biol Sci Med Sci. 2011;66:109–14.
25. Dignam JJ, Zhang Q, Kocherginsky M. The use and interpretation of competing risks regression models. Clin Cancer Res. 2012;18:2301–8.
26. Meira-Machado L, Uña-Álvarez J, Cadarso-Suárez C, Andersen PK. Multi-state models for the analysis of time-to-event data. Stat Methods Med Res. 2010;18:195–222.
27. Box-Steffensmeier JM, Zorn C. Duration Models for Repeated Events. J Polit. 2002;64:1069–94.
28. Marmot M. Fair Society, Healthy Lives: A Strategic Review of Health Inequalities in England Post-2010. London: Department of Health; 2010.
29. CSDH. Closing the gap in a generation: health equity through action on the social determinants of health. Final Report of the Commission on Social Determinants of Health. Geneva: World Health Organization; 2008.