Estimands to quantify prolonged hospital stay associated with nosocomial infections

Background: Length-of-stay evaluations are commonly used to determine the burden of nosocomial infections. However, there exist fundamentally different methods to quantify the prolonged length of stay associated with nosocomial infections. Previous methodological studies emphasized the need to account for the timing of infection in order to differentiate the length of stay before and after the infection.
Methods: We derive four different approaches in a simple multi-state framework, display their mathematical relationships in a multiplicative as well as additive way and apply them to a real cohort study (n = 756 German intensive-care unit patients, of whom 124 acquired a nosocomial infection).
Results: The first approach ignores the timing of infection and quantifies the difference in length of stay between eventually infected and eventually uninfected patients; it is 12.31 days in the real data. The second approach compares the average sojourn time with infection with the average sojourn time of being hypothetically uninfected; it is 2.12 days. The third one compares the average length of stay of a population in a world with nosocomial infections with that of a population in a hypothetical world without nosocomial infections; it is 0.35 days. Finally, approach four compares the mean residual length of stay between currently infected and uninfected patients on a daily basis; the difference is 1.77 days per infected patient.
Conclusions: The first approach should be avoided because it compares the eventually infected with the eventually uninfected and has no prospective interpretation. The other approaches differ in their interpretation but are suitable because they explicitly distinguish between the pre- and post-infection time.


Introduction
Length of stay (LOS) is one of the most important outcomes in clinical epidemiology since it is directly linked to patients' morbidity and economic costs [1]. It is easy to measure and often routinely collected in surveillance databases. During the stay in hospital, patients are at risk of acquiring nosocomial infections (NI), which are among the most common adverse events in hospitals. Many observational reports have studied the impact of NI on length of stay by using different statistical methods. When evaluating the prolonged LOS due to NI, the timing of NI plays an important role in distinguishing between pre-infection time and the consequence of NI. Several methodological papers showed the magnitude of the so-called time-dependent (also known as immortal-time) bias, which occurs if the timing of infection is not adequately addressed or is ignored in the analysis [2,3]. Multi-state models or time-dependent matching techniques account for the timing of NI to avoid the time-dependent bias [2-4]. However, there exist fundamentally different estimands to quantify this prolonged LOS associated with NI. In this article, we describe four different approaches and estimands in a simple multi-state framework [5], display their mathematical relationships in a multiplicative as well as additive way and apply them to a real cohort study.
*Correspondence: wolke@imbi.uni-freiburg.de, Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier 26, Freiburg, Germany

Methods
We consider a time-homogeneous multi-state model (Fig. 1) with the three states 0 = admission, 1 = nosocomial infection, 2 = discharge/death and assume constant hazard rates $\lambda_{01}$, $\lambda_{02}$ and $\lambda_{12}$ between the corresponding states. The hazards $\lambda_{ij}$ are interpreted as the daily risk of moving from state $i$ to state $j$.
The infection hazard $\lambda_{01}$, also denoted as the incidence density of NI, is estimated by dividing the number of NI events by the number of summed patient-days in state 0 [5]. Analogously, the event densities $\lambda_{02}$ and $\lambda_{12}$ are estimated by dividing the discharge/death events by the number of summed patient-days in state 0 and 1, respectively [5]. These estimates are formally obtained via maximum likelihood estimation [6]. Since the hazard rates are assumed to be time-constant, the time to leave state 0 follows an exponential distribution with the hazard rate $\lambda_{01} + \lambda_{02}$. Thus, the average sojourn time in state 0 is $1/(\lambda_{01}+\lambda_{02})$. Analogously, the time to leave state 1 follows an exponential distribution with the hazard rate $\lambda_{12}$, leading to an average sojourn time in state 1 of $1/\lambda_{12}$. The probability of acquiring an NI is equal to $\lambda_{01}/(\lambda_{01}+\lambda_{02})$. We can write the average LOS in terms of the hazards from the multi-state model, which is the sum of the sojourn time in state 0 and the sojourn time in state 1 multiplied by the probability of acquiring an NI:
$$\mathrm{LOS} = \frac{1}{\lambda_{01}+\lambda_{02}} + \frac{\lambda_{01}}{\lambda_{01}+\lambda_{02}} \cdot \frac{1}{\lambda_{12}} = \frac{\lambda_{12}+\lambda_{01}}{(\lambda_{01}+\lambda_{02})\,\lambda_{12}}.$$
If we assume the common case that NI reduce the discharge hazard, i.e. $\lambda_{12}/\lambda_{02} < 1$, we can use some algebra to derive the following relationship:
$$\frac{1}{\lambda_{02}} < \mathrm{LOS} < \frac{1}{\lambda_{12}},$$
where $1/\lambda_{02}$ is interpreted as the average LOS in a hypothetical world without NI. The first inequality is shown as
$$\frac{1}{\lambda_{02}} < \mathrm{LOS} \;\Leftrightarrow\; \lambda_{12}(\lambda_{01}+\lambda_{02}) < \lambda_{02}(\lambda_{12}+\lambda_{01}) \;\Leftrightarrow\; \lambda_{12}\lambda_{01} < \lambda_{02}\lambda_{01} \;\Leftrightarrow\; \lambda_{12} < \lambda_{02}.$$
The second inequality is shown as
$$\mathrm{LOS} = \frac{\lambda_{12}+\lambda_{01}}{(\lambda_{01}+\lambda_{02})\,\lambda_{12}} < \frac{\lambda_{02}+\lambda_{01}}{(\lambda_{01}+\lambda_{02})\,\lambda_{12}} = \frac{1}{\lambda_{12}}.$$
These inequalities mean that the mean LOS in a world without NI is smaller than the mean LOS in the real world, which in turn is smaller than the mean sojourn time in state 1. Based on the multi-state model, four different approaches to quantify the LOS associated with NI can be derived.
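The estimation rule and the LOS formula above can be sketched in a few lines of Python. This is only an illustration under the constant-hazard assumption: the function names `estimate_hazard` and `mean_los` and the event counts and patient-days are hypothetical and are not taken from the cohort study.

```python
def estimate_hazard(n_events: int, patient_days: float) -> float:
    """Constant-hazard estimate: number of events divided by patient-days at risk."""
    if patient_days <= 0:
        raise ValueError("patient_days must be positive")
    return n_events / patient_days


def mean_los(lam01: float, lam02: float, lam12: float) -> float:
    """Average LOS = sojourn time in state 0 + P(NI) * sojourn time in state 1."""
    sojourn0 = 1.0 / (lam01 + lam02)
    p_ni = lam01 / (lam01 + lam02)
    sojourn1 = 1.0 / lam12
    return sojourn0 + p_ni * sojourn1


# Hypothetical counts, chosen only for illustration (not the study data).
lam01 = estimate_hazard(20, 1000.0)   # NI events per patient-day in state 0
lam02 = estimate_hazard(100, 1000.0)  # discharge/death events per patient-day in state 0
lam12 = estimate_hazard(16, 200.0)    # discharge/death events per patient-day in state 1

los = mean_los(lam01, lam02, lam12)
closed_form = (lam12 + lam01) / ((lam01 + lam02) * lam12)

print(f"LOS without NI (1/lam02): {1 / lam02:.2f} days")
print(f"LOS in the real world:    {los:.2f} days")
print(f"Sojourn time in state 1:  {1 / lam12:.2f} days")
```

With these hypothetical hazards the hazard ratio is below 1, so the three printed values should appear in increasing order, in line with the inequalities above.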

Retrospective stratification of eventually infected and uninfected
The most common approach ($A_1$) is to compare the average overall LOS of eventually infected patients with the average overall LOS of eventually uninfected patients. It addresses the following medical question of interest (see Table 1): 'How many days do patients with NI stay, on average, eventually longer in hospital than patients who will never acquire a NI?'. In terms of the multi-state model, the average overall LOS of eventually infected patients is the sum of the average sojourn times in state 0 and state 1, $1/(\lambda_{01}+\lambda_{02}) + 1/\lambda_{12}$. The average overall LOS of eventually uninfected patients is the sojourn time in state 0, $1/(\lambda_{01}+\lambda_{02})$. Thus, approach $A_1$ is mathematically expressed by
$$A_1 = \underbrace{\left(\frac{1}{\lambda_{01}+\lambda_{02}} + \frac{1}{\lambda_{12}}\right) - \frac{1}{\lambda_{01}+\lambda_{02}}}_{\text{LOS difference of eventually infected and eventually uninfected}} = \frac{1}{\lambda_{12}},$$
which is the average sojourn time in state 1, given the patient has reached this state, i.e., has acquired a NI. In this approach, the classification of infected and uninfected is done retrospectively at the end of hospital stay.
This approach does not distinguish between pre- and post-infection LOS and thus does not allow a prospective associational (or even causal) interpretation. The limitation is that the LOS before NI, which is included in the LOS of eventually infected patients, cannot be interpreted as LOS attributable to NI. Instead, the LOS before NI (which is per se not attributable to NI) should count as uninfected LOS. Therefore, the following approaches have been developed.
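The retrospective stratification can be mimicked with a small simulation. The sketch below is illustrative only: it assumes the constant-hazard three-state model, uses hypothetical hazards (not the study estimates), and the helper `simulate_patient` is introduced purely for this example. The simulated difference between eventually infected and eventually uninfected patients should be close to $1/\lambda_{12}$.

```python
import random


def simulate_patient(lam01, lam02, lam12, rng):
    """Return (total LOS, eventually infected?) for one patient."""
    # Time to leave state 0 is exponential with rate lam01 + lam02.
    time_in_0 = rng.expovariate(lam01 + lam02)
    # The exit is an infection with probability lam01 / (lam01 + lam02).
    infected = rng.random() < lam01 / (lam01 + lam02)
    if infected:
        # After infection, time to discharge/death is exponential with rate lam12.
        return time_in_0 + rng.expovariate(lam12), True
    return time_in_0, False


rng = random.Random(1)                   # fixed seed for reproducibility
lam01, lam02, lam12 = 0.02, 0.10, 0.08   # hypothetical hazards per day

los_infected, los_uninfected = [], []
for _ in range(200_000):
    los, infected = simulate_patient(lam01, lam02, lam12, rng)
    (los_infected if infected else los_uninfected).append(los)

# Retrospective stratification: classify patients at the end of their stay.
a1_simulated = (sum(los_infected) / len(los_infected)
                - sum(los_uninfected) / len(los_uninfected))

print(f"simulated A1: {a1_simulated:.2f} days, 1/lam12: {1 / lam12:.2f} days")
```

Note that the simulated difference includes no information on $\lambda_{02}$: the pre-infection time cancels out, which is why the result cannot be read as the effect of acquiring an infection.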

Differentiating between pre- and post-infection length of stay
In contrast to the previous approach, the following approaches will differentiate between the pre-infection time and consequence of NI in terms of LOS.
The second approach $A_2$, termed attributable LOS [7,8], compares the average sojourn time in state 1, $1/\lambda_{12}$, with the average sojourn time in state 0 in a hypothetical world without NI, $1/\lambda_{02}$. The medical question of interest (see Table 1) is 'How many hospital days, on average, would a patient have stayed shorter if he/she would not have acquired a NI?'. This is quantified by
$$A_2 = \frac{1}{\lambda_{12}} - \frac{1}{\lambda_{02}}.$$
In contrast to approach $A_1$, the left part of the difference in approach $A_2$, $1/\lambda_{12}$, considers only the post-infection LOS of infected patients. Moreover, the right part, $1/\lambda_{02}$, considers a longer LOS than the one of approach $A_1$, as $1/\lambda_{02} > 1/(\lambda_{01}+\lambda_{02})$. This corrects for the limitations of approach $A_1$. However, the limitation of approach $A_2$ is that $1/\lambda_{02}$ is not a real-world mean time but a hypothetical quantity for the LOS of uninfected patients.
In the third approach $A_3$, we subtract the average LOS in a hypothetical world without NI from the one in the real world, addressing the medical question 'How many hospital days would the average length of stay be shorter if all NI in the population would be eliminated?'. Algebraically, it is
$$A_3 = \mathrm{LOS} - \frac{1}{\lambda_{02}} = \frac{\lambda_{01}}{\lambda_{01}+\lambda_{02}} \left(\frac{1}{\lambda_{12}} - \frac{1}{\lambda_{02}}\right).$$
This estimand is called the population-attributable LOS [8,9]. It is a population measure of extra LOS and compares the average LOS of a population in a world with NI with that of a population in a hypothetical world without NI.
In the fourth approach $A_4$, we subtract the average length of stay from the sojourn time in state 1, which aims to answer the medical question 'How many days does a patient with NI stay, on average, longer in hospital?' (see Table 1). It is
$$A_4 = \underbrace{\frac{1}{\lambda_{12}}}_{\text{currently infected}} - \underbrace{\mathrm{LOS}}_{\text{currently uninfected}} = \frac{\lambda_{02}-\lambda_{12}}{(\lambda_{01}+\lambda_{02})\,\lambda_{12}}.$$
This estimand, also called the change of length of stay, is the established multi-state approach [10]. It compares the mean residual LOS between currently infected and uninfected patients using landmarking on each day in the hospital; it is a difference per infected patient.
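For orientation, the four estimands can be computed side by side from the three hazards. The following is a minimal sketch under the constant-hazard model; the class `Estimands`, the function `estimands` and the hazard values are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimands:
    a1: float  # eventually infected vs eventually uninfected
    a2: float  # attributable LOS
    a3: float  # population-attributable LOS
    a4: float  # change of LOS (currently infected vs currently uninfected)


def estimands(lam01: float, lam02: float, lam12: float) -> Estimands:
    """Closed-form A1 to A4 for constant hazards (all hazards must be > 0)."""
    if min(lam01, lam02, lam12) <= 0:
        raise ValueError("all hazards must be positive")
    # Average LOS in the real world (with NI).
    los = (lam12 + lam01) / ((lam01 + lam02) * lam12)
    return Estimands(
        a1=1 / lam12,
        a2=1 / lam12 - 1 / lam02,
        a3=los - 1 / lam02,
        a4=1 / lam12 - los,
    )


# Hypothetical hazards per day, not the study estimates.
est = estimands(lam01=0.02, lam02=0.10, lam12=0.08)
for name in ("a1", "a2", "a3", "a4"):
    print(f"{name.upper()} = {getattr(est, name):.2f} days")
```

Even with identical hazards the four numbers differ by more than an order of magnitude, which illustrates why the choice of estimand matters when comparing published values.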

Basic properties related to the hazard ratio $\lambda_{12}/\lambda_{02}$
In this section we consider basic relationships to the hazard ratio $\lambda_{12}/\lambda_{02}$. The hazard ratio $\lambda_{12}/\lambda_{02}$ is often calculated, and it describes in a multiplicative way if and how NI prolongs LOS. A hazard ratio of 1 means that the daily chance of being discharged does not change if the patient acquires a NI, meaning that NI does not prolong the LOS. More often, the hazard ratio is smaller than 1, indicating a prolonged LOS associated with NI. Only rarely is the hazard ratio greater than 1, which would mean a shortened LOS associated with NI. It is obvious that approach $A_1$ is always larger than 0 ($A_1 > 0$) as $\lambda_{12} > 0$. Since it further does not depend on $\lambda_{02}$, $A_1$ always indicates that NI patients eventually stay longer than patients who never acquired NI, even if $\lambda_{12}/\lambda_{02} = 1$ or $\lambda_{12}/\lambda_{02} > 1$, which is not a desirable property. For the other approaches we have:
$$\frac{\lambda_{12}}{\lambda_{02}} < 1 \;\Leftrightarrow\; A_2 > 0 \;\Leftrightarrow\; A_3 > 0 \;\Leftrightarrow\; A_4 > 0.$$
It is also easily shown that $\lambda_{12}/\lambda_{02} = 1 \Leftrightarrow A_2 = A_3 = A_4 = 0$ and $\lambda_{12}/\lambda_{02} > 1 \Leftrightarrow A_2 < 0 \Leftrightarrow A_3 < 0 \Leftrightarrow A_4 < 0$. Thus, approaches $A_2$, $A_3$ and $A_4$ have the required mathematically equivalent properties regarding the direction of the hazard ratio $\lambda_{12}/\lambda_{02}$ whereas approach $A_1$ does not. In Table 2, the properties of all approaches are displayed, summarized and contrasted with each other.
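The behaviour with respect to the direction of the hazard ratio can be checked numerically. The sketch below assumes the constant-hazard model; the helpers `extra_los` and `sign` and all hazard values are hypothetical. A small tolerance is used because floating-point arithmetic does not return exactly zero at a hazard ratio of 1.

```python
def extra_los(lam01: float, lam02: float, lam12: float) -> dict:
    """Closed-form A1 to A4 for constant hazards (all hazards must be > 0)."""
    los = (lam12 + lam01) / ((lam01 + lam02) * lam12)
    return {
        "A1": 1 / lam12,
        "A2": 1 / lam12 - 1 / lam02,
        "A3": los - 1 / lam02,
        "A4": 1 / lam12 - los,
    }


def sign(x: float, tol: float = 1e-9) -> int:
    """Return -1, 0 or +1, treating values within tol of zero as zero."""
    return 0 if abs(x) < tol else (1 if x > 0 else -1)


lam01, lam02 = 0.02, 0.10  # hypothetical hazards per day
signs = {}
for label, lam12 in (("HR<1", 0.05), ("HR=1", 0.10), ("HR>1", 0.20)):
    values = extra_los(lam01, lam02, lam12)
    signs[label] = {name: sign(value) for name, value in values.items()}
    print(label, signs[label])
```

In this sketch only $A_1$ keeps a positive sign in all three scenarios, whereas $A_2$, $A_3$ and $A_4$ change sign together with the direction of the hazard ratio.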

Additive and multiplicative comparisons of approaches
Before we compare the approaches in an additive and multiplicative way, we note that approaches $A_1$ and $A_2$ do not depend on the infection hazard $\lambda_{01}$ whereas approaches $A_3$ and $A_4$ do. Further, approaches $A_1$, $A_2$ and $A_4$ are at the patient-individual level and therefore directly comparable whereas $A_3$ is at the population level. All approaches are displayed in Table 3. We further note that there is also the following relationship: $A_2 = A_3 + A_4$.

[Table 3, panel "Additive relationships between approaches (differences)": only the entry 10.54 days is recoverable.]

Comparing approaches $A_1$ and $A_2$
The additive relationship between approaches $A_1$ and $A_2$ is just the average length of stay of a population in a hypothetical world without NI, $A_1 - A_2 = 1/\lambda_{02}$. The multiplicative relationship is $A_1/A_2 = \lambda_{02}/(\lambda_{02}-\lambda_{12})$.

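Comparing approaches $A_2$ and $A_3$
Approaches $A_3$ and $A_2$ are best compared in a multiplicative way: $A_3/A_2 = \lambda_{01}/(\lambda_{01}+\lambda_{02})$, which is the probability of acquiring a NI.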

Comparing approaches $A_3$ and $A_4$
In contrast to the previous comparisons to approach $A_1$, approaches $A_3$ and $A_4$ are best compared in a multiplicative way. There is the following simple relationship: $A_3/A_4 = \lambda_{01}/\lambda_{02} = \mathrm{odds(NI)}$. Thus, the factor odds(NI) links the population-level approach $A_3$ to the individual-level approach $A_4$. The additive relationship is rather complex: $A_4 - A_3 = \frac{(\lambda_{02}-\lambda_{01})(\lambda_{02}-\lambda_{12})}{\lambda_{02}\,\lambda_{12}\,(\lambda_{01}+\lambda_{02})}$.
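These relationships allow a value published for one approach to be transferred to another. As a rough sketch, the attributable LOS of 2.12 days reported in the abstract can be split into $A_3$ and $A_4$ by using the multiplicative links $A_3 = P(\mathrm{NI}) \cdot A_2$ and $A_4 = (1 - P(\mathrm{NI})) \cdot A_2$. Taking the crude proportion 124/756 as $P(\mathrm{NI})$ is an assumption made here for illustration; the model-based probability is $\lambda_{01}/(\lambda_{01}+\lambda_{02})$ and may differ, for example under censoring. The function name `split_attributable_los` is hypothetical.

```python
from typing import Tuple


def split_attributable_los(a2: float, p_ni: float) -> Tuple[float, float]:
    """Return (A3, A4) from A2 via A3 = P(NI) * A2 and A4 = (1 - P(NI)) * A2."""
    if not 0.0 <= p_ni < 1.0:
        raise ValueError("p_ni must lie in [0, 1)")
    return p_ni * a2, (1.0 - p_ni) * a2


a2_reported = 2.12   # attributable LOS in days, as reported in the abstract
p_ni = 124 / 756     # crude proportion of infected patients (assumption)

a3, a4 = split_attributable_los(a2_reported, p_ni)
odds_ni = p_ni / (1.0 - p_ni)

print(f"A3 = {a3:.2f} days, A4 = {a4:.2f} days")
print(f"A3 / A4 = {a3 / a4:.3f}, odds(NI) = {odds_ni:.3f}")
```

Under this assumption the derived values round to the 0.35 and 1.77 days reported for $A_3$ and $A_4$, and their ratio equals the odds of infection.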

Real data example
We use publicly available data from the R-package etm [11]. The data stem from a prospectively collected observational cohort study including 756 intensive care patients from Germany, of whom 124 acquired a nosocomial pneumonia (NI) during their stay in the intensive care unit (ICU). The data used here are a random sample from a larger cohort, which is described in detail elsewhere [12].

Results
The

Discussion
A multi-state model was used to mathematically derive four fundamentally different approaches to quantify the prolonged length of stay associated with nosocomial infections or other adverse events [13]. The relationships were displayed in an additive as well as a multiplicative way.
As in previous articles [2,4], we encourage researchers not to stratify retrospectively by infection status and, consequently, to avoid the use of approach $A_1$ because it does not differentiate between pre- and post-infection time and thus does not allow a causal interpretation.
The other approaches are suitable because they explicitly distinguish between the pre-infection time (which might be a risk factor) and post-infection time (which might be a consequence) of nosocomial infections. The main difference lies in their interpretation, and we provided mathematical formulas showing how they are linked to each other. This knowledge can be used to better understand apparent discrepancies in the literature and to transfer published values from one approach to the other.
The question whether nosocomial infections prolong hospital stay is, from the methodological point of view, related to 'life years lost among patients with a given disease' [14] by replacing discharge with death and length of stay with age. Andersen [14] also considered a multi-state model, the classical illness-death model, in order to study different statistical variants and extensions of our approach $A_4$, including time-inhomogeneous Markov models, censoring and semi-Markov models. Approaches $A_1$ to $A_3$ are not considered by Andersen and complement his considerations.
This study has the following limitations. First, we focused on the basic approaches and did not consider any other covariates such as characteristics at the patient, hospital or even country level (for instance, as in Stewardson et al. [15]). Even though there exist regression models [16] that allow for adjusting the change of length of stay (approach $A_4$), we believe that the choice of the fundamental approach has a much stronger impact on the results than the adjustment for covariates. For instance, previous studies indicated that the time-dependent bias could not be remedied by adjustment for several patient-level covariates [4]. Second, the hazard rates are often not time-homogeneous in real-data settings. Even though time-inhomogeneous approaches exist, we are convinced that this simplification is required to provide transparency, which leads to a better understanding of basic distinctions. Third, we combined the diametrically opposed endpoints discharge (alive) and death. We think that this combination is reasonable if the focus is on length of stay and its related economic costs, the topic of this paper. However, as rapid death results in shorter LOS, a length-of-stay analysis should always be accompanied by an analysis with respect to mortality. This can be done by using an extended multi-state model that distinguishes between inpatient death and discharge alive [5,6].

Conclusion
We conclude that a clear distinction between different estimands is needed to better understand apparently large discrepancies in the literature. We recommend the use of approaches that differentiate between pre- and post-infection time.