This article has Open Peer Review reports available.

# Empirical study of correlated survival times for recurrent events with proportional hazards margins and the effect of correlation and censoring

- Rodrigo Villegas
^{1}Email author, - Olga Julià
^{2}and - Jordi Ocaña
^{3}

**13**:95

https://doi.org/10.1186/1471-2288-13-95

© Villegas et al.; licensee BioMed Central Ltd. 2013

**Received: **25 June 2012

**Accepted: **29 May 2013

**Published: **24 July 2013

## Abstract

### Background

In longitudinal studies where subjects experience recurrent incidents over a period of time, such as respiratory infections, fever or diarrhea, statistical methods are required to take into account the within-subject correlation.

### Methods

For repeated events data with censored failure, the independent increment (AG), marginal (WLW) and conditional (PWP) models are three multiple failure models that generalize Cox’s proportional hazard model. In this paper, we revise the efficiency, accuracy and robustness of all three models under simulated scenarios with varying degrees of within-subject correlation, censoring levels, maximum number of possible recurrences and sample size. We also study the methods performance on a real dataset from a cohort study with bronchial obstruction.

### Results

We find substantial differences between methods and there is not an optimal method. AG and PWP seem to be preferable to WLW for low correlation levels but the situation reverts for high correlations.

### Conclusions

All methods are stable in front of censoring, worsen with increasing recurrence levels and share a bias problem which, among other consequences, makes asymptotic normal confidence intervals not fully reliable, although they are well developed theoretically.

## Keywords

## Background

Cox’s proportional hazards model [1] is the most commonly used model for clinical trial data and provides reliable estimates of survival times, as well as the relative risk associated with time-to-event occurrence. As a semi-parametric model, it does not have any constraints on distributional assumptions, which makes it an attractive alternative to parametric models. However, it is not suitable for recurrent event data because survival times in the standard model terminate at the time of the event. In addition, trying to use multiple failure times is inappropriate since events within individuals may be correlated. In this case, the assumption of independence in the standard Cox regression model is violated, introducing statistical complications. To avoid the errors resulting from analyzing correlated repeated events, sometimes repeated events are disregarded and time to first event is only used, which is clearly a suboptimal approach. Multiple-event analytical techniques for survival data have been developed during the last decades [2] but, as a result of their complex structure and computational requirements, they have not been commonly applied. Advancement in statistical software in recent years has made these methods more accessible to researchers. As a consequence, their use has gradually increased in areas such as clinical research. Text books like Therneau and Grambsch, [2], describe these models in detail with examples, while research articles, such as that of Kelly and Lim [3], provide a very comprehensive explanation of their application to multiple-failure data, with extensive discussion about assumptions. There are three commonly used statistical methods for the analysis of recurrent events: Anderson-Gill model [4], Prentice-William-Peterson model [5] and Wei-Lin and Weissfeld model [6]. However, the benefits and drawbacks of each method are still unclear to the medical researchers, specially under varying strengths of correlation between survival times. The aim of this work is to study, through simulations and a real data set, the differences between the three models under presence of low to moderate correlation in survival times. With respect to similar studies like Kelly and Lim [3], we very approximately control the within individual correlation between consecutive recurrent events and we allow decreasing correlations as long as distance between recurrent events increases. In addition our simulation design takes care of different censoring levels and different numbers of recurrences, which are not fixed at 4 events as in [3]. We briefly describe the models, discuss in more detail the simulation procedure, and discuss the results of fitting the simulated data for the different models. In the Example section, we provide an illustration of the models performance using real data from a study in infants with respiratory illness recurrences.

## Methods

### Basic notations

*i*

^{ t h }individual by

where *t* represents time, *Z*
_{
i
} is the vector of covariates of the *i*
^{
t
h
} individual, *β* is a vector of regression coefficients and *h*
_{0}(*t*) is the so-called baseline hazard function that corresponds to the hazard function for an individual with covariables set to zero. As model (1) is formulated through the hazard function, the simulation of appropriate survival times for this model is not straightforward, the effect of the covariates must be translated from the hazards to the survival time. We need to simulate the survival times since software packages for Cox models require the individual survival time data, not the hazard function. Suppose that there are *n* individuals and that each individual can experience *K* failures. Times of failure are subject to right censoring and we consider that censoring is non informative, which means that knowledge of a censoring time for an individual or subject provides no further information about the individual’s probability of survival at a future time. Let *T*
_{
i
j
} be the time when the *j*
^{
t
h
} failure occurs for the *i*
^{
t
h
} subject, measured from the subject’s study enrollment, and *C*
_{
i
} be the corresponding censoring time for individual *i*. Since the *j*
^{
t
h
} failure time *T*
_{
i
j
} is affected by right censoring we define *X*
_{
i
j
} as the minimum of (*T*
_{
i
j
},*C*
_{
i
}), i.e., *X*
_{
i
j
} equals *T*
_{
i
j
} if the event was observed, and *C*
_{
i
} if it is censored. Let *δ*
_{
i
j
}=*I*(*T*
_{
i
j
}≤*C*
_{
i
}) be the indicator function taking the value 1 when *T*
_{
i
j
}≤*C*
_{
i
} and 0 otherwise. For simplicity, in our simulations all censoring times are taken as constant, i.e. *C*
_{
i
}=*C* for all *i*.

### Overview of the multiple failure time models

For multiple events the common approaches that generalize the Cox’s framework are the independent increment (AG), marginal (WLW) and conditional (PWP) models. These three models differ essentially in the risk sets.

#### Andersen and Gill (AG) independent increment model

*i*

^{ t h }subject is assumed to follow a proportional hazard model where the hazard function is

Under this model, the risk of a recurrent event for a subject follows the usual proportional hazards assumption but the rank of recurrence is not taken into account. By defining the risk set indicator *Y*
_{
i
j
}(*t*)=*I*(*X*
_{
i,j−1}<*t*<*X*
_{
i
j
}), the risk set at time *t*, represented by $\sum _{\mathit{\text{ij}}}{Y}_{\mathit{\text{ij}}}\left(t\right)$, includes all subjects under observation regardless of the number of occurrences experienced by each subject. As *j* increases, fewer individuals are likely to experience the event, but this fact does not affect the risk set due to the fact that it does not depend on *j*; so the coefficients estimates are stable. The AG model provides more efficient inference for a covariate effect than the plain Cox model for the time to the first event, but requires much stronger assumptions than the other models, such as independence among recurrent event times or common baseline hazards. Also, a robust “sandwich” method can be used in the estimation of standard errors [7].

#### Prentice, Williams and Peterson (PWP) conditional model

*j*

^{ t h }event until he/she has experienced event

*j*−1. This model requires the same assumptions as the Cox model, but, unlike the AG model, it allows the baseline hazard to vary from recurrence to recurrence, the hazard function for the

*j*

^{ t h }event for the

*i*

^{ t h }subject is:

The risk set indicator is the same as in the AG model, *Y*
_{
i
j
}(*t*)=*I*(*X*
_{
i,j−1}<*t*<*X*
_{
i
j
}), but the risk set at time *t* is different for each *j*: $\sum _{i,}{Y}_{\mathit{\text{ij}}}\left(t\right)$. According to the definition of the risk set, the number of subjects is dramatically decreased as *j* increases. Stable coefficient estimates cannot be obtained for higher ranks of *j*. The hazard function at time *t* for the *j*
^{
t
h
} recurrence is conditional to the entire previous failures. This model allows different baseline hazards, therefore, estimations for the current event may be affected by earlier events.

#### Wei, Lin, and Weissfeld (WLW) marginal model

*j*, the distribution of time from the study enrollment to the

*j*

^{ t h }recurrent event according to a Cox model, leaving the dependence structure between the observations within the same subject completely unspecified. For the WLW model the risk set indicator is

*Y*

_{ i j }(

*t*)=

*I*(

*X*

_{ i,j }≥

*t*), and the risk set at time

*t*for the

*j*

^{ t h }recurrence, represented by $\sum _{i,}{Y}_{\mathit{\text{ij}}}\left(t\right)$, includes anyone who has not experienced the

*j*

^{ t h }recurrence at time

*t*. In other words, unlike the PWP model, a subject in the risk set at time

*t*does not necessarily have to experience the (

*j*−1)

^{ t h }recurrence. The hazard function of the marginal model for the

*j*

^{ t h }event for the

*i*

^{ t h }subject is

The WLW approach obtains the estimate of the covariate effects from the partial likelihood function under the working assumption of independence and then adjusts the variance estimate empirically. Although observations on each subjects are correlated, the *β* estimation has been shown to be consistent in the case of two events [8]. As the inverse of the information matrix provides a naive variance-covariance estimation, not correctly estimating the true variance, a robust “sandwich” variance-covariance estimation method is preferred.

In all these models we can add a frailty, that is, a random individual effect to explain the dependence of individual recurrent events. In our simulation study we try to precisely control the degree of correlation between adjacent recurrent event times. Frailty makes difficult such a control, so we have not considered it.

### Simulation

*n*=(50,100,200,400). Survival times were censored at fixed time determined to yield censoring percentages

*p*of 0%, 15%, 30% and 50%. In other words, previously fixed censoring times

*C*

_{ p }= (

*C*

_{0}= +

*∞*,

*C*

_{15},

*C*

_{30},

*C*

_{50}) were established in order to ensure the preceding proportions of subjects experiencing a censure. The maximum number of recurrent events under consideration were

*K*=(3,6,9,12). For subject

*i*, the recurrence times

*t*

_{ i1},

*t*

_{ i2}, …,

*t*

_{ i K }were generated as correlated Weibull deviates as is explained below. Censoring occurred in subject

*i*if

*t*

_{ i1}+

*t*

_{ i2}+…+

*t*

_{ i K }>

*C*

_{ p }. In this case he/she experienced

*j*events,

*j*<

*K*, if

*T*

_{ i j }=

*t*

_{ i1}+…+

*t*

_{ i j }≤

*C*

_{ p }and

*t*

_{ i1}+…+

*t*

_{ i(j+1)}>

*C*

_{ p }. Note that not always censoring occurred only in the

*K*

^{ t h }event. The correlation levels between adjacent recurrence times were set at

*ρ*=(0,0.10,0.45,0.80). The 320 simulated scenarios were defined by crossing all levels of these four factors. Each simulation series consisted of the generation of

*m*= 10000 random samples, each one representing a survival dataset where half of individuals were considered as “control” and half of individuals were considered as “treatment”. This covariate was represented by a binary variable with values 0 and 1. The treatment effect was set at

*β*=−1. For each simulated dataset we adjusted the WLW, AG and PWP models. We used the following measures of the accuracy of the regression parameter estimates and the gain of robustness from the robust approaches: the true coverage of the 95% confidence interval for the parameter

*β*, the relative bias of the

*β*estimators under the three models and the bias of the “robust” and “naive” variance estimators for all three models. The relative sampling bias is defined as the average bias from the

*m*random samples. That is, if ${\widehat{\beta}}_{i}$ is the estimate on

*i*

^{ t h }simulated dataset, then

*U*be a random variable with uniform distribution. We can generate a new random variable

*Y*with distribution function

*F*by using the inverse (or pseudo inverse) function

*F*

^{−1}[10] as follows

*c*and

*κ*, that is, with distribution function

*F*(

*y*)=1−

*e*

*x*

*p*(−

*c*

*y*

^{ κ }), for

*c*>0,

*κ*>0. Note that, usually, the Weibull parameters are

*λ*and

*κ*with

*c*=

*λ*

^{ κ }. Therefore, given $U\backsim $ Uniform [0,1],

has the aforementioned Weibull distribution. The hazard function corresponding to *Y* is given by *h*(*y*)=*c* *κ* *y*
^{
κ−1}. Under the Cox model and for a binary covariate *X*, the parameter *c* would be equal to some base value *c*
_{0} when *X*=0 and would become equal to *c*
_{1}=*c*
_{0} exp(*β*) when *X*=1.

#### Simulation of within-subject dependence

*ρ*between successive recurrence events in the same individual and with decreasing correlation as the distance between events increases. To incorporate this dependence into our simulated data while still maintaining the marginal Weibull distribution for each observation we will use a transformation of a convenient multivariate normal distribution. We describe this procedure in three steps:

- 1.
For any correlation

*ρ*, choose an appropriate*ρ*_{0}following formula (9) derived in the next section. For each subject,*i*,*i*=1,…,*n*, generate a random variate*Z*_{ i1}from a*N*(0,1) distribution. Given*z*_{ i1}, generate a random variate*z*_{ i2}from a normal distribution with mean*ρ*_{0}*z*_{ i1}and variance $1-{\rho}_{0}^{2}$. Given*z*_{ i2}, generate a random variate*z*_{ i3}from a $N({\rho}_{0}{z}_{i2},1-{\rho}_{0}^{2})$; and so on, until*z*_{ i K }. This ensures that each consecutive pair (*z*_{ i j },*z*_{ i,j+1}) is a realization of (*Z*_{ i j },*Z*_{ i,j+1}) following a bivariate normal distribution with standard marginals,*N*(0,1) and correlation*ρ*_{0}. Also the correlation between*Z*_{ i j }and*Z*_{ i,j+τ }decreases as*τ*increases. - 2.
Transform

*Z*_{ i j }into uniform random variables using the standard normal cumulative distribution function*Φ*:*U*_{ i j }=*Φ*(*Z*_{ i j }). Variables*U*_{ i j }have a similar correlation structure. - 3.
Finally, transform the variables

*U*_{ i j }into dependent Weibulls using Equation (2). These Weibulls have again a similar correlation structure, with correlation*ρ*between adjacent recurrence times.

#### Computation of the correlation *ρ*
_{0}

*Z*

_{ i j }, the simulated log failure times may be expressed as

*g*

_{ i j }=

*G*(

*Z*

_{ i j }), where

*g*

_{ i j }and

*g*

_{ i,j+1}about

*z*=0, it can be shown (see Appendix) that

*C*

*o*

*r*

*r*(

*g*

_{ i j },

*g*

_{ i,j+1}) is a weighted average of

*ρ*

_{0}and ${\rho}_{0}^{2}$; namely,

*w*does not depend on the parameters

*κ*or

*c*; specifically,

*ρ*

_{0}such that

*C*

*o*

*r*

*r*(

*g*

_{ i j },

*g*

_{ i,j+1})=

*ρ*, the solution is

#### Software simulation

All the simulations were performed using the library *survival* and the functions *qnorm* and *runif* from the statistical software R2.15.2, 64 bit version, [11]. The default generator is based on a Mersenne-Twister algorithm.

## Results

**Relative bias, naive variance, robust variance and coverage robust for 15% censoring and sample size of 200**

| Events | Model | Relative bias | Naive var | Robust var | Coverage |
---|---|---|---|---|---|---|

0.1 | 3 | AG | -0.193 | 0.008 | 0.006 | 0.308 |

PWP | -0.030 | 0.010 | 0.011 | 0.930 | ||

WLW | 0.320 | 0.009 | 0.021 | 0.415 | ||

6 | AG | -0.254 | 0.004 | 0.003 | 0.003 | |

PWP | -0.042 | 0.006 | 0.007 | 0.897 | ||

WLW | 0.666 | 0.005 | 0.021 | 0.004 | ||

9 | AG | -0.278 | 0.003 | 0.002 | 0.000 | |

PWP | -0.048 | 0.005 | 0.006 | 0.871 | ||

WLW | 0.929 | 0.004 | 0.022 | 0.000 | ||

12 | AG | -0.291 | 0.002 | 0.001 | 0.000 | |

PWP | -0.050 | 0.004 | 0.005 | 0.861 | ||

WLW | 1.151 | 0.003 | 0.022 | 0.000 | ||

0.45 | 3 | AG | -0.296 | 0.008 | 0.007 | 0.069 |

PWP | -0.201 | 0.009 | 0.011 | 0.491 | ||

WLW | 0.166 | 0.009 | 0.022 | 0.806 | ||

6 | AG | -0.343 | 0.004 | 0.004 | 0.001 | |

PWP | -0.291 | 0.005 | 0.006 | 0.055 | ||

WLW | 0.358 | 0.005 | 0.022 | 0.318 | ||

9 | AG | -0.350 | 0.003 | 0.003 | 0.000 | |

PWP | -0.324 | 0.004 | 0.004 | 0.008 | ||

WLW | 0.514 | 0.003 | 0.021 | 0.058 | ||

12 | AG | -0.349 | 0.002 | 0.002 | 0.000 | |

PWP | -0.338 | 0.003 | 0.003 | 0.002 | ||

WLW | 0.657 | 0.003 | 0.022 | 0.005 | ||

0.8 | 3 | AG | -0.381 | 0.008 | 0.008 | 0.016 |

PWP | -0.339 | 0.009 | 0.011 | 0.124 | ||

WLW | 0.051 | 0.009 | 0.024 | 0.936 | ||

6 | AG | -0.448 | 0.004 | 0.005 | 0.000 | |

PWP | -0.509 | 0.005 | 0.005 | 0.000 | ||

WLW | 0.110 | 0.004 | 0.023 | 0.892 | ||

9 | AG | -0.462 | 0.003 | 0.004 | 0.000 | |

PWP | -0.574 | 0.003 | 0.003 | 0.000 | ||

WLW | 0.157 | 0.003 | 0.023 | 0.829 | ||

12 | AG | -0.462 | 0.002 | 0.004 | 0.000 | |

PWP | -0.605 | 0.002 | 0.003 | 0.000 | ||

WLW | 0.205 | 0.002 | 0.022 | 0.727 |

If only the first event is considered, the relative bias, efficiency and coverage of the standard Cox proportional hazard estimator are adequate, as expected, in all the investigated scenarios. We do not report these results because our interest is focused in recurrent data.

Censoring has a very small influence on the performance of the treatment effect estimators, $\widehat{\beta}$, for all models under consideration. The graphics are nearly identical for all levels of censoring so only 15% level is reported.

*ρ*=0.1 and

*ρ*=0.8) and each row to a maximum number of possible recurrences (

*K*=3 and

*K*=9).

With respect to relative bias, the AG estimator of *β* is stable for all different sample sizes. The bias grows with the correlation and the number of possible recurrences. The treatment effect is always underestimated. In all circumstances the naive variance estimator is worse than the robust estimator, but both perform very similarly and adequately. The true coverage of the confidence intervals is in general low and decreases with correlation, number of events and, surprisingly, with sample size.

The estimator of *β* based on model PWP is nearly unbiased for low correlations and stable with respect to the number of recurrences. When the correlation is high, it underestimates the treatment effect *β*. This bias does not improve with growing sample sizes. Again, the robust variance is better than the naive variance, but both are very similar and perform adequately, improving with growing sample sizes and correlations. For low correlation levels its coverage is the best, but it quickly falls with increasing correlations and becomes sensitive to growing sample sizes, in the same surprising direction as the AG method.

Under the WLW model *β* is overestimated. This bias increases with growing recurrence but improves with growing correlation. The naive variance estimator under WLW is very biased, its use should be avoided. The robust variance estimator is clearly better but slightly worse than the other robust estimators for sample sizes below 200. Under low correlation, the confidence interval performs very inadequately, with very low coverages which decrease with growing sample sizes and with growing number of recurrences. On the other hand, its coverage greatly improves with growing correlations, though a tendency to make worse with growing sample sizes and recurrence levels still persists.

### Example

**Number of episodes of illness**

Number of events | ||||||||
---|---|---|---|---|---|---|---|---|

0 | 1 | 2 | 3 | 4 | 5 | 6-10 | Total | |

N. of subjects (%) | 260 (54.7) | 101 (21.3) | 53 (11.2) | 21 (4.4) | 21 (4.4) | 10 (2.1) | 9 (1.9) | 475 (100) |

*β*values estimated by the AG and PWP models are lower than those estimated by the WLW model for both covariates, smoking and asthma. But all of them show a negative effect in the outcome. When we analyze the standard errors, the robust standard errors are larger in all models than those estimated by the naive method. And again, the WLW model has the biggest standard error in comparison to the other models. These results are coherent with the previous simulation results. Note that the WLW is the only model where asthma is a statistically significant.

**Coefficients and standard errors obtained in two univariate analysis for 2 covariates: parents reported smoking and family history of asthma**

Model | Covariate | Beta | Naive S.E. | Robust S.E. | Robust 95% CI | |
---|---|---|---|---|---|---|

AG | Smoking | 0.158 | 0.059 | 0.089 | -0.016, 0.332 | |

Asthma | 0.276 | 0.111 | 0.155 | -0.028, 0.579 | ||

PWP | Smoking | 0.147 | 0.060 | 0.088 | -0.025, 0.319 | |

Asthma | 0.251 | 0.113 | 0.148 | -0.039, 0.540 | ||

WLW | Smoking | 0.162 | 0.058 | 0.102 | -0.037, 0.362 | |

Asthma | 0.407 | 0.111 | 0.189 | 0.036, 0.778 |

## Discussion

The simulation results show that each model has different degrees of robustness against variations in the number of recurrent events and correlation. In terms of relative bias, the WLW model shows the worst performance for low correlations but greatly improves for high correlation levels, independently of censoring and sample size. The situation is reversed for the AG and PWP models, which perform better than WLW for low correlation levels. However, these models clearly underestimate the parameter. As some authors have suggested [3], the WLW model overestimates regression coefficients due to a carry-over effect. This is corroborated by our simulation results. Additionally, as we can see from Table 3 in the example, the WLW model presents the highest coefficients and robust standard errors.

The most surprising results refer to the unexpected behavior of all methods with growing sample sizes. First, their bias remains nearly constant. This was so surprising that we repeated the computations with Stata, with identical results to those obtained with the “coxph” R function. Provided that (as is expected) the true variance of all estimators decreases and its robust estimation greatly improves with growing sample sizes (Figure 2), as a consequence the confidence intervals become more and more narrow around a biased value, which produces a disconcerting, at first sight, fall of coverage (Figure 3).

The AG, PWP and WLW models have been previously compared by various authors (e.g., [3, 7, 13, 14]) using real and simulated data, showing that these models often yield different results for the same data set. This is not unexpected, since distinct models address different research questions. For example, in [3] they found that the WLW model overestimates the treatment effect. Also they show that as the correlation increases the coverage of the 95% confidence interval based on the robust variance decreases, except for WLW.

## Conclusion

In this paper we reviewed and compared the principal methods in the analysis of recurrent event data in terms of censoring and within-subject correlation. The example illustrates the differences between the methods and resulting parameter estimators.

No method can be recommended as the best in all circumstances. The AG and PWP approaches are quite robust in front of low levels of within-subject correlation, but the WLW approach should be recommended under the suspicion of high correlation. All methods share a bias problem which makes the confidence intervals based on asymptotic normal theory not fully accurate. On the other hand, the robust estimation of the variance of the treatment effect estimator is very reliable, and should be recommended instead of the naive variance estimation, especially for the WLW method.

## Appendix

*ρ*(

*Z*

_{ i j },

*Z*

_{ i k })=

*ρ*

_{0}>0 and

*ρ*(

*g*

_{ i j },

*g*

_{ i,j+1})=

*ρ*>0 with

*g*

_{ i j }=

*G*(

*Z*

_{ i j }) where

*G*is defined in (3). To find the correlation between

*g*

_{ i j }and

*g*

_{ i,j+1}we will use a second order Taylor expansion (delta method, [15]) for the function

*G*about

*z*=0 as follows:

*Z*

_{ i j }∼

*N*(0,1), we have

*E*(

*Z*

_{ i j })=0 and $\mathit{\text{Var}}\left({Z}_{\mathit{\text{ij}}}\right)=E\left({Z}_{\mathit{\text{ij}}}^{2}\right)=1$, then,

*E*(·) operator taking into account that

*E*[

*Z*

_{ i j }

*Z*

_{ i k }]=

*ρ*

_{0}and $E\left[{Z}_{\mathit{\text{ij}}}\right({Z}_{\mathit{\text{ik}}}^{2}-1\left)\right]=E\left[{Z}_{\mathit{\text{ik}}}{Z}_{\mathit{\text{ij}}}^{2}\right]=0$, we have:

*V*

*a*

*r*(

*g*

_{ i j }) we proceed in the same way as before,

*g*

_{ i j }and

*g*

_{ i,j+1}is

*w*, we use that:

Note that *w* does not depend neither on *k* nor on *c*.

## Declarations

### Acknowledgements

This work was partially supported by Grant MTM2008-06747-C02-01 and MTM08-00642/CUA from the Ministerio de Educación y Ciencia. The authors thank the GRASS group (http://grass.upc.edu) for helpful discussions. We thank the referees for their accurate, rigorous and constructive comments.

## Authors’ Affiliations

## References

- Cox DR: Regression models and life-tables (with discussion). J R Stat Soc Ser B. 1972, 34 (2): 187-220.Google Scholar
- Therneau TM, Grambsch PM: Modeling Survival Data: Extending the Cox Model. 2000, New York: SpringerView ArticleGoogle Scholar
- Kelly PJ, Lim LL: Survival analysis for recurrent event data: an application to childhood infectious diseases. Stat Med. 2000, 19: 13-33. 10.1002/(SICI)1097-0258(20000115)19:1<13::AID-SIM279>3.0.CO;2-5.View ArticlePubMedGoogle Scholar
- Anderson PK, Gill RD: Cox’s regression model for counting processes: a large sample study. Ann Stat. 1982, 10: 1100-1120. 10.1214/aos/1176345976.View ArticleGoogle Scholar
- Prentice RL, Williams BJ, Peterson AV: On the regression analysis of multivariate failure time data. Biometrika. 1981, 68: 373-379. 10.1093/biomet/68.2.373.View ArticleGoogle Scholar
- Wei LJ, Lin DY, Weissfeld L: Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc. 1989, 84 (408): 1065-1073. 10.1080/01621459.1989.10478873.View ArticleGoogle Scholar
- Therneau TM, Hamilton SA: rhDNase as an example of recurrent event analysis. Stat Med. 1997, 16: 2029-2047. 10.1002/(SICI)1097-0258(19970930)16:18<2029::AID-SIM637>3.0.CO;2-H.View ArticlePubMedGoogle Scholar
- Lin DY: Cox regression analysis of multivariate failure time data. Stat Med. 1994, 13: 2233-2247. 10.1002/sim.4780132105.View ArticlePubMedGoogle Scholar
- Feiveson AH: Power by simulation. Stata J. 2002, 2: 107-124.Google Scholar
- Hoel P, Port S, Stone C: Introduction to Probability Theory. 1971, Boston: Houghton Mifflin CompanyGoogle Scholar
- R Core Team: R: A Language and Environment for Statistical Computing. 2013, Vienna: R Foundation for Statistical Computing, [http://www.R-project.org/]. [ISBN 3-900051-07-0]Google Scholar
- Pino P, Walter T, Oyarzun M, Villegas R, Romieu I: Fine particulate matter and wheezing illnesses in the first year of life. Epidemiology. 2004, 15 (6): 702-708. 10.1097/01.ede.0000142153.28496.d0.View ArticlePubMedGoogle Scholar
- Gao S, Zhou XH: An empirical comparison of two semi-parametric approaches for the estimation of covariate effects from multivariate failure time data. Stat Med. 1997, 16: 2049-2062. 10.1002/(SICI)1097-0258(19970930)16:18<2049::AID-SIM641>3.0.CO;2-W.View ArticlePubMedGoogle Scholar
- Wei LJ, Lin DY, V G D: An overview of statistical methods for multiple failure time data in clinical trials. Stat Med. 1997, 16: 833-839. 10.1002/(SICI)1097-0258(19970430)16:8<833::AID-SIM538>3.0.CO;2-2.View ArticlePubMedGoogle Scholar
- Rice J: Mathematical Statistics and, Data Analysis. 1995, Belmont: Duxbury PressGoogle Scholar
- The pre-publication history for this paper can be accessed here:http://www.biomedcentral.com/1471-2288/13/95/prepub

### Pre-publication history

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.