Estimating time-to-onset of adverse drug reactions from spontaneous reporting databases
- Fanny Leroy^{1, 2},
- Jean-Yves Dauxois^{3},
- Hélène Théophile^{4, 5},
- Françoise Haramburu^{4, 5} and
- Pascale Tubert-Bitter^{1, 2}
https://doi.org/10.1186/1471-2288-14-17
© Leroy et al.; licensee BioMed Central Ltd. 2014
Received: 10 October 2013
Accepted: 22 January 2014
Published: 3 February 2014
Abstract
Background
Analyzing the time-to-onset of adverse drug reactions after treatment exposure contributes to the pharmacovigilance objectives of identification and prevention. Post-marketing data are available from reporting systems. Times-to-onset from such databases are right-truncated, because some patients who were exposed to the drug and who will eventually develop the adverse drug reaction may do so after the time of analysis and are therefore not included in the data. Methods adapted to right-truncated data are not widely known and have never been used in pharmacovigilance. We assess the use of appropriate methods, as well as the consequences of not taking right truncation into account (naive approach), for parametric maximum likelihood estimation of the time-to-onset distribution.
Methods
The two approaches, naive and truncation-based, were compared in a simulation study. We used twelve scenarios for the exponential distribution and twenty-four each for the Weibull and log-logistic distributions. These scenarios are defined by a set of parameters: the parameters of the time-to-onset distribution, the probability that the time-to-onset falls within the interval of observable values, and the sample size. An application to lymphoma reported after anti-TNF-α treatment in the French pharmacovigilance database is presented.
Results
The simulation study shows that the bias and the mean squared error may in some instances be unacceptably large when right truncation is not considered, whereas the truncation-based estimator always performs better, often satisfactorily, and the gap may be large. For the real dataset, the estimated expected time-to-onset differs by at least 58 weeks between the two approaches, which is not negligible. This smallest difference is obtained for the Weibull model, under which the estimated probability that the time-to-onset falls within the interval of observable values is close to 1.
Conclusions
It is necessary to take right truncation into account for estimating time-to-onset of adverse drug reactions from spontaneous reporting databases.
Keywords
Pharmacovigilance; Reporting databases; Right truncation; Parametric estimation; Maximum likelihood estimation; Bias; Simulation study
Background
Identifying and preventing adverse drug reactions are major objectives of pharmacovigilance. Owing to design constraints, pre-marketing clinical trials fail to identify rare events, which has led in recent decades to an increased focus on the development of post-marketing surveillance methods [1–11]. Post-marketing spontaneous reporting of suspected adverse drug reactions has proved a valuable resource for signal detection [12–17]. It has recently been suggested that modeling the time-to-onset of adverse drug reactions could be a useful adjunct to signal detection methods, using either spontaneous reports [18, 19] or longitudinal observational data [20]. Acquiring timely knowledge of the time-to-onset distribution of adverse drug reactions contributes to meeting pharmacovigilance objectives. Early estimation procedures tailored to the available pharmacovigilance data, i.e. spontaneous reporting data, should therefore be sought.
This paper investigates parametric maximum likelihood estimation of the time-to-onset distribution of adverse drug reactions from spontaneous reporting data for different types of hazard functions likely to be encountered in pharmacovigilance. Methods adapted to right-truncated data are not widely known and have never been used in pharmacovigilance, and no simulation studies are available on the accuracy of their estimates. Furthermore, a naive approach that ignores the right truncation of spontaneous reports and uses classical parametric methods may lead to misleading estimates. We consider the two approaches, i.e. taking or not taking right truncation into account, and the corresponding parametric maximum likelihood estimators. The two approaches are compared in a simulation study conducted to evaluate the consequences, notably in terms of bias, of ignoring right truncation in maximum likelihood estimation, and to assess the performance of the truncation-based estimation. We also apply these methods to a set of 64 cases of lymphoma occurring after anti-TNF-α treatment from the French pharmacovigilance database.
Methods
Proper estimation of the time-to-onset distribution
We consider a given time of analysis and the population of exposed patients who will eventually experience the adverse drug reaction before they die. Let $X$ be the time-to-onset of the adverse drug reaction of interest in that population and $F$ its cumulative distribution function, which we wish to estimate. Observations arising from $n$ reported cases are $(x_1,t_1),(x_2,t_2),\dots,(x_n,t_n)$, where $x_i$ is the time-to-onset, calculated as the lag between the initiation of treatment and the occurrence of the reaction, and $t_i$ is the truncation time, calculated as the lag between the initiation of treatment and the time of analysis. Let $t^{*}$ be the maximum of the observed truncation times. All observed data satisfy $x_i \le t_i$.
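As a minimal sketch of how each pair $(x_i, t_i)$ could be derived from report dates, the snippet below computes both lags in weeks. The function name `onset_and_truncation_weeks` and the example dates are hypothetical; only the analysis date of February 1, 2010 is taken from the application study.

```python
from datetime import date

def onset_and_truncation_weeks(start, onset, analysis):
    """Return (x, t) in weeks: time-to-onset and truncation time for one report."""
    x = (onset - start).days / 7.0       # treatment initiation -> reaction
    t = (analysis - start).days / 7.0    # treatment initiation -> time of analysis
    if not 0.0 <= x <= t:
        raise ValueError("a reported case must satisfy 0 <= x <= t")
    return x, t

# Hypothetical report: treatment started 2005-03-14, reaction diagnosed 2008-06-02.
x, t = onset_and_truncation_weeks(date(2005, 3, 14), date(2008, 6, 2), date(2010, 2, 1))
print(x, t)
```

A case whose reaction occurs after the analysis date would violate `x <= t` and simply never appears in the database, which is what makes the sample right-truncated.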
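We consider a parametric model for the time-to-onset $X$, with cumulative distribution function $F(x;\theta)$ and density $f(x;\theta)$, and derive the following maximum likelihood estimators of θ.
Naive approach: if right truncation is ignored, the observed times-to-onset are treated as an ordinary sample from $f$, which gives the likelihood
$$L_{\text{naive}}(\theta)=\prod_{i=1}^{n} f(x_i;\theta);$$
maximizing this likelihood yields the naive estimator of θ.
Truncation-based approach: each $x_i$ is observed only because $x_i \le t_i$, so its contribution is the density conditional on $X \le t_i$, which gives the likelihood
$$L_{\text{TBE}}(\theta)=\prod_{i=1}^{n}\frac{f(x_i;\theta)}{F(t_i;\theta)};$$
the maximum likelihood estimator from this likelihood, ${\hat{\theta}}_{\text{TBE}}$, is the proper estimator of θ and is called the truncation-based estimator (TBE).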
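In a non-parametric setting, only the distribution function conditional on $X \le t^{*}$, i.e. $F(x)/F(t^{*})$, can be estimated. Its non-parametric maximum likelihood estimator is
$$\frac{\hat{F}(x)}{\hat{F}(t^{*})}=\prod_{j:\,v_j>x}\left(1-\frac{n_j}{N_j}\right),$$
where the $v_j$'s are the $m$ distinct values of the $x_i$'s, $i=1,\dots,n$, each taken by ${n}_{j}=\sum _{i=1}^{n}I({x}_{i}={v}_{j})$ patients, and ${N}_{j}=\sum _{i=1}^{n}I({x}_{i}\le {v}_{j}\le {t}_{i})$ for $1 \le j \le m$, $I$ denoting the indicator function. The unconditional distribution function is not identifiable, as $F(t^{*})$ is not known and cannot be estimated from the data.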
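The quantities $n_j$ and $N_j$ can be computed directly from the observed pairs. The sketch below assumes the usual product-limit form for right-truncated data, $\prod_{j: v_j > x}(1 - n_j/N_j)$, as the estimate of the conditional distribution function $F(x)/F(t^{*})$; the function name `conditional_npmle` and the toy data are illustrative only.

```python
def conditional_npmle(x, t):
    """Product-limit estimate of P(X <= v | X <= t*) at each distinct onset time v."""
    if any(xi > ti for xi, ti in zip(x, t)):
        raise ValueError("right-truncated data must satisfy x_i <= t_i")
    values = sorted(set(x))                                   # v_1 < ... < v_m
    n_j = [sum(1 for xi in x if xi == v) for v in values]
    N_j = [sum(1 for xi, ti in zip(x, t) if xi <= v <= ti) for v in values]
    estimate = {}
    prod = 1.0
    # Start at the largest onset time, where the conditional CDF equals 1,
    # and multiply in one factor (1 - n_j / N_j) per step downward.
    for v, n, N in zip(reversed(values), reversed(n_j), reversed(N_j)):
        estimate[v] = prod            # product over all v_j strictly above v
        prod *= 1.0 - n / N
    return dict(sorted(estimate.items()))

# Toy data (hypothetical): four reports with onset times x and truncation times t.
est = conditional_npmle([1, 2, 2, 3], [4, 3, 5, 3])
print(est)
```

The estimate reaches 1 at the largest observed onset time by construction, which illustrates why only the conditional distribution is recovered without a parametric model.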
In a parametric framework, the unconditional distribution is completely specified by a parameter θ of finite dimension. Maximum likelihood estimation of the parameter of interest can be conducted with the conditional distributions that describe the observations and the unconditional distribution can be estimated secondarily by $F(x;{\hat{\theta}}_{\text{TBE}})$. Hence parametric maximum likelihood estimation is potentially more useful than non-parametric estimation since the unconditional distribution is of interest for pharmacovigilance purposes [18, 20].
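As a worked illustration for the exponential model, $f(x;\lambda)=\lambda e^{-\lambda x}$ and $F(t;\lambda)=1-e^{-\lambda t}$, the two log-likelihoods can be written out explicitly. This sketch assumes the naive likelihood is $\prod_i f(x_i;\lambda)$ and the truncation-based likelihood is $\prod_i f(x_i;\lambda)/F(t_i;\lambda)$:

$$
\begin{aligned}
\ell_{\text{naive}}(\lambda) &= n\log\lambda-\lambda\sum_{i=1}^{n}x_i
\quad\Rightarrow\quad \hat{\lambda}_{\text{naive}}=\frac{n}{\sum_{i=1}^{n}x_i},\\
\ell_{\text{TBE}}(\lambda) &= n\log\lambda-\lambda\sum_{i=1}^{n}x_i-\sum_{i=1}^{n}\log\left(1-e^{-\lambda t_i}\right),\\
\frac{\partial \ell_{\text{TBE}}}{\partial\lambda} &= \frac{n}{\lambda}-\sum_{i=1}^{n}x_i-\sum_{i=1}^{n}\frac{t_i}{e^{\lambda t_i}-1}.
\end{aligned}
$$

The truncation-based score has no closed-form root, so it has to be solved numerically. Because the last sum is positive, any root satisfies $n/\hat{\lambda}_{\text{TBE}} > \sum_i x_i$, i.e. $\hat{\lambda}_{\text{TBE}} < \hat{\lambda}_{\text{naive}}$: on a given sample, ignoring truncation pushes the estimated rate upward and therefore shortens the estimated time-to-onset.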
Simulation study
Exponential, Weibull and log-logistic distributions
Distribution | Exponential | Weibull | Log-logistic |
---|---|---|---|
Density | $f(x)=\lambda {e}^{-\lambda x}$ | $f(x)=\lambda \beta {(\lambda x)}^{\beta -1}{e}^{-{(\lambda x)}^{\beta}}$ | $f(x)=\frac{\lambda \beta {(\lambda x)}^{\beta -1}}{{(1+{(\lambda x)}^{\beta})}^{2}}$ |
Support | x > 0 | x > 0 | x > 0 |
Parameter(s) | λ > 0 | λ > 0, β > 0 | λ > 0, β > 0 |
The times-to-onset were generated from these three distributions. Two values of λ were considered for the exponential distribution: 0.05 and 1. The same values were used for the scale parameter λ of the Weibull and log-logistic distributions. For the shape parameter β, the values 0.5 and 2 were chosen. The truncation times were uniformly distributed on [0, τ]. Survival and truncation times were generated independently. For a chosen value of p, the probability that X falls within the interval of observable values [0, τ], the parameter τ was determined by solving P(X < τ) = p. Since T ≤ τ, the probability 1 - p is also a lower bound of the actual proportion of truncated data P(X > T), the truncation time T being randomly generated. The probability p was chosen in {0.25, 0.50, 0.80} and the sample size n in {100, 500}. For each drawn pair (X, T), if the time-to-onset was shorter than the truncation time, the pair was included in the data; if not, another pair (X, T) was generated. Pairs were generated until the number of included observations was equal to n.
Parametric likelihood maximization, with and without considering right truncation, was performed for each generated sample. An iterative algorithm is necessary to solve this optimization problem, except for the naive exponential estimation, which has a closed form. Calculations were made with the function maxLik from the R [24] package maxLik. For each set of simulation parameters, 1000 replications were run.
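The generation scheme described above can be sketched for the exponential case as follows. This is an illustrative reimplementation in Python, not the authors' R code; the function name, the seed and the returned share of discarded pairs are assumptions made for the example.

```python
import math
import random

def simulate_right_truncated_exponential(lam, p, n, rng):
    """Draw pairs (X, T) until n of them satisfy X <= T; return tau, the kept pairs
    and the share of drawn pairs that were discarded as truncated."""
    tau = -math.log(1.0 - p) / lam       # solves P(X < tau) = p for the exponential
    pairs, drawn = [], 0
    while len(pairs) < n:
        x = rng.expovariate(lam)         # time-to-onset
        t = rng.uniform(0.0, tau)        # truncation time, independent of x
        drawn += 1
        if x <= t:                       # only reactions occurring before analysis are seen
            pairs.append((x, t))
    return tau, pairs, 1.0 - n / drawn

rng = random.Random(2014)                # arbitrary seed, for reproducibility only
tau, pairs, truncated_share = simulate_right_truncated_exponential(0.05, 0.50, 500, rng)
print(round(tau, 2), len(pairs), round(truncated_share, 2))
```

The discarded share estimates P(X > T) and should exceed 1 - p, in line with the lower bound stated in the text.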
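For the exponential model, the truncation-based fit reduces to a one-dimensional root-finding problem. The sketch below uses plain bisection on the score rather than the maxLik routine used in the study, and all function names and data are illustrative. It relies on two facts that hold for this model under the truncation-based likelihood $\prod_i f(x_i;\lambda)/F(t_i;\lambda)$: the score is decreasing in λ, and its limit as λ tends to 0 is $\sum_i (t_i/2 - x_i)$, so no positive maximizer exists when that sum is not positive.

```python
import math

def tbe_score(lam, x, t):
    """Derivative in lam of sum_i [log f(x_i; lam) - log F(t_i; lam)], exponential model."""
    correction = 0.0
    for ti in t:
        u = lam * ti
        if u < 700.0:                    # beyond this the term is negligible and expm1 overflows
            correction += ti / math.expm1(u)
    return len(x) / lam - sum(x) - correction

def fit_tbe_exponential(x, t, rel_tol=1e-10):
    """Truncation-based estimate of lam, or None when the maximization fails."""
    naive = len(x) / sum(x)              # closed-form naive estimate; the score is negative there
    if sum(ti / 2.0 - xi for xi, ti in zip(x, t)) <= 0.0:
        return None                      # likelihood keeps increasing as lam -> 0
    lo, hi = 0.0, naive
    while hi - lo > rel_tol * naive:
        mid = 0.5 * (lo + hi)
        if tbe_score(mid, x, t) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: onset times x and truncation times t.
x_obs, t_obs = [1.0, 2.0, 0.5, 3.0], [5.0, 6.0, 4.0, 8.0]
lam_naive = len(x_obs) / sum(x_obs)
lam_tbe = fit_tbe_exponential(x_obs, t_obs)
print(lam_naive, lam_tbe)
```

Returning `None` mirrors the situation in which the iterative algorithm fails to find a maximum; such replications would be set aside before summarizing the results.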
Application study
We analyzed 64 French cases of lymphoma that occurred after anti-TNF-α treatment, extracted from the national pharmacovigilance database as of February 1, 2010 [25]. The population included patients suffering from rheumatoid arthritis, Crohn’s disease, ankylosing spondylitis, psoriatic arthritis, psoriasis, Sjögren’s syndrome, dermatomyositis, polymyositis or polyarthropathy and exposed to one or (successively) more of the three anti-TNF-α agents available at the study date: etanercept, adalimumab and infliximab. The occurrence of a malignant lymphoma was confirmed by histopathological analysis. Marketing authorization was obtained in August 1999 for infliximab, in September 2002 for etanercept and in September 2003 for adalimumab. These 64 adverse effects occurred between July 2001 and October 2009. None of the survival or truncation times was missing in the database. The observed maximum truncation time was 529 weeks.
Pooling all anti-TNF-α agents, we derived the parametric maximum likelihood estimates and, from them, the corresponding estimated mean times-to-onset, with and without considering right truncation, for the exponential, Weibull and log-logistic distributions. For completeness, we also derived the non-parametric maximum likelihood estimate.
The French pharmacovigilance database is developed by the French drug agency (Agence Nationale de Sécurité du Médicament et des produits de santé, ANSM) and is not publicly available. It is built up and used on an ongoing basis by the network of regional pharmacovigilance centres, which have direct access to the data. This set of data had already been extracted for another study [25] with the authorization of the ANSM and the network of regional centres, according to the internal rule.
Results
Simulation study
For each set of simulation parameters, for both approaches and for each parameter, the bias and the mean squared error of the parametric maximum likelihood estimator, based on the 1000 replications, were calculated, as well as the proportion of replications where the estimate is larger than the true value. As the iterative algorithm may fail to find a maximum, these three quantities were actually calculated on the replications for which no maximization problem occurred. The mean squared error measures the dispersion of the estimator around the true value of the parameter (the smaller the better) and is used for global comparison of two estimation procedures, as it incorporates both the variance of the estimator and its bias. The proportion of replications where the estimate is larger than the true value indicates whether an estimator tends to systematically overestimate or underestimate the true value of the parameter.
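The three summary quantities can be computed from a list of replicated estimates as in the minimal sketch below. Failed maximizations are represented here by `None`, which is a convention of this example rather than of the original study, and the numbers are made up.

```python
def summarize_replications(estimates, true_value):
    """Bias, mean squared error and share of overestimates over successful fits."""
    ok = [e for e in estimates if e is not None]      # drop failed maximizations
    bias = sum(e - true_value for e in ok) / len(ok)
    mse = sum((e - true_value) ** 2 for e in ok) / len(ok)
    prop_over = sum(1 for e in ok if e > true_value) / len(ok)
    return {
        "bias": bias,
        "mse": mse,
        "prop_over": prop_over,
        "n_failed": len(estimates) - len(ok),
    }

# Made-up estimates of a parameter whose true value is 1.0; one fit failed.
summary = summarize_replications([1.2, 0.9, None, 1.1], 1.0)
print(summary)
```

A share of overestimates near 50% suggests no systematic direction of error, whereas a value near 100% signals systematic overestimation.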
Bias and mean squared error
Simulation results: estimations of bias and mean squared error for the exponential model (NPM: number of replications with a problem of maximization)
λ | p | n | Naive BIAS($\hat{\lambda}$) | Naive MSE($\hat{\lambda}$) | TBE BIAS($\hat{\lambda}$) | TBE MSE($\hat{\lambda}$) | NPM |
---|---|---|---|---|---|---|---|
0.05 | 0.25 | 100 | 0.498 | 0.250 | 0.030 | 0.005 | 224 |
0.05 | 0.25 | 500 | 0.498 | 0.248 | 0.007 | 0.001 | 79 |
0.05 | 0.50 | 100 | 0.195 | 0.038 | 0.008 | 0.001 | 85 |
0.05 | 0.50 | 500 | 0.193 | 0.037 | <0.001 | <0.001 | 1 |
0.05 | 0.80 | 100 | 0.073 | 0.005 | <0.001 | <0.001 | 2 |
0.05 | 0.80 | 500 | 0.072 | 0.005 | <0.001 | <0.001 | 0 |
1 | 0.25 | 100 | 10.06 | 102 | 0.462 | 2.17 | 72 |
1 | 0.25 | 500 | 9.95 | 99 | 0.046 | 0.48 | 10 |
1 | 0.50 | 100 | 3.91 | 15.4 | 0.126 | 0.49 | 29 |
1 | 0.50 | 500 | 3.86 | 14.9 | -0.022 | 0.12 | 0 |
1 | 0.80 | 100 | 1.45 | 2.16 | 0.004 | 0.11 | 0 |
1 | 0.80 | 500 | 1.45 | 2.11 | 0.004 | 0.02 | 0 |
Simulation results: estimations of bias and mean squared error for the Weibull model (NPM: number of replications with a problem of maximization)
λ | β | p | n | Naive BIAS($\hat{\lambda}$) | Naive MSE($\hat{\lambda}$) | Naive BIAS($\hat{\beta}$) | Naive MSE($\hat{\beta}$) | TBE BIAS($\hat{\lambda}$) | TBE MSE($\hat{\lambda}$) | TBE BIAS($\hat{\beta}$) | TBE MSE($\hat{\beta}$) | NPM |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.05 | 0.5 | 0.25 | 100 | 4.04 | 16.7 | 0.200 | 0.044 | 0.465 | 0.51 | 0.046 | 0.007 | 312 |
0.05 | 0.5 | 0.25 | 500 | 3.95 | 15.6 | 0.195 | 0.039 | 0.106 | 0.04 | 0.013 | 0.001 | 201 |
0.05 | 0.5 | 0.50 | 100 | 0.762 | 0.60 | 0.167 | 0.031 | 0.068 | 0.018 | 0.024 | 0.005 | 172 |
0.05 | 0.5 | 0.50 | 500 | 0.747 | 0.56 | 0.164 | 0.028 | 0.015 | 0.003 | 0.003 | 0.001 | 22 |
0.05 | 0.5 | 0.80 | 100 | 0.160 | 0.027 | 0.119 | 0.017 | 0.008 | 0.002 | 0.009 | 0.004 | 9 |
0.05 | 0.5 | 0.80 | 500 | 0.156 | 0.025 | 0.113 | 0.013 | 0.001 | <0.001 | 0.001 | <0.001 | 0 |
1 | 0.5 | 0.25 | 100 | 80.4 | 6612 | 0.201 | 0.044 | 8.68 | 183 | 0.046 | 0.007 | 300 |
1 | 0.5 | 0.25 | 500 | 78.9 | 6249 | 0.194 | 0.038 | 2.07 | 17 | 0.012 | 0.001 | 186 |
1 | 0.5 | 0.50 | 100 | 15.0 | 233 | 0.174 | 0.034 | 1.53 | 7.99 | 0.031 | 0.006 | 163 |
1 | 0.5 | 0.50 | 500 | 15.0 | 225 | 0.164 | 0.028 | 0.32 | 1.17 | 0.003 | 0.001 | 24 |
1 | 0.5 | 0.80 | 100 | 3.20 | 10.8 | 0.117 | 0.017 | 0.16 | 0.67 | 0.007 | 0.004 | 13 |
1 | 0.5 | 0.80 | 500 | 3.15 | 10.0 | 0.112 | 0.013 | 0.041 | 0.15 | <0.001 | <0.001 | 0 |
0.05 | 2 | 0.25 | 100 | 0.121 | 0.015 | 0.354 | 0.16 | <0.001 | 0.002 | 0.097 | 0.075 | 8 |
0.05 | 2 | 0.25 | 500 | 0.120 | 0.014 | 0.333 | 0.12 | -0.004 | 0.001 | 0.020 | 0.016 | 2 |
0.05 | 2 | 0.50 | 100 | 0.065 | 0.004 | 0.278 | 0.11 | -0.004 | <0.001 | 0.047 | 0.074 | 6 |
0.05 | 2 | 0.50 | 500 | 0.064 | 0.004 | 0.264 | 0.08 | -0.002 | <0.001 | 0.004 | 0.016 | 0 |
0.05 | 2 | 0.80 | 100 | 0.032 | 0.001 | 0.182 | 0.063 | <0.001 | <0.001 | 0.046 | 0.063 | 1 |
0.05 | 2 | 0.80 | 500 | 0.032 | 0.001 | 0.157 | 0.031 | <0.001 | <0.001 | 0.008 | 0.014 | 0 |
1 | 2 | 0.25 | 100 | 2.41 | 5.84 | 0.364 | 0.17 | 0.090 | 0.79 | 0.10 | 0.075 | 1 |
1 | 2 | 0.25 | 500 | 2.41 | 5.79 | 0.336 | 0.12 | -0.082 | 0.38 | 0.02 | 0.015 | 0 |
1 | 2 | 0.50 | 100 | 1.29 | 1.68 | 0.283 | 0.12 | -0.073 | 0.33 | 0.052 | 0.069 | 3 |
1 | 2 | 0.50 | 500 | 1.29 | 1.65 | 0.261 | 0.07 | -0.065 | 0.12 | -0.002 | 0.017 | 0 |
1 | 2 | 0.80 | 100 | 0.638 | 0.41 | 0.186 | 0.065 | -0.024 | 0.086 | 0.045 | 0.064 | 0 |
1 | 2 | 0.80 | 500 | 0.636 | 0.40 | 0.154 | 0.030 | -0.007 | 0.014 | 0.004 | 0.013 | 0 |
Simulation results: estimations of bias and mean squared error for the log-logistic model (NPM: number of replications with a problem of maximization)
λ | β | p | n | Naive BIAS($\hat{\lambda}$) | Naive MSE($\hat{\lambda}$) | Naive BIAS($\hat{\beta}$) | Naive MSE($\hat{\beta}$) | TBE BIAS($\hat{\lambda}$) | TBE MSE($\hat{\lambda}$) | TBE BIAS($\hat{\beta}$) | TBE MSE($\hat{\beta}$) | NPM |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.05 | 0.5 | 0.25 | 100 | 6.45 | 44 | 0.384 | 0.16 | 0.258 | 0.25 | 0.041 | 0.008 | 217 |
0.05 | 0.5 | 0.25 | 500 | 6.33 | 40 | 0.372 | 0.14 | 0.043 | 0.01 | 0.005 | 0.001 | 52 |
0.05 | 0.5 | 0.50 | 100 | 1.05 | 1.2 | 0.319 | 0.108 | 0.045 | 0.012 | 0.020 | 0.006 | 22 |
0.05 | 0.5 | 0.50 | 500 | 1.02 | 1.1 | 0.308 | 0.096 | 0.009 | 0.001 | 0.003 | 0.001 | 0 |
0.05 | 0.5 | 0.80 | 100 | 0.165 | 0.031 | 0.195 | 0.041 | 0.008 | 0.001 | 0.008 | 0.004 | 0 |
0.05 | 0.5 | 0.80 | 500 | 0.158 | 0.026 | 0.189 | 0.036 | 0.001 | <0.001 | 0.001 | <0.001 | 0 |
1 | 0.5 | 0.25 | 100 | 129 | 17533 | 0.383 | 0.15 | 5.06 | 87 | 0.042 | 0.008 | 207 |
1 | 0.5 | 0.25 | 500 | 127 | 16217 | 0.374 | 0.14 | 1.01 | 6 | 0.008 | 0.001 | 41 |
1 | 0.5 | 0.50 | 100 | 21.0 | 467 | 0.317 | 0.106 | 0.93 | 5.0 | 0.019 | 0.006 | 43 |
1 | 0.5 | 0.50 | 500 | 20.5 | 426 | 0.308 | 0.096 | 0.20 | 0.6 | 0.004 | 0.001 | 0 |
1 | 0.5 | 0.80 | 100 | 3.31 | 12 | 0.201 | 0.044 | 0.209 | 0.55 | 0.016 | 0.005 | 0 |
1 | 0.5 | 0.80 | 500 | 3.17 | 10 | 0.190 | 0.037 | 0.037 | 0.09 | 0.002 | <0.001 | 0 |
0.05 | 2 | 0.25 | 100 | 0.150 | 0.022 | 1.06 | 1.2 | <0.001 | 0.001 | 0.08 | 0.085 | 4 |
0.05 | 2 | 0.25 | 500 | 0.149 | 0.022 | 1.04 | 1.1 | -0.001 | <0.001 | 0.01 | 0.018 | 0 |
0.05 | 2 | 0.50 | 100 | 0.079 | 0.006 | 0.932 | 0.94 | <0.001 | <0.001 | 0.06 | 0.094 | 5 |
0.05 | 2 | 0.50 | 500 | 0.078 | 0.006 | 0.903 | 0.83 | <0.001 | <0.001 | 0.01 | 0.017 | 0 |
0.05 | 2 | 0.80 | 100 | 0.035 | 0.001 | 0.665 | 0.50 | <0.001 | <0.001 | 0.03 | 0.078 | 0 |
0.05 | 2 | 0.80 | 500 | 0.035 | 0.001 | 0.649 | 0.43 | <0.001 | <0.001 | 0.01 | 0.013 | 0 |
1 | 2 | 0.25 | 100 | 2.99 | 9.0 | 1.07 | 1.2 | 0.024 | 0.57 | 0.08 | 0.089 | 0 |
1 | 2 | 0.25 | 500 | 2.98 | 8.9 | 1.04 | 1.1 | -0.028 | 0.20 | 0.01 | 0.020 | 0 |
1 | 2 | 0.50 | 100 | 1.57 | 2.49 | 0.943 | 0.96 | 0.007 | 0.19 | 0.063 | 0.095 | 1 |
1 | 2 | 0.50 | 500 | 1.56 | 2.45 | 0.896 | 0.82 | -0.013 | 0.04 | 0.004 | 0.018 | 0 |
1 | 2 | 0.80 | 100 | 0.702 | 0.50 | 0.668 | 0.50 | 0.004 | 0.042 | 0.045 | 0.072 | 0 |
1 | 2 | 0.80 | 500 | 0.693 | 0.48 | 0.648 | 0.43 | 0.004 | 0.007 | 0.015 | 0.013 | 0 |
Proportion of replications where the estimator is larger than the true value
Simulation results: proportion of replications where the maximum likelihood estimator is larger than the true value of the parameter for the exponential model
λ | p | n | Naive estimator | TBE |
---|---|---|---|---|
0.05 | 0.25 | 100 | 100% | 61.6% |
0.05 | 0.25 | 500 | 100% | 55.3% |
0.05 | 0.50 | 100 | 100% | 55.3% |
0.05 | 0.50 | 500 | 100% | 50.4% |
0.05 | 0.80 | 100 | 100% | 51.1% |
0.05 | 0.80 | 500 | 100% | 51.7% |
1 | 0.25 | 100 | 100% | 54.8% |
1 | 0.25 | 500 | 100% | 50.7% |
1 | 0.50 | 100 | 100% | 53.2% |
1 | 0.50 | 500 | 100% | 48.0% |
1 | 0.80 | 100 | 100% | 50.0% |
1 | 0.80 | 500 | 100% | 51.0% |
Simulation results: proportion of replications where the maximum likelihood estimator is larger than the true value of the parameter for the Weibull model
λ | β | p | n | Naive $\hat{\lambda}>\lambda $ | Naive $\hat{\beta}>\beta $ | TBE $\hat{\lambda}>\lambda $ | TBE $\hat{\beta}>\beta $ |
---|---|---|---|---|---|---|---|
0.05 | 0.5 | 0.25 | 100 | 100% | 100% | 81.4% | 71.9% |
0.05 | 0.5 | 0.25 | 500 | 100% | 100% | 64.6% | 64.5% |
0.05 | 0.5 | 0.50 | 100 | 100% | 100% | 63.3% | 60.1% |
0.05 | 0.5 | 0.50 | 500 | 100% | 100% | 53.4% | 51.0% |
0.05 | 0.5 | 0.80 | 100 | 100% | 99.6% | 52.0% | 53.3% |
0.05 | 0.5 | 0.80 | 500 | 100% | 100% | 48.6% | 51.6% |
1 | 0.5 | 0.25 | 100 | 100% | 100% | 79.3% | 76.0% |
1 | 0.5 | 0.25 | 500 | 100% | 100% | 62.0% | 61.2% |
1 | 0.5 | 0.50 | 100 | 100% | 100% | 65.9% | 64.6% |
1 | 0.5 | 0.50 | 500 | 100% | 100% | 53.8% | 51.8% |
1 | 0.5 | 0.80 | 100 | 100% | 99.5% | 52.7% | 52.2% |
1 | 0.5 | 0.80 | 500 | 100% | 100% | 51.9% | 50.6% |
0.05 | 2 | 0.25 | 100 | 100% | 98.1% | 52.1% | 61.6% |
0.05 | 2 | 0.25 | 500 | 100% | 100% | 52.2% | 53.7% |
0.05 | 2 | 0.50 | 100 | 100% | 94.2% | 51.6% | 53.3% |
0.05 | 2 | 0.50 | 500 | 100% | 100% | 50.6% | 51.0% |
0.05 | 2 | 0.80 | 100 | 100% | 85.4% | 56.1% | 55.8% |
0.05 | 2 | 0.80 | 500 | 100% | 97.9% | 52.2% | 49.6% |
1 | 2 | 0.25 | 100 | 100% | 98.2% | 56.2% | 62.5% |
1 | 2 | 0.25 | 500 | 100% | 99.9% | 50.1% | 54.8% |
1 | 2 | 0.50 | 100 | 100% | 94.3% | 53.9% | 54.2% |
1 | 2 | 0.50 | 500 | 100% | 99.9% | 47.1% | 48.1% |
1 | 2 | 0.80 | 100 | 100% | 85.3% | 54.1% | 54.2% |
1 | 2 | 0.80 | 500 | 100% | 97.9% | 52.7% | 52.2% |
Simulation results: proportion of replications where the maximum likelihood estimator is larger than the true value of the parameter for the log-logistic model
λ | β | p | n | Naive $\hat{\lambda}>\lambda $ | Naive $\hat{\beta}>\beta $ | TBE $\hat{\lambda}>\lambda $ | TBE $\hat{\beta}>\beta $ |
---|---|---|---|---|---|---|---|
0.05 | 0.5 | 0.25 | 100 | 100% | 100% | 67.2% | 67.7% |
0.05 | 0.5 | 0.25 | 500 | 100% | 100% | 53.6% | 52.0% |
0.05 | 0.5 | 0.50 | 100 | 100% | 100% | 55.4% | 57.5% |
0.05 | 0.5 | 0.50 | 500 | 100% | 100% | 51.1% | 52.0% |
0.05 | 0.5 | 0.80 | 100 | 100% | 100% | 51.1% | 53.2% |
0.05 | 0.5 | 0.80 | 500 | 100% | 100% | 50.8% | 51.5% |
1 | 0.5 | 0.25 | 100 | 100% | 100% | 67.7% | 66.1% |
1 | 0.5 | 0.25 | 500 | 100% | 100% | 55.9% | 56.1% |
1 | 0.5 | 0.50 | 100 | 100% | 100% | 54.9% | 57.2% |
1 | 0.5 | 0.50 | 500 | 100% | 100% | 53.4% | 53.4% |
1 | 0.5 | 0.80 | 100 | 100% | 100% | 55.1% | 56.5% |
1 | 0.5 | 0.80 | 500 | 100% | 100% | 51.9% | 52.0% |
0.05 | 2 | 0.25 | 100 | 100% | 100% | 53.2% | 55.9% |
0.05 | 2 | 0.25 | 500 | 100% | 100% | 51.8% | 51.8% |
0.05 | 2 | 0.50 | 100 | 100% | 100% | 55.0% | 54.2% |
0.05 | 2 | 0.50 | 500 | 100% | 100% | 53.3% | 52.2% |
0.05 | 2 | 0.80 | 100 | 100% | 100% | 50.3% | 51.5% |
0.05 | 2 | 0.80 | 500 | 100% | 100% | 53.9% | 54.4% |
1 | 2 | 0.25 | 100 | 100% | 100% | 52.7% | 56.1% |
1 | 2 | 0.25 | 500 | 100% | 100% | 53.3% | 51.0% |
1 | 2 | 0.50 | 100 | 100% | 100% | 54.3% | 56.4% |
1 | 2 | 0.50 | 500 | 100% | 100% | 50.1% | 49.5% |
1 | 2 | 0.80 | 100 | 100% | 100% | 52.0% | 53.7% |
1 | 2 | 0.80 | 500 | 100% | 100% | 52.9% | 55.0% |
Application study
Parameter estimation and estimated mean time-to-onset for 64 cases of lymphoma that occurred after anti-TNF-α treatment
Distribution | Naive $\hat{\lambda}$ | Naive $\hat{\beta}$ | Naive expectation (weeks) | TBE $\hat{\lambda}$ | TBE $\hat{\beta}$ | TBE $\hat{p}$ | TBE expectation (weeks) |
---|---|---|---|---|---|---|---|
Exponential | 0.00739 | - | 135 | 0.00172 | - | 0.60 | 581 [264, 7528]^{*} |
Weibull | 0.00666 | 1.55 | 135 | 0.00468 | 1.49 | 0.98 | 193 [150, 432]^{*} |
Log-logistic | 0.00890 | 2.06 | 171 | 0.00408 | 1.53 | 0.76 | 567 [207, 1.8 × 10^{12}]^{*} |
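The expectations and the values of $\hat{p}$ in this table can be checked from the reported parameter estimates. The sketch below uses the standard closed-form means implied by the densities given earlier and assumes that $\hat{p}$ is the fitted distribution function evaluated at the maximum observed truncation time of 529 weeks. Because the published estimates are rounded, agreement is only expected to within a week or two.

```python
import math

T_STAR = 529.0  # observed maximum truncation time in weeks (from the text)

def exponential_cdf(x, lam):
    return -math.expm1(-lam * x)

def weibull_cdf(x, lam, beta):
    return -math.expm1(-((lam * x) ** beta))

def loglogistic_cdf(x, lam, beta):
    u = (lam * x) ** beta
    return u / (1.0 + u)

def exponential_mean(lam):
    return 1.0 / lam

def weibull_mean(lam, beta):
    return math.gamma(1.0 + 1.0 / beta) / lam

def loglogistic_mean(lam, beta):
    if beta <= 1.0:
        return math.inf                  # the mean is not finite for beta <= 1
    a = math.pi / beta
    return a / math.sin(a) / lam

# Truncation-based estimates as printed in the table (rounded values).
results = {
    "exponential": (exponential_mean(0.00172), exponential_cdf(T_STAR, 0.00172)),
    "weibull": (weibull_mean(0.00468, 1.49), weibull_cdf(T_STAR, 0.00468, 1.49)),
    "log-logistic": (loglogistic_mean(0.00408, 1.53), loglogistic_cdf(T_STAR, 0.00408, 1.53)),
}
for name, (mean, p_hat) in results.items():
    print(f"{name}: mean = {mean:.0f} weeks, p_hat = {p_hat:.2f}")
```

The same mean functions applied to the naive estimates give the naive expectations, so the 58-week Weibull gap quoted in the abstract is simply the difference between the two Weibull means.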
Discussion and conclusions
In drug safety assessment, the temporal relationship between drug administration and time-to-onset is of utmost relevance. A better understanding of the underlying mechanism of the occurrence of an adverse effect is crucial, as it could allow the identification of particular groups of patients at risk and of particular risk time-windows in the course of a treatment and lead to preventing or diagnosing earlier the occurrence of adverse reactions. In this framework, the time-to-onset of an adverse drug reaction constitutes an essential feature to be analyzed. Its accurate estimation and modeling could help in understanding the mechanism of a drug’s action.
As rare adverse effects are generally identified not by cohort studies of exposed patients but from spontaneous reporting systems, we investigated with a simulation study the accuracy of the estimates that can be obtained from these data in a parametric framework. As only a conditional distribution function can be estimated in a non-parametric setting, the non-parametric maximum likelihood estimator is of rather little interest for pharmacovigilance practitioners. For a finite sample size, the simulations show that, whatever the approach, naive or truncation-based, the parametric maximum likelihood estimator may be positively biased, and that this bias and the corresponding mean squared error increase when the theoretical probability p that the time-to-onset falls within the interval of observable values decreases. However, for a fixed value of p, the bias and the mean squared error are always larger when right truncation is not considered than when it is, and the gap may be large. In addition, the bias and the mean squared error may in some instances (Weibull, log-logistic) be unacceptably large for the naive approach, even for a large value of p, while with a probability p of 0.8, or sometimes even less, the TBE performs well. The naive estimator does not appear to be asymptotically unbiased: its bias and mean squared error remain roughly constant as the sample size increases, because the maximization is based on a misleading likelihood, whereas the bias and the mean squared error of the TBE decrease as the sample size increases. Therefore, even if the sample size is large, the gap between the two estimators does not disappear and the truncation-based approach should be used.
The probability p plays an important role in the estimation of the distribution of the time-to-onset of adverse reactions from right-truncated data. Knowledge exists on a range of possible pharmacological mechanisms. It is thus possible to get a rough idea of the fraction of potentially missed cases (the adverse reactions of treated patients that have yet to occur) and then to decide on the relevance of the time of analysis. Spontaneous reports result from three processes: the occurrence of the case, its diagnosis and its reporting. It is well known that under-reporting is widespread, even for serious events. Factors influencing under-reporting include the seriousness of the effect, the age of the patient and the novelty of the effect, but also time-related variables such as the length of marketing or the time since exposure [28–33]. In the approach proposed here, it is assumed that under-reporting is uniform. Such a hypothesis might not always be acceptable. However, with long-term effects such as lymphoma and a homogeneous observation period within the marketing life of the product, non-stationarity of reporting is unlikely.
Problems of maximization may arise when right truncation is taken into account. The smaller p is, the more likely the iterative algorithm is to fail. Some papers have mentioned this problem in parametric likelihood maximization and explained that, because of right truncation, the likelihood may be flat and the maximum may be difficult to find [21, 34–36].
For the 64 cases of lymphoma after anti-TNF-α treatment, the iterative algorithm converged without problems, and both estimates, naive and truncation-based, were available for each fitted model. From the truncation-based estimates, it is possible to estimate p; here it ranges from 0.60 (exponential) to 0.98 (Weibull). Since this probability is unknown, non-parametric maximum likelihood estimation yields only the distribution function conditional on the time-to-onset being less than the maximum observed truncation time. However, although conditional, the non-parametric estimate is a reference that gives an idea of how well the data fit a given model. We followed the graphical procedure for checking goodness-of-fit for right-truncated data suggested by Lawless (2003), which is based on the non-parametric maximum likelihood estimator and consists of plotting the conditional fitted parametric survival functions together with the non-parametric estimate [36]. Here, the conditional Weibull survival function seems the closest to the non-parametric estimate. This finding underlines the value of developing goodness-of-fit tests adapted to right-truncated data.
While only three families of distributions were considered in the present simulation study, other families could be explored, such as the gamma or log-normal families or mixture models. For instance, in more complex situations, the treatment might be a combination of drugs, each of them inducing the effect but in a different time window. In that case, the hazard function may vary several times and a family of more complex distributions could be of greater interest. Additionally, we chose to consider the truncation times as deterministic, which is equivalent to working on conditional distributions for the likelihood. However, another possible approach is to consider the truncation time as a random variable and to study the random pair (X, T), where X is the survival time and T is the truncation time [37–39].
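The graphical goodness-of-fit check mentioned in this discussion compares, on the interval up to the maximum observed truncation time $t^{*}$, the non-parametric estimate with the conditional fitted survival $1 - F(x;\hat{\theta})/F(t^{*};\hat{\theta})$. The sketch below computes only the parametric side, using the Weibull truncation-based estimates and $t^{*} = 529$ weeks reported in the application. The evaluation grid is arbitrary, and the non-parametric curve would require the individual case data, which are not reproduced here.

```python
import math

def weibull_cdf(x, lam, beta):
    return -math.expm1(-((lam * x) ** beta))

def conditional_survival(x, t_star, cdf, *params):
    """P(X > x | X <= t_star) under a fitted parametric model, for 0 <= x <= t_star."""
    return 1.0 - cdf(x, *params) / cdf(t_star, *params)

T_STAR = 529.0                            # weeks, from the application study
grid = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0, T_STAR]
curve = [conditional_survival(x, T_STAR, weibull_cdf, 0.00468, 1.49) for x in grid]
for x, s in zip(grid, curve):
    print(f"{x:6.0f} weeks: {s:.3f}")
```

Plotting such curves for each fitted family against the non-parametric estimate gives the visual comparison described above.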
Finally, improvement of time-to-onset distribution assessment could make it possible to compare two drug profiles or more generally to assess risk factors with regression models.
Declarations
Acknowledgements
This work was supported by the Fondation ARC (fellowship DOC20121206119 to Fanny Leroy).
Authors’ Affiliations
References
- Fourrier A, Bégaud B, Alpérovitch A, Verdier-Taillefer M-H, Decker N, Imbs J-L, Touzé E: Hepatitis B vaccine and first episodes of central nervous system demyelinating disorders: a comparison between reported and expected number of cases. Br J Clin Pharmacol. 2001, 51 (5): 489-490.
- Tubert P, Bégaud B, Haramburu F, Péré JC: Spontaneous reporting: how many cases are required to trigger a warning?. Br J Clin Pharmacol. 1991, 32 (4): 407-408. 10.1111/j.1365-2125.1991.tb03922.x.
- Moore N, Kreft-Jais C, Haramburu F, Noblet C, Andrejak M, Ollagnier M, Bégaud B: Reports of hypoglycaemia associated with the use of ACE inhibitors and other drugs: a case/non-case study in the French pharmacovigilance system database. Br J Clin Pharmacol. 1997, 44 (5): 513-518.
- Tubert-Bitter P, Bégaud B, Moride Y, Chaslerie A, Haramburu F: Comparing the toxicity of two drugs in the framework of spontaneous reporting: a confidence interval approach. J Clin Epidemiol. 1996, 49 (1): 121-123. 10.1016/0895-4356(95)00537-4.
- van der Heijden PG, van Puijenbroek EP, van Buuren S, van der Hofstede JW: On the assessment of adverse drug reactions from spontaneous reporting systems: the influence of under-reporting on odds ratios. Stat Med. 2002, 21 (14): 2027-2044. 10.1002/sim.1157.
- Bate A, Lindquist M, Edwards IR, Olsson S, Orre R, Lansner A, De Freitas RM: A bayesian neural network method for adverse drug reaction signal generation. Eur J Clin Pharmacol. 1998, 54 (4): 315-321. 10.1007/s002280050466.
- DuMouchel W: Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. Am Stat. 1999, 53 (3): 177-190.
- Szarfman A, Machado SG, O’Neill RT: Use of screening algorithms and computer systems to efficiently signal higher-than-expected combinations of drugs and events in the US FDA’s spontaneous reports database. Drug Saf. 2002, 25 (6): 381-392. 10.2165/00002018-200225060-00001.
- Evans SJW, Waller PC, Davis S: Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports. Pharmacoepidemiol Drug Saf. 2001, 10 (6): 483-486. 10.1002/pds.677.
- Ahmed I, Haramburu F, Fourrier-Réglat A, Thiessard F, Kreft-Jais C, Bégaud B, Tubert-Bitter P, Miremont-Salamé G: Bayesian pharmacovigilance signal detection methods revisited in a multiple comparison setting. Stat Med. 2009, 28 (13): 1774-1792. 10.1002/sim.3586.
- Ahmed I, Dalmasso C, Haramburu F, Thiessard F, Broët P, Tubert-Bitter P: False discovery rate estimation for frequentist pharmacovigilance signal detection methods. Biometrics. 2010, 66 (1): 301-309. 10.1111/j.1541-0420.2009.01262.x.
- Roux E, Thiessard F, Fourrier A, Bégaud B, Tubert-Bitter P: Evaluation of statistical association measures for the automatic signal generation in pharmacovigilance. IEEE Trans Inf Technol Biomed. 2005, 9 (4): 518-527.
- Ahmed I, Thiessard F, Bégaud B, Tubert-Bitter P, Miremont-Salamé G: Pharmacovigilance data mining with methods based on false discovery rates: a comparative simulation study. Clin Pharmacol Ther. 2010, 88 (4): 492-498. 10.1038/clpt.2010.111.
- Bate A, Evans SJW: Quantitative signal detection using spontaneous ADR reporting. Pharmacoepidemiol Drug Saf. 2009, 18 (6): 427-436. 10.1002/pds.1742.
- Alvarez Y, Hidalgo A, Maignen F, Slattery J: Validation of statistical signal detection procedures in eudravigilance post-authorization data: a retrospective evaluation of the potential for earlier signalling. Drug Saf. 2010, 33 (6): 475-487. 10.2165/11534410-000000000-00000.
- Hochberg AM, Hauben M: Time-to-signal comparison for drug safety data-mining algorithms vs. traditional signaling criteria. Clin Pharmacol Ther. 2009, 85 (6): 600-606. 10.1038/clpt.2009.26.
- Ahmed I, Thiessard F, Haramburu F, Kreft-Jais C, Bégaud B, Tubert-Bitter P, Miremont-Salamé G: Early detection of pharmacovigilance signals with automated methods based on false discovery rates: a comparative study. Drug Saf. 2012, 35 (6): 495-506. 10.2165/11597180-000000000-00000.
- Maignen F, Hauben M, Tsintis P: Modelling the time to onset of adverse reactions with parametric survival distributions. Drug Saf. 2010, 33 (5): 417-434. 10.2165/11532850-000000000-00000.
- Van Holle L, Zeinoun Z, Bauchau V, Verstraeten T: Using time-to-onset for detecting safety signals in spontaneous reports of adverse events following immunization: a proof of concept study. Pharmacoepidemiol Drug Saf. 2012, 21 (6): 603-610. 10.1002/pds.3226.
- Cornelius VR, Sauzet O, Evans SJW: A signal detection method to detect adverse drug reactions using a parametric time-to-event model in simulated cohort data. Drug Saf. 2012, 35 (7): 599-610. 10.2165/11599740-000000000-00000.
- Lagakos SW, Barraj LM, De Gruttola V: Nonparametric analysis of truncated survival data, with application to AIDS. Biometrika. 1988, 75 (3): 515-523. 10.1093/biomet/75.3.515.
- Kalbfleisch JD, Lawless JF: Regression models for right truncated data with applications to AIDS incubation times and reporting lags. Stat Sin. 1991, 1: 19-32.
- Bégaud B, Miremont G, Péré JC: Estimation of the denominator in spontaneous reporting. Methodological Approaches in Pharmacoepidemiology: Application to Spontaneous Reporting. 1993, Amsterdam: Elsevier, 51-70.
- R Development Core Team: R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing, [http://cran.r-project.org/]
- Théophile H, Schaeverbeke T, Abouelfath A, Kahn V, Haramburu F, Bégaud B, Miremont-Salamé G: Sources of information on lymphoma associated with anti-tumour necrosis factor agents. Drug Saf. 2011, 34 (7): 577-585. 10.2165/11590200-000000000-00000.
- Efron B, Tibshirani RJ: An Introduction to the Bootstrap. 1993, New York: Chapman & Hall.
- Gross ST, Lai TL: Bootstrap methods for truncated and censored data. Stat Sin. 1996, 6: 509-530.
- Weber JCP: Mathematical models in adverse drug reaction assessment. Iatrogenic Diseases. 3rd Ed. Edited by: Arcy PF, Griffin JP. 1986, Oxford: Oxford University Press,Google Scholar
- Tubert-Bitter P, Haramburu F, Bégaud B, Chaslerie A, Abraham E, Hagry C: Spontaneous reporting of adverse drug reactions: who reports and what?. Pharmacoepidemiol Drug Saf. 1998, 7 (5): 323-329. 10.1002/(SICI)1099-1557(199809/10)7:5<323::AID-PDS374>3.0.CO;2-8.View ArticlePubMedGoogle Scholar
- Haramburu F Bégaud, Moride Y: Temporal trends in spontaneous reporting of unlabelled adverse drug reactions. Br J Clin Pharmacol. 1997, 44 (3): 299-301.View ArticlePubMedGoogle Scholar
- Moride Y, Haramburu F, Requejo AA, Bégaud B: Under-reporting of adverse drug reactions in general practice. Br J of Clin Pharmacol. 1997, 43 (2): 177-181.View ArticleGoogle Scholar
- Bégaud B, Martin K, Haramburu F, Moore N: Rates of spontaneous reporting of adverse drug reactions in France (letter). JAMA. 2002, 288 (13): 1588-1588. 10.1001/jama.288.13.1588.View ArticlePubMedGoogle Scholar
- Tubert P, Bégaud B, Haramburu F, Lellouch J, Péré J-C: Power and weakness of spontaneous reporting: a probabilistic approach. J Clin Epidemiol. 1992, 45 (3): 283-286. 10.1016/0895-4356(92)90088-5.View ArticlePubMedGoogle Scholar
- Kalbfleisch JD, Lawless JF: Inference based on retrospective ascertainment: an analysis of the data on transfusion-related AIDS. J Am Stat Assoc. 1989, 84 (406): 360-372. 10.1080/01621459.1989.10478780.View ArticleGoogle Scholar
- Colton T: Biased Sampling of Cohorts in Epidemiology.Encyclopedia of Biostatistics, Vol 1. Edited by: Armitage P, Colton T. 1998, Chichester: Wiley, 338-350.Google Scholar
- Lawless JF: Statistical Models and Methods for Lifetime Data, 2nd Ed. 2003, Hokoben, New Jersey: WileyGoogle Scholar
- Keiding N: Nonparametric estimation under truncation.Encyclopedia of Statistical Sciences, Vol 14, 2nd Ed. 2006, Hokoben, New Jersey: Wiley, 8775-8777.Google Scholar
- Gürler Ü: Bivariate estimation with right-truncated data. J Am Stat Assoc. 1996, 91 (435): 1152-1165.Google Scholar
- Gross ST, Huber-Carol C: Regression models for truncated survival data. Scandinavian J Stat. 1992, 193-213.Google Scholar
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/14/17/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.