 Research article
 Open Access
Disease progression of cancer patients during COVID-19 pandemic: a comprehensive analytical strategy by time-dependent modelling
BMC Medical Research Methodology volume 20, Article number: 209 (2020)
Abstract
Background
As the whole world experiences the cascading effects of a new pandemic, almost every aspect of modern life has been disrupted. Health emergencies and widespread fear during this period have compromised patient safety, especially for patients with cancer. Treating such patients is very challenging because cancer care and treatment must be delivered alongside the risk of COVID-19. Hence, an effective treatment comparison strategy is needed, together with a handy tool for understanding cancer progression in this unprecedented scenario. Linking the different events of cancer progression is the need of the hour, and it poses a considerable challenge for the development of new methodology.
Methods
This article explores the time lag effect and makes statistical inference about the best experimental arm using the accelerated failure time (AFT) model and regression methods. The occurrence of subsequent events after the first event (relapse) is modelled through the hazard rate. The time lags between the events are linked and analysed.
Results
The results are presented as a comprehensive analytical strategy that joins all stages of disease progression. An AFT model was applied to the transition states, and the dependency structure between the gap times was described by an autoregression model. The effects of the arms were compared using the autoregression coefficients and the AFT models.
Conclusions
We provide solutions to the problem of handling intervals between two consecutive events, motivated by head and neck cancer (HNC) data. COVID-19 is not going to leave us soon, and several cancer clinical trials will have to be conducted in its presence. A comprehensive analytical strategy to analyse cancer clinical trial data during the COVID-19 pandemic is presented.
Background
Cancer patients are more prone to develop COVID-19 because they are immunocompromised [1]. Studies have suggested that cancer patients are more susceptible to coronavirus than individuals without cancer because they are immunosuppressed. Although the risk of COVID-19 infection varies between individuals, cancer patients require continuous care and treatment intervention, and potential exposure to COVID-19 could be fatal [2]. Studies have shown that COVID-19 has created a great challenge for managing the cancer care delivery system [3].
It is essential to assess each patient’s risk from both COVID-19 and tumour control on a case-by-case basis with the patient. Conventionally, the treatment effect in head and neck cancer (HNC) is explored through multiple events such as locoregional control (LRC), progression-free survival (PFS), and overall survival (OS). These events are analysed separately by the Kaplan-Meier method [4] and Cox proportional hazards (CPH) models [5]. Currently, it is difficult to determine whether a death among cancer patients is due to coronavirus or to disease progression [6]. Moreover, ongoing cancer clinical trials cannot all be stopped because of COVID-19 in the long run, and it is challenging to conduct cancer clinical trials [7] in the present environment. Thus, the time lags (intervals) between different types of events are essential to explore.
In this manuscript, we explore the time lag effect and study statistical inference about the best experimental arm using the accelerated failure time (AFT) model and regression methods. We model the occurrence of subsequent events after the first event (relapse) through the hazard rate. It is known that local relapse biologically triggers cancer progression and death; however, in this study, we have not considered it. As most of the events are likely to be influenced by COVID-19 infection, an integrated analysis is required.
Relapse triggers disease progression, and disease progression in turn accelerates the death rate. The study considered two time points, defined as the duration from relapse to progression and the duration from progression to death. For these transition periods, we used the CPH and AFT models, which are useful for transition states where the treatment effect is comparable.
In this study, the statistical model was designed to handle both of these time points and to explore the relation between the gap durations. Further, we applied CPH models to understand the different transition hazards, with the time-varying covariates considered separately. The results are presented as a comprehensive analytical strategy. An AFT model was applied to the transition states, and the dependency structure between the gap times was explained using autoregression. The effects of the arms were compared using the autoregression coefficients and the AFT models. The complete analysis was carried out with Bayesian techniques using the open-source software R and OpenBUGS.
Methods
Dependency modelling
It is difficult to reduce risk and prevent the spread of the COVID-19 virus among vulnerable cancer patients. At the same time, treatment has to be provided to several thousand such patients, and it is very challenging to treat them separately from patients who have only COVID-19. There is very little chance that cancer patients will avoid COVID-19 infection in the long run, so several clinical trials will have to be run in the presence of COVID-19 infection. Disease progression events occurred as locoregional relapse, progression, and death, marked as events 1, 2 and 3, respectively. The events were ordered: locoregional relapse appeared earlier than progression or death, and death was the terminal event. Our interest was to measure the event occurrence rate in each interval, or gap time, between two events. Let T_{i, j} be the actual event time for the i^{th} individual, where j = 1, 2 or 3 denotes the event. We considered that all the individuals had experienced at least one event. The intervals between two subsequent events were defined as follows:
\[ G_{i, j} = T_{i, j} - T_{i, j-1}, \quad j = 1, 2, 3, \quad T_{i, 0} = 0. \tag{1} \]
In our study, the gap times were assumed to be dependent, with ordered events. To specify the dependency structure, we let the first event correspond to G_{i, 1}, the duration from the beginning of the study to the occurrence of the first event, the second event correspond to G_{i, 2}, and so on. The dependency structure was thus defined among G_{i, 1}, G_{i, 2}, etc.
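The gap-time construction can be illustrated with a short Python sketch. It assumes that each subject's event times are measured from study entry and stored as one row of an array; the function name `gap_times` and the example values are hypothetical and are not taken from the trial data.

```python
import numpy as np

def gap_times(event_times):
    """Convert ordered event times T[i, j] into gap times G[i, j].

    event_times: array of shape (n_subjects, n_events); each row is
    non-decreasing and measured from study entry (time 0).
    """
    t = np.asarray(event_times, dtype=float)
    if np.any(np.diff(t, axis=1) < 0):
        raise ValueError("event times must be ordered within each subject")
    # Prepend the origin so that the first gap is the time to the first event.
    return np.diff(t, axis=1, prepend=0.0)

# Hypothetical event times in months: relapse, progression, death.
T = np.array([[4.0, 9.0, 15.0],
              [6.0, 8.0, 20.0]])
G = gap_times(T)
print(G)
```

Each row of `G` holds the time to relapse, the time from relapse to progression, and the time from progression to death for one subject.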
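We assumed a simple linear regression model between G_{i, 1} and G_{i, 2}. For arm k (k = 0, 1), the regression model was
\[ G_{i, 2} = \beta_{0, k} + \beta_{1, k} G_{i, 1} + e_{i}, \tag{2} \]
where β_{0, k} is the intercept, β_{1, k} is the slope and e_{i} is the error term.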
We fit two separate linear regressions for the two arms. The coefficient β_{1, 0} gives the change in G_{i, 2} for a unit change in G_{i, 1} in arm 0; β_{1, 1} has the same interpretation in arm 1. Ignoring the intercept terms, the difference β_{1, 1} − β_{1, 0} therefore measures the change in the dependent gap time that is due to the change in arm. We also fit AFT models for G_{i, 1} and G_{i, 2} and obtained the corresponding arm coefficients to measure the change in events due to variation in treatment.
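As an illustration of comparing arms through the regression slopes, the following sketch fits one line per arm by ordinary least squares. This is a simplification: the study estimated the coefficients with Bayesian methods, whereas `numpy.polyfit` is swapped in here for brevity. The function name and the synthetic data are assumptions made for the example.

```python
import numpy as np

def slope_by_arm(g1, g2, arm):
    """Fit G2 = b0 + b1 * G1 by least squares within each arm.

    Returns a dict mapping arm label -> (intercept, slope).
    """
    g1, g2, arm = (np.asarray(a) for a in (g1, g2, arm))
    fits = {}
    for k in np.unique(arm):
        mask = arm == k
        # polyfit returns coefficients with the highest degree first.
        b1, b0 = np.polyfit(g1[mask], g2[mask], deg=1)
        fits[int(k)] = (b0, b1)
    return fits

# Synthetic gap times: arm 0 has slope 0.5 and arm 1 has slope 0.8.
g1 = np.array([2.0, 4.0, 6.0, 8.0, 3.0, 5.0, 7.0, 9.0])
arm = np.array([0, 0, 0, 0, 1, 1, 1, 1])
g2 = np.where(arm == 0, 1.0 + 0.5 * g1, 2.0 + 0.8 * g1)

fits = slope_by_arm(g1, g2, arm)
slope_difference = fits[1][1] - fits[0][1]
print(fits, slope_difference)
```

The quantity `slope_difference` plays the role of the difference between the two slope coefficients described above.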
AFT model with gap time
The AFT model is a popular alternative to the proportional hazards model for analysing survival data [8, 9], and it is also applicable in the current COVID-19 scenario. It is more efficient to model the survival time rather than the hazard rate in order to observe the dependency pattern between observed times. The AFT model assumes that the effect of a covariate is to accelerate or decelerate the survival duration by a constant factor. The AFT model is expressed as
\[ \log G_{i} = \mu + \beta x_{i} + \varepsilon_{i}. \tag{3} \]
Here, G_{i} denotes the survival time for the i^{th} individual, β is the unknown regression coefficient, μ is the intercept term, x_{i} is the covariate for the i^{th} subject (i = 1, 2, …, n), and ε_{i} is the error component; ε_{1}, ε_{2}, …, ε_{n} are independent and identically distributed as Normal(0, 1). So, given the covariates, the response times are independent. In our study, we used the gap times to fit AFT models for the different event occurrences: the gap times (G_{i}) between two consecutive events are modelled as the response variables in eq. (3).
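For the AFT model, the survival function is
\[ S(t \mid x_{i}) = P(G_{i} > t \mid x_{i}) = 1 - \Phi\left(\log t - \mu - \beta x_{i}\right), \]
where Φ is the standard normal distribution function.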
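Under the Normal(0, 1) error stated above, the logarithm of the gap time is normally distributed with mean μ + βx and unit variance, so the survival probability at time t is one minus the standard normal distribution function evaluated at log t − μ − βx. A minimal Python sketch of this calculation follows; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
from math import erf, exp, log, sqrt

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def aft_survival(t, x, mu, beta):
    """S(t | x) for a log-normal AFT model with unit error variance."""
    if t <= 0:
        return 1.0
    return 1.0 - normal_cdf(log(t) - mu - beta * x)

# Hypothetical parameter values, not estimates from the study.
mu, beta = 2.0, 0.5
median_arm0 = exp(mu)  # the median gap time when x = 0
for x in (0, 1):
    print(x, aft_survival(10.0, x, mu, beta))
```

With a positive β, the covariate stretches the time scale by a factor exp(β), so the survival curve for x = 1 lies above the curve for x = 0.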
We adopted a Bayesian approach: the parameter estimates for the AFT model were obtained from the posterior distributions based on Markov chain Monte Carlo (MCMC) simulation with the Gibbs sampling method. Bayesian data analysis requires prior distributions for the parameters. We used independent Gaussian prior distributions with mean 0 and variance 0.001 for the parameter μ and the regression coefficient β. The models were compared, and the best-fitting model was chosen based on the Akaike information criterion (AIC).
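The posterior for μ and β can be sketched in closed form under simplifying assumptions. With Normal(0, 1) errors, no censoring and independent Gaussian priors, the model on the log scale is a conjugate normal linear regression, so the posterior is Gaussian and no sampling is needed. This replaces the Gibbs sampler used in the study and is only an illustration; the function name, the simulated data and the two prior variances are assumptions. Note that the `dnorm` distribution in OpenBUGS takes a precision (inverse variance) as its second argument, so a value of 0.001 there would correspond to a variance of 1000; the sketch shows how much the two readings differ.

```python
import numpy as np

def aft_posterior(log_gap, arm, prior_var):
    """Posterior mean and covariance of (mu, beta) for
    log G = mu + beta * arm + eps, eps ~ Normal(0, 1),
    with independent Normal(0, prior_var) priors on mu and beta."""
    y = np.asarray(log_gap, dtype=float)
    X = np.column_stack([np.ones_like(y), np.asarray(arm, dtype=float)])
    precision = X.T @ X + np.eye(2) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y
    return mean, cov

# Simulated data with mu = 2.0 and beta = 0.5 (assumed values).
rng = np.random.default_rng(0)
arm = rng.integers(0, 2, size=500)
log_gap = 2.0 + 0.5 * arm + rng.standard_normal(500)

mean_vague, _ = aft_posterior(log_gap, arm, prior_var=1000.0)
mean_tight, _ = aft_posterior(log_gap, arm, prior_var=0.001)
print(mean_vague, mean_tight)
```

A prior variance of 1000 leaves the estimates close to the simulated values, while a prior variance of 0.001 pulls them strongly towards zero.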
The better fit among candidate models was assessed through the Akaike information criterion (AIC) [10, 11], defined as
\[ \mathrm{AIC} = -2 \log L(\hat{\theta} \mid x) + 2k. \]
Here k is the number of parameters, x is the random variable, and \( \hat{\theta} \) is the maximum likelihood estimate of the parameter of interest θ. A smaller AIC value indicates a better fit of the model. The Bayesian extension of the Cox proportional hazards model was based on the posterior distribution
\[ P(\theta \mid Y) = \frac{P(Y \mid \theta)\, P(\theta)}{P(Y)}. \]
Here Y is the observed evidence and P(Y) is its marginal probability; P(θ) is the prior and P(Y | θ) is the likelihood function. The mean, standard deviation, credible interval and highest posterior density (HPD) interval were computed for each parameter. In the context of Bayesian model selection, an alternative to the AIC is the Deviance Information Criterion (DIC) [12, 13], defined as
\[ \mathrm{DIC} = \bar{D} + p_{D}, \]
where
\[ p_{D} = \bar{D} - D(\bar{\theta}), \]
\( \bar{D} \) is the posterior mean of the deviance and \( D(\bar{\theta}) \) is the deviance evaluated at the posterior means of the parameters.
The DIC estimates the effective number of parameters as the difference between the posterior mean of the deviance and the deviance at the posterior means.
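Both criteria are simple to compute once a log-likelihood is available. The sketch below is a generic illustration, not the OpenBUGS computation: `aic` and `dic` are hypothetical helper names, and the example uses a normal model with known unit variance and simulated posterior draws under a flat prior.

```python
import numpy as np

def aic(max_log_lik, k):
    """AIC from the maximised log-likelihood and the number of parameters."""
    return -2.0 * max_log_lik + 2.0 * k

def dic(log_lik_fn, samples):
    """DIC and effective number of parameters from posterior draws.

    log_lik_fn(theta) returns the log-likelihood at parameter vector theta;
    samples has shape (n_draws, n_params).
    """
    samples = np.asarray(samples, dtype=float)
    deviances = np.array([-2.0 * log_lik_fn(th) for th in samples])
    d_bar = deviances.mean()                              # posterior mean deviance
    d_at_mean = -2.0 * log_lik_fn(samples.mean(axis=0))   # deviance at posterior mean
    p_d = d_bar - d_at_mean
    return d_bar + p_d, p_d

rng = np.random.default_rng(1)
y = rng.normal(1.0, 1.0, size=50)

def log_lik(theta):
    mu = theta[0]
    return -0.5 * np.sum((y - mu) ** 2) - 0.5 * y.size * np.log(2.0 * np.pi)

# Posterior draws of mu under a flat prior: Normal(mean(y), 1 / n).
draws = rng.normal(y.mean(), 1.0 / np.sqrt(y.size), size=(5000, 1))
dic_value, p_d = dic(log_lik, draws)
aic_value = aic(log_lik(np.array([y.mean()])), k=1)
print(dic_value, p_d, aic_value)
```

For this one-parameter model the effective number of parameters is close to 1, and the DIC is close to the AIC, which is the behaviour expected when the prior carries little information.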
Bayesian CPH regression separately for each event
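The Cox proportional hazards (CPH) model is widely applied in time-to-event data analysis [14,15,16]. It was defined as
\[ \lambda_{i}(t \mid Z_{i}) = \lambda_{0}(t) \exp(\beta Z_{i}) \]
or
\[ \log \frac{\lambda_{i}(t \mid Z_{i})}{\lambda_{0}(t)} = \beta Z_{i}. \]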
For the i^{th} patient, the baseline hazard and the hazard at time t are denoted by λ_{0}(t) and λ_{i}(t | Z_{i}), respectively, and Z_{i} is the covariate for the i^{th} patient with regression coefficient β. The hazard ratio was defined as the ratio of the predicted hazard functions under different values of the predictor variables. The partial likelihood function was adopted to fit the Cox model. A high p-value for a coefficient indicates that the corresponding variable is less significant. The better fit among candidate models was assessed through the AIC, as discussed above. Similarly, the DIC was used for model comparison when using Bayesian techniques.
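The partial likelihood mentioned above can be written out directly. The following sketch evaluates the Cox partial log-likelihood for a given coefficient vector, assuming no tied event times; the function name and the toy data are hypothetical. In practice the coefficients would be estimated by maximising this function or by Bayesian sampling.

```python
import numpy as np

def cox_partial_log_lik(beta, times, events, Z):
    """Cox partial log-likelihood, assuming no tied event times.

    beta: coefficient vector; times: follow-up times;
    events: 1 if the event was observed, 0 if censored;
    Z: covariates, shape (n,) or (n, p).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events)
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    Z = np.asarray(Z, dtype=float).reshape(len(times), -1)
    eta = Z @ beta  # linear predictor for each subject
    ll = 0.0
    for i in np.flatnonzero(events):
        at_risk = times >= times[i]  # subjects still under observation
        ll += eta[i] - np.log(np.exp(eta[at_risk]).sum())
    return ll

# Toy data: three subjects, all with observed events, one binary covariate.
times = np.array([1.0, 2.0, 3.0])
events = np.array([1, 1, 1])
Z = np.array([0.0, 1.0, 0.0])

ll_null = cox_partial_log_lik([0.0], times, events, Z)
ll_one = cox_partial_log_lik([1.0], times, events, Z)
print(ll_null, ll_one)
```

The baseline hazard λ_{0}(t) cancels out of each term, which is why the function needs only the ordering of the event times and the covariates.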
We considered the different times-to-event in separate CPH models with several factors, namely arm, age, and gender, and obtained the posterior means of the parameters; the results are provided in Table 1. The CPH model was used as the conventional choice for time-to-event data analysis.
Results
The dataset was constructed to resemble a motivating example of head and neck cancer (HNC). A total of 74 patients treated with two chemotherapeutic arms were included. The clinical trial aimed to compare PFS between two types of therapy: (I) ‘ArmA’ (n = 43 subjects) and (II) ‘ArmB’ (n = 31 subjects). The covariates considered were (a) Arm, (b) Age and (c) Gender. Subjects were followed continuously, and the occurrence of relapse, disease progression and death was monitored. Data with missing observations were not considered for analysis. The mimic data were uploaded as supplementary file S1.
We considered the duration from treatment initiation to the time of progression, or to the last follow-up visit for patients who had not progressed; progression was defined according to the RECIST criteria version 1.1. Disease-free survival was taken as the duration for which the person experienced complete remission. We denoted the period between LRC and progression by T_{1} and the period between progression and death by T_{2}.
One of the aims of the trial was to identify the best active arm for prolonging PFS. The experiment was continued to explore locoregional recurrence and overall survival. In this example, we measured LRC as the duration from the date of registration to the time of first locoregional relapse. Similarly, PFS was defined as the duration from the date of enrolment to the date of progression. OS was defined as the duration from the date of registration to the date of death or the last date of follow-up. The CPH model and the AFT model were considered for the different states within a Bayesian framework. The states were defined as dead (state 3), living with progressed disease (state 2), and living with locoregional recurrence but without distant metastasis/progression (state 1). A direct transition from state 1 to state 3 is possible. However, as mentioned earlier, we considered only those patients for whom all three states were observed.
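The CPH model applied in this dataset was defined as
\[ \lambda_{i}(t) = \lambda_{0}(t) \exp\left(\beta_{1}\, \mathrm{Arm}_{i} + \beta_{2}\, \mathrm{Age}_{i} + \beta_{3}\, \mathrm{Gender}_{i}\right). \]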
The three covariates considered for the modelling were Arm, Age, and Gender. The results are given in Table 1. The survival curves corresponding to LRC and PFS are shown in Fig. 1 and Fig. 2. The Kolmogorov-type supremum test was performed to obtain the p-value.
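The AFT models were computed with arm as the only covariate. For gap time j (j = 1, 2), the model was
\[ \log G_{i, j} = \mu_{j} + \beta_{j}\, \mathrm{Arm}_{i} + \varepsilon_{i, j}. \]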
The posterior means and standard deviations of the arm effects were obtained from the AFT and regression models. The density plots of the difference in arm effect from both models are shown in Fig. 3.
We can infer that the dependency between gap times is carried through the regression structure. The arm effect from the AFT survival model for the first gap time was therefore combined with the arm effect obtained from the regression model. Thus, given the time between LRC and PFS and the dependency structure between gap times, the survival duration between PFS and OS was predicted. The posterior means obtained using the Bayesian AFT model are given in Table 2.
Discussion
The incidence of the novel coronavirus that causes COVID-19 appeared to be more than twice as high among individuals with cancer as in the general population [17]. In survival analysis of oncology-related disease, patients commonly experience multiple events such as locoregional relapse, progression, and death over the follow-up period. The interest lies in predicting the survival duration for a particular event and evaluating effective treatments, and the analysis is usually carried out assuming independence of the events. However, because of missing data on follow-up visits, the complete follow-up information of a patient is often unknown, so the survival duration cannot be predicted from an analysis of the previously occurring events alone. Dependent modelling of the durations between consecutive events will assist in predicting the occurrence of the next event. The generalised version of the multistate model is well documented [18, 19]. The simplest form is the mortality model, which has two states, ‘alive without disease’ and ‘dead’, and a single transition between them. The competing risks model allows individuals to die from other causes [20,21,22]. The most widely accepted form of the multistate model is the illness-death model, or disability model. Among the associated packages, ‘mstate’ is useful for multistate regression and for obtaining prediction probabilities. Another package, ‘survdim’, is helpful for fitting type-specific Cox models. Parametric multistate models can be fitted through ‘msm’ and ‘flexsurv’. This work was performed with the open-source software OpenBUGS for the Bayesian analysis.
Conclusions
The constant news about the coronavirus pandemic is relentless. The pandemic has a long list of terrifying characteristics, and it is frightening because they are unknown and unpredictable. During this outbreak, it is not possible to provide treatment for cancer patients in isolation from COVID-19. An effective treatment comparison strategy is required. We presented a handy tool to understand cancer progression in this unprecedented scenario. Linking the different events of cancer progression is the need of the hour, and it is a methodological challenge. We provide solutions to the problem of handling intervals between two consecutive events, using head and neck cancer (HNC) data as an example.
It is now difficult to run a cancer clinical trial in the presence of COVID-19. All ongoing cancer clinical trials are either on hold or severely affected. This is not a temporary problem: it will raise questions about COVID-19-related deaths in all ongoing trials in the future. Unless we create a comprehensive analytical strategy to deal with COVID-19-associated mortality during cancer clinical trials, we cannot identify the most effective treatments from such trials. We preferred not to consider LRC, PFS, and OS as separate entities when assessing treatment success. Here, LRC and PFS are merged through their gap times and defined as the event until PFS. The recommendation is to take disease progression and transitions into account, rather than treating these events as separate entities, to understand the best treatment effect.
Availability of data and materials
Not applicable.
Abbreviations
AFT: Accelerated failure time
AIC: Akaike information criterion
COVID-19: Coronavirus disease
DIC: Deviance information criteria
HNC: Head and neck cancer
HPD: Highest posterior density
LRC: Locoregional control
MCMC: Markov chain Monte Carlo
OD: Overall death
PFS: Progression-free survival
SD: Standard deviation
References
1. Wang H, Zhang L. Risk of COVID-19 for patients with cancer. Lancet Oncol. 2020;21:e181.
2. Moujaess E, Kourie HR, Ghosn M. Cancer patients and research during COVID-19 pandemic: a systematic review of current evidence. Crit Rev Oncol Hematol. 2020;150:102972.
3. Poortmans PM, Guarneri V, Cardoso MJ. Cancer and COVID-19: what do we really know? Lancet. 2020;395:1884. https://doi.org/10.1016/S0140-6736(20)31240-X.
4. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc. 1958;53:457–81.
5. Cox DR. Regression models and life-tables. J Roy Stat Soc. 1972;34:187–202.
6. Bhattacharjee A, Patil VM, Dikshit R, Prabhash K, Singh A, Chaturvedi P. Should we wait or not? The preferable option for patients with stage IV oral cancer in COVID-19 pandemic. Head Neck. 2020;42:1173–8. https://doi.org/10.1002/hed.26196.
7. Pothuri B, Alvarez Secord A, Armstrong DK, Chan J, Fader AN, Huh W, et al. Anticancer therapy and clinical trial considerations for gynecologic oncology patients during the COVID-19 pandemic crisis. Gynecol Oncol. 2020;158:16. https://doi.org/10.1016/j.ygyno.2020.04.694.
8. Wei LJ. The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Stat Med. 1992;11:1871–9.
9. Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. USA: Wiley; 2011.
10. Akaike H. A Bayesian extension of the minimum AIC procedure of autoregressive model fitting. Biometrika. 1979;66:237–42.
11. Akaike H, et al. Likelihood of a model and information criteria. J Econ. 1981;16(1):3–14.
12. Ando T. Bayesian model selection and statistical modelling. Florida: CRC Press; 2010.
13. Linde A. DIC in variable selection. Statistica Neerlandica. 2005;59:45–56.
14. George B, Seals S, Aban I. Survival analysis and regression models. J Nucl Cardiol. 2014;21:686–94.
15. Lin DY, Wei LJ, Ying Z. Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika. 1993;80:557–72.
16. Kasza J, Wraith D, Lamb K, Wolfe R. Survival analysis of time-to-event data in respiratory health research studies. Respirology. 2014;19:483–92.
17. Wu C, Chen X, Cai Y, Xia J’A, Zhou X, Xu S, et al. Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China. JAMA Intern Med. 2020;180:1. https://doi.org/10.1001/jamainternmed.2020.0994.
18. Andersen PK, Borgan O, Gill RD, Keiding N. Statistical models based on counting processes. New York: Springer-Verlag; 2012.
19. Hougaard P. Analysis of multivariate survival data. New York: Springer-Verlag; 2012.
20. Putter H, Spitoni C. Non-parametric estimation of transition probabilities in non-Markov multi-state models: the landmark Aalen–Johansen estimator. Stat Methods Med Res. 2018;27:2081–92.
21. Andersen PK, Abildstrom SZ, Rosthøj S. Competing risks as a multi-state model. Stat Methods Med Res. 2002;11:203–15.
22. Bhattacharjee A. Bayesian competing risks model: an application to breast cancer clinical trial with incomplete observations. J Stat Manage Syst. 2015;18:381–404.
Acknowledgements
The authors are deeply indebted to the Guest Editor of BMC Medical Research Methodology (Methodologies for COVID-19 research and data analysis), Professor Livia Puljak, and to two anonymous learned referees for their valuable suggestions, which improved the quality of the contents and presentation of the original manuscript. The authors are also thankful to Professor M. Masoom Ali, Department of Mathematical Sciences, Ball State University, Muncie, Indiana, USA, for editing the English language and improving the grammar of this manuscript.
Funding
The authors are thankful to Science & Technology, Government of India, for providing the necessary support to carry out the present research work through project No. MSC/2020/000063; this support did not cover the APC.
Author information
Contributions
AB planned the study; AB and SB performed the study; GKV prepared the manuscript. SS wrote the methodological details to finalise the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests or conflicts of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Bhattacharjee, A., Vishwakarma, G.K., Banerjee, S. et al. Disease progression of cancer patients during COVID-19 pandemic: a comprehensive analytical strategy by time-dependent modelling. BMC Med Res Methodol 20, 209 (2020). https://doi.org/10.1186/s12874-020-01090-z
DOI: https://doi.org/10.1186/s12874-020-01090-z
Keywords
 COVID-19
 Accelerated failure time
 Proportional Hazard model
 Bayesian
 Autoregression