Disease progression of cancer patients during COVID-19 pandemic: a comprehensive analytical strategy by time-dependent modelling

Background As the whole world is experiencing the cascading effect of a new pandemic, almost every aspect of modern life has been disrupted. Because of health emergencies during this period, widespread fear has resulted in compromised patient safety, especially for patients with cancer. It is very challenging to treat such cancer patients because of the complexity of providing care and treatment, along with COVID-19. Hence, an effective treatment comparison strategy is needed. We need to have a handy tool to understand cancer progression in this unprecedented scenario. Linking different events of cancer progression is the need of the hour. It is a huge challenge for the development of new methodology. Methods This article explores the time lag effect and makes a statistical inference about the best experimental arm using Accelerated Failure Time (AFT) model and regression methods. The work is presented as the occurrence of other events as a hazard rate after the first event (relapse). The time lag effect between the events is linked and analysed. Results The results were presented as a comprehensive analytical strategy by joining all disease progression. An AFT model applied with the transition states, and the dependency structure between the gap times was used by the auto-regression model. The effects of arms were compared using the coefficient of auto-regression and accelerated failure time (AFT) models. Conclusions We provide the solutions to overcome the issue with intervals between two consecutive events in motivating head and neck cancer (HNC) data. COVID-19 is not going to leave us soon. We have to conduct several cancer clinical trials in the presence of COVID-19. A comprehensive analytical strategy to analyse cancer clinical trial data during COVID-19 pandemic is presented.


Background
Cancer patients are more prone to develop COVID-19 because they are immunocompromised [1]. Studies have suggested that cancer patients are more susceptible to Coronavirus, whereas individuals without cancer are immunosuppressed. Though the risk of COVID-19 infection varies individually, cancer patients require continuous care and treatment intervention and potential risk of COVID-19 exposure could be fatal [2]. Studies have shown that COVID-19 has created a great challenge to manage the cancer care delivery system [3].
It is essential to assess the patient's risk of both COVID-19 and tumour control on a case-by-case basis with the patient. Conventionally, the treatment effect of head and neck cancer (HNC) is explored by multiple events like loco-regional control (LRC), progression-free survival (PFS), and overall survival (OS). These events are analysed separately by  and the Cox Proportional Hazard (CPH) models [5]. Currently, it is difficult to isolate the reason for death due to Coronavirus or disease progression among cancer patients [6]. Similarly, all ongoing cancer clinical trials cannot stop due to COVID-19 in the long run, and it is challenging to conduct cancer clinical trials [7] in this present environment. Thus, time lag/intervals between different types of events are essential to explore.
In this manuscript, we focused on exploring the time lag effect and studied the statistical inference about the best experimental arm using Accelerated Failure Time (AFT) Model and regression methods. We present our work here for the occurrence of other events as a hazard rate after the first event (relapse). It is known that local relapse biologically triggers cancer progression and death; however, in this study, we have not considered it. As most of the events are likely to be influenced by COVID-19 infection, so it required to establish an integrated analysis.
The relapse triggers disease progression, and further, disease progression accelerates death rate. The study considered two-time points generated as the duration between relapse to progression and duration between progress to death. For these transition periods, we used the CPH and AFT model, which are useful to work on transition states where treatment effect is comparable.
In this study, the statistical model was considered to handle both the previously mentioned time points and explore the relations between gap durations. Further, we applied a CPH model to understand the different types of transition hazard models and the time-varying covariates considered separately. The results presented as a comprehensive analytical strategy. An AFT model applied with the transition states, and we explained the dependency structure between the gap times using auto-regression. The effects of arms compared using the coefficient of auto-regression and AFT models -the complete analysis using Bayesian techniques executed with R open-source software and OpenBUGS.

Dependency modelling
It is difficult to reduce risk and prevent the spread of the COVID-19 virus among vulnerable cancer patients. At the same time, we have to provide treatment to all these several thousand vulnerable cancer patients. Thus, this becomes very challenging to treat patients separately from patients only with COVID-19. There is a very minimal chance that cancer patients will not get infected by COVID-19 in the long run. We have to run several clinical trials in the presence of COVID-19 infection. Disease progression events occurred as loco-regional relapse, progression, and death -the events marked as 1, 2 and 3, respectively. The events ordered, which implies that the loco-regional relapse appeared earlier than progression or death, and death as a terminal event. Here, our interest was to measure the event occurrence rate at each of the interval or gap time between two events. Let T i, j be the actual event time for i th individual and j denoted different events by 1, 2 or 3. We considered that all the individuals had experienced at least one event. The intervals between two subsequent events were defined as follows: G i;1 ¼ T i;1 and G i; j ¼ T i; j − T i; j − 1 for i ¼ 1; 2; … :; n; j ¼ 1; 2 and T i;0 ¼ 0: In our study, the gap times were assumed to be dependent with ordered events. In order to the dependency structure, we concluded that the 1st event corresponds to G i, 1 , the duration from the beginning of the study to the occurrence of the second event, the second event correspond to G i, 2 and so on. So, the dependency structure was presented among G i, 1 , G i, 2 etc.
We assumed that a simple linear regression model between G i, 1 and G i, 2 . The regression model was.
We fit two separate linear regressions for two different arms. β 1, 0 and defined the change in G i, 2 for a unit change in G i, 1 for arm 0. The same inference was drawn for β 1, 1 So, ignoring the intercept term in the regression model, the difference between the coefficients β 1, 1 -β 1, 0 stated the change in dependent gap time was due to change in the arm. We fit AFT models for G i, 1 and G i, 2 and obtained the corresponding coefficients of the arm to measure the change on events due to variation in treatment.

AFT model with gap time
The AFT model is a popular alternative of proportional hazard model to analyse survival data [8,9]. It is also applicable in the current COVID-19 scenario. It is more efficient to model the survival time rather than hazard rate; to observe the dependency pattern between observed times. In the AFT model, it assumed that the effect of the covariate is to accelerate or decelerate the survival duration by some constants. The AFT model is expressed as, Here, G i denotes the survival time for i th individual, β is the unknown regression coefficient, μ is the intercept term, x i is the covariate for i th subject (i = 1, 2, …. n), ε i is the error component, ε 1 , ε 2 , …, ε n are independent and identically distributed as Normal (0,1). So, given covariates, the response times are independent. In our study, we consider the gap time to fit AFT models for different event occurrences. The gap times (G i ) between two consecutive events are model as response variables in eq. (3).
For the AFT model, the survival function is We considered the Bayesian approach to estimate the parameter estimates for the AFT model obtained from the posterior distributions based on Markov Chain Monte Carlo (MCMC) simulation by Gibbs Sampling method. To conduct data analysis using Bayesian techniques, we need to specify the prior distributions of the parameters. We used independent Gaussian prior distributions with mean 0 and variance 0.001 for the parameter μ and other regression coefficient β. The models were compared, and the best fit model was decided based on the Akaike Information Criterion (AIC).
The better fit among candidate models performed through the Akaike information criterion (AIC) [10,11] as The number of parameters is represented by k. The random variable and maximum likelihood estimate were presented by x andθ where the parameter of interest was defined as θ. The minimal value of AIC shows a better fit of the model. The Bayesian extension of the Cox proportional hazard model was presented as The term Y was the observed evidence, and the marginal probability of Y was defined as P(Y). The prior is P(θ) and the likelihood function was P(Y| θ). Mean, standard deviation, credible interval and the highest posterior density (HPD) were computed for each parameter. An alternative of the AIC in the context of Bayesian model selection method was Deviance Information Criteria (DIC) [12,13]. The Deviance Information Criteria (DIC) was defined as, where The DIC estimates the valid number of parameters by the difference of the posterior mean of the deviance and deviance of posterior means.

Bayesian CPH regression separately for each event
The Cox proportional hazards (Cox PH) model was applied in time-to-event data analysis [14][15][16]. It was defined as For the i th patient, the baseline hazard and hazard at time t were defined by λ 0 (t) and λ i (t| Z i ), Z i is the covariate for an i th patient with the regression coefficient β. The hazard ratio was defined as a predicted hazard function under different predictor variables. The partial likelihood function was adopted to fit the Cox model. A high p-value for the coefficient was defined as less significance of the variable of interest. The better fit among candidate models was performed through the Akaike information criterion (AIC) as discussed. Similarly, DIC was used for model comparison while using Bayesian techniques.
We considered different time-to-events in different CPH models with several factors like arm, age, and gender and obtained the posterior means of the parameters through the models provided in Table 1. The CPH was performed as a conventional choice to show time to event data analysis.

Results
Dataset was presented to resemble a motivating example of head and neck cancer (HNC). A total of 74 patients treated with two chemotherapeutic arms were illustrated. The clinical trial was aimed to perform the PFS between two types of therapy. The therapies were (I) 'Arm-A' (n = 43 subjects) or (II) 'Arm-B' (n = 31 subjects). The covariates considered were (a) Arm, (b) Age and (c) Gender. Subjects were followed continuously, and the occurrence of relapse, disease progression and death were monitored. Data with missing observations were not considered for analysis. The mimic data was uploaded as supplementary file S1.
We considered the duration between treatment initiation to the time of progression or the last follow-up visit for patients who had not progressedthe sequence was defined as RECIST criteria version 1.1. Disease-free survival was considered as the duration while the person experienced complete remission. We found the period between LRC and progression as T 1 and between progression and death as T 2 .
One of the aims of the trial was to investigate the best active arm to prolong the PFS. The experiment was continued to explore the loco-regional recurrence and overall survival. In this example, we measured the LRC as the duration between dates of registration to the time of first loco-regional relapse. Similarly, the date of enrolment to date of progression was defined as PFS. The OS was defined as the last date of follow-up or date of death from the date of registration. The CPH hazard model and AFT model were considered for different states in the context of Bayesian frameworks. The states were defined as a dead state (state 3), living with the progressed disease (state 2) and living with loco-regional recurrence, not with distant metastasis/progression (state1). The direct transition from state 1 to state 3 is possible. However, as mentioned earlier, we considered only those patients for which all three states were apparent.
The CPH model applied in this dataset was defined as, The three covariates considered for the modelling were Arm, Age, and Gender. The results were illustrated in Table 1. The survival curves corresponding to LRC  and PFS are shown in Fig. 1 and Fig. 2. The Kolmogorov-type supremum test was performed to obtain the p-value.
The AFT models computed considering arm as the only covariate. The model was The posterior mean and standard deviation of the Arms were obtained by the AFT and regression model. The density plots of the difference of Arm effect from both the models are shown in Fig. 3.
We can draw this inference that the dependency of gap times is translated through the regression structure. So, adding the arm effect from the AFT survival model for the first gap time and the arm effect was obtained from the regression model. Thus, given the information of time between LRC and PFS, and the dependency structure between gap times, the survival duration between PFS and OS was predicted. The results of the posterior means obtained using the Bayesian AFT model are given in Table 2.

Discussion
The novel coronavirus that causes COVID-19 appeared more than twice as high among individuals with cancer than the general population [17]. In survival analysis of disease-related to oncology, the patients commonly experience multiple events like loco-regional relapse, progression, death across the follow-up period. The interest lies in the prediction of survival duration for a particular event and evaluating effective treatments -the analysis carried by assuming the independence of the events. However, due to missing data on follow-up visits of the patients, information regarding the complete follow-ups of the patient is often unknown. So, their survival duration cannot be predicted based on the analysis carried out on the previously occurred events. The dependent modelling of the durations between consecutive events will assist in predicting the occurrence of the next event. The generalised version of the multi-state model is well-documented [18,19]. The purest form of the mortality model having two states are, 'alive without disease' and 'dead' and a linked transition between these two states. The competing risk model is defined as a provision where individuals may die due to other causes [20][21][22]. The widely accepted form of the multi-state model is the illness-death model or disability model. The associated package to work in these directions is 'mstate' is useful for multi-state regression and to get prediction probability. Another package 'survdim' is helpful to perform type-specific Cox models. The parametric multi-state model showed through 'msm' and 'flexsurv'. This work is performed with open source software OpenBugs to serve the Bayesian.

Conclusions
The constant news about the coronavirus pandemic is relentless and has a long list of terrifying characteristics, and it is frightening because they are unknown and unpredictable. In this situation of the outbreak, it is not possible to separate treatment for cancer patients due to COVID-19. An effective treatment comparison strategy is required. We presented a handy tool to understand cancer progression in this unprecedented scenario. Linking different events of cancer progression is the need of the hour, and it is a methodological challenge. We provide the solutions to overcome the issue with intervals between two consecutive events by considering the example of head and neck cancer (HNC) data. Now it is difficult to run a cancer clinical trial with COVID-19. All ongoing cancer clinical trials now are either on hold or severely affected. It is not a temporary problem. It will put questions about COVID-19 related death in all ongoing trials in the future. Unless we create a comprehensive analytical strategy to deal with COVDI-19 associated mortality during the cancer clinical trial, we cannot find the best effective treatment outcomes obtained through cancer trials. We preferred not to consider LRC, PFS, and OS as separate entities to understand treatment success. Here, LRC and PFS entities are merged through their gap times and defined as event till PFS. The recommendation is to consider disease progression and transition into account rather than consider these events as separate entities to understand the best treatment effect.