Inference about time-dependent prognostic accuracy measures in the presence of competing risks

Background: Evaluating a candidate marker or developing a model for predicting the risk of future conditions is one of the major goals in medicine. However, model development and assessment for a time-to-event outcome may be complicated in the presence of competing risks. In this manuscript, we propose local and global estimators of the cause-specific AUC for right-censored survival times in the presence of competing risks.
Methods: The local estimator - cause-specific weighted mean rank (cWMR) - is a local average of time-specific observed cause-specific AUCs within a neighborhood of a given time t. The global estimator - cause-specific fractional polynomials (cFPL) - is based on modelling the cause-specific AUC as a function of t through fractional polynomials.
Results: We investigated the performance of the proposed cWMR and cFPL estimators through simulation studies and real-life data analysis. The estimators perform well in small samples, have minimal bias and appropriate coverage.
Conclusions: The local estimator cWMR and the global estimator cFPL provide computationally efficient options for assessing the prognostic accuracy of markers for time-to-event outcomes in the presence of competing risks in many practical settings.

from the side effects of biopsy and from the unnecessary costs. Before one adopts such markers into clinical practice, it is crucial to evaluate their prognostic accuracy, i.e. whether the markers correctly discriminate subjects who will subsequently experience the event of interest at or by time t from those who will not experience any event by t. The goal of this paper is to develop a method to estimate time-dependent prognostic accuracy measures for such markers.
The evaluation of the prognostic accuracy of a marker becomes complicated in the presence of competing risks. Competing risks arise when a subject experiences a terminal event due to one of multiple mutually exclusive causes. For example, after liver transplantation, some patients die of adverse liver- and/or transplant-related outcomes (e.g. graft failure), while other patients die of competing events (e.g. non-liver causes) before experiencing liver-related adverse outcomes. This leads to a competing risks situation because a liver-related marker could predict graft failure or graft-related death but may not predict non-graft-related events [5]. Here, the scientific question is how well FIB-4 or the NAFLD score discriminates between patients who progress to graft-related death and those who do not. Understanding whether FIB-4 or the NAFLD score is highly predictive of death due to graft failure but not of other deaths could potentially lead to more rational and cost-effective use of specific medications or treatment strategies. To facilitate the assessment of the prognostic accuracy of a marker, the goal of this manuscript is to develop methods to estimate the time-dependent prognostic accuracy of a baseline marker after taking right-censoring and competing risks into account.
A number of statistical measures have been proposed to assess the prognostic accuracy of a marker. The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) [6] are the most popular measures for a binary classifier. The ROC curve is a graphical illustration of the diagnostic ability of a binary classifier as its discrimination threshold is varied, and the AUC provides a global summary of the discriminatory capacity of the marker. For a survival outcome, the event status of a subject can change over time, and the risk of developing the event conditional on the marker value changes over the follow-up time. Therefore, the accuracy summaries for evaluating the performance of a marker must take this time-dependence into account. Time-dependent ROC methods for survival time [7], [8] classify the subjects as cases or controls depending on their survival status at or by time t and compare their observed status with a predicted risk at some or all times. The incident cases/dynamic controls classification arises [8] when scientific interest focuses on correct classification of subjects who are still at risk at time t. The incident (I) cases are those subjects who had an event at t, and the dynamic (D) controls are those who survived through t. Among the other definitions of time-dependent ROC, the cumulative (C) case and dynamic control pair is the most common. In this classification, cumulative cases are defined as patients having an event within a certain time range, say [0, t]. The I/D version of AUC focuses on incident cases and is better suited for characterizing the trajectory of AUC over time, while the C/D AUC does not characterize the evolution of accuracy over time. Another appealing property of the I/D measure is that the evaluation of the model at a certain time point t focuses only on the riskset at time t, and therefore prior events and prior performance of the marker do not influence or distort the marker accuracy at t. As prognostic models aim to predict the future, this property is very appealing when evaluating dynamic prognostic models.
In this article, we focus on the I/D definition of AUC(t), which is the time-dependent area under the I/D ROC curve at t and was introduced in [8]. There has been extensive research on estimating the I/D AUC(t) in the case of a single cause of failure. In [9], a locally weighted mean rank (WMR) smoother based on the intermediate concordance measure was proposed. A fractional polynomials (FPL) estimator based on modelling AUC(t) as a function of time was proposed in [10]. To evaluate the methods, the authors in [11] used simulation studies to compare different approaches (i.e. [8], [12], [6]) for estimating the concordance index based on AUC(t). However, the estimation of AUC(t) at different follow-up time points was not investigated in that study.
Though competing risks are an important issue in many practical clinical settings, there is limited research on estimating the I/D AUC(t) in the presence of competing risks. When there is only a single cause of failure, the I/D definition stratifies only the subjects in the riskset. In the presence of competing risks, the I/D definition further stratifies the subjects who are still at risk at time t into a single control group and cause-specific case groups depending on the cause of failure. Estimation of time-dependent measures under competing risks was discussed in [13] and [14]. In [13], a time-dependent I/D ROC curve was estimated using a Cox model for the cause-specific hazards and riskset reweighting of the marker distribution. This approach is semi-parametric, indirect, and computationally intensive. First, it requires correct specification of a conditional hazard regression model linking the marker to the event time, and it provides biased estimates when the monotonicity of the association between marker and event time is violated. Second, numerical integration of the ROC curve is required to obtain the AUC curve. Furthermore, a Cox model is assumed within an interval around each unique event time, and the parameters of this sequence of Cox models are estimated for each neighborhood around the unique event times. Therefore, the approach is also time-consuming. The goal of this manuscript is to propose non-parametric, direct, intuitive and scalable methods to estimate the I/D cause-specific AUC(t) of a baseline marker, accounting for censoring and competing events. The major advantage of our proposed methods over the existing semi-parametric estimator is that they require no specification of a conditional hazard model linking the marker to the event time and hence are robust to model misspecification. In addition, inference for this measure can be developed under minimal assumptions.
The rest of the article is organized as follows. In the "Notation" section, we provide the notation. In the "Weighted mean rank estimator of I/D cause-specific AUC(t) (cWMR)" and "Fractional polynomial estimator of I/D cause-specific AUC(t) (cFPL)" sections, we introduce local and global estimators of the cause-specific I/D AUC(t) that are direct, flexible and non-parametric. We report simulation results to illustrate our methodology in the "Simulation study" section. We report a real data example in the "Application" section and discuss the findings in the "Discussion" section. We end with conclusions in the "Conclusions" section.

Notation
Let M denote the (baseline) marker that could potentially be used to predict the survival time in the presence of competing risks, where a subject can fail due to one of J mutually exclusive causes. Note that M can be a single covariate or a risk score calculated from a survival model (e.g. a proportional hazards model). Let the implicit event times for the J causes be $\{T^{(1)}, \ldots, T^{(J)}\}$. In the presence of competing risks, only the time to the occurrence of the first event is potentially observable. Thus, without censoring, one observes $T = \min(T^{(1)}, \ldots, T^{(J)})$ and the failure indicator $\delta$, which takes the value j if $T = T^{(j)}$ ($j = 1, \ldots, J$). Define the observed event time $Z = \min(T, C)$, where C is the censoring time. A censored observation has $Z = C$, and this is recorded by $\delta = 0$. Suppose that there are n subjects in the study. Let $1(\cdot)$ denote the indicator function. Let $R_i(t) = 1\{Z_i \ge t\}$ denote the at-risk indicator for the i-th individual at time t ($i = 1, 2, \ldots, n$), and let $R_t = \{i : R_i(t) = 1\}$ denote the subjects that are in the riskset at time t. Among the subjects in $R_t$, the subjects who had an event from cause j at t are the j-th cause-specific incident (I) cases: $R^{(j)}_t = \{i : T_i = t, \delta_i = j\}$. The subjects who did not have an event by t are the dynamic (D) controls: $R_{0t} = \{i : T_i > t\}$. Let $n_t$ be the size of $R_{0t}$, i.e. $n_t = |R_{0t}|$, and let $d^{(j)}_t$ be the size of $R^{(j)}_t$, i.e. $d^{(j)}_t = |R^{(j)}_t|$.
If there is a single cause of failure (J = 1), the time-dependent incident/dynamic area under the ROC curve at time t, I/D AUC(t), is defined as
$$\mathrm{AUC}(t) = P(M_i > M_k \mid T_i = t,\ T_k > t). \qquad (1)$$
This is the probability, at time t, that for a randomly selected case-control pair (i, k) the marker value for the incident case is higher than the marker value for the control.

Time-dependent I/D cause-specific AUC(t)
In the presence of competing risks, the I case and D control definitions can be extended to cause-specific incident cases and dynamic controls as follows:
• j-th cause-specific case: T = t, δ = j; j = 1, 2, . . . , J.
• control: T > t.
The I/D AUC(t) in (1) can then be redefined as the j-th I/D cause-specific AUC(t), defined as
$$\mathrm{AUC}^{(j)}_t = P(M_i > M_k \mid T_i = t,\ \delta_i = j,\ T_k > t), \qquad (2)$$
where j = 1, 2, . . . , J. Below we propose local and global estimators for the time-dependent cause-specific AUC(t) curve.

Weighted mean rank estimator of I/D cause-specific AUC(t) (cWMR)
A non-parametric estimator of I/D AUC(t) was proposed in [9] using a nearest-neighbor method. This estimator is a local average of time-specific observed AUCs. While that method does not address more than one cause of failure, a natural modification of the approach allows estimation of accuracy in the presence of multiple causes of failure. To illustrate our proposed method, let $A^{(j)}(t)$ denote the proportion of (i, k) pairs in which subject k has a lower marker value than subject i, who experienced failure due to cause j at t, provided subject k has longer survival than subject i. It is defined as
$$A^{(j)}(t) = \frac{1}{n_t\, d^{(j)}_t} \sum_{i \in R^{(j)}_t} \sum_{k \in R_{0t}} 1\{M_i > M_k\}.$$
Note that $A^{(j)}(t)$ can be considered an estimator of $\mathrm{AUC}^{(j)}_t$ in (2). However, failure time is typically measured on a continuous scale, and it is reasonable to assume that the likelihood of failures due to multiple causes at a given time is negligible. Therefore, only a few cases experience the event of interest (e.g. the j-th cause) at t, and often $d^{(j)}_t = 1$. In that case, $A^{(j)}(t) = \frac{1}{n_t} \sum_{k} 1\{M_i > M_k,\ T_k > t\}$ is the rank of the j-th cause-specific case marker value among the control markers at time t. In extreme situations, this quantity could jump from 1 to 0 and back between adjacent time points. Hence, the estimation of $\mathrm{AUC}^{(j)}_t$ based on $A^{(j)}(t)$ requires some degree of smoothing. In this situation, the information within a neighborhood around t, $N^{(j)}_t = \{t_{(H)} : |t - t_{(H)}| < h_n,\ \delta_H = j\}$, can be used to estimate marker concordance at t. Let $\mathrm{cWMR}^{(j)}_t$ be the j-th cause-specific weighted mean rank estimator of $\mathrm{AUC}^{(j)}_t$, defined as
$$\mathrm{cWMR}^{(j)}_t = \frac{1}{|N^{(j)}_t|} \sum_{t_{(H)} \in N^{(j)}_t} A^{(j)}(t_{(H)}),$$
where $|N^{(j)}_t|$ is the number of cause-j event times in the neighborhood. The optimal bandwidth ($h_n$) balances the bias and variance of $\mathrm{cWMR}^{(j)}_t$. Therefore, to select the bandwidth, we followed the leave-one-out cross-validation approach of [9] and adapted it to account for competing risks. In addition, we also propose a variance estimator for $\mathrm{cWMR}^{(j)}_t$ based on the assumption of bivariate Normality of the case and control marker pairs [9]. An additional file shows this in more detail [see Additional file 1].
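The local estimator described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `concordance_at` and `cwmr` are hypothetical, controls are taken to be subjects whose observed time exceeds t, the neighborhood is a fixed window of half-width `h`, and all event times in the window receive equal weight. Bandwidth selection by cross-validation and the variance estimator are not shown.

```python
def concordance_at(t, z, delta, marker, cause):
    """Observed concordance at time t for one cause.

    Cases: subjects with observed time == t and delta == cause.
    Controls (assumption): subjects with observed time > t.
    Returns None when there is no case or no control at t.
    """
    cases = [m for zi, di, m in zip(z, delta, marker) if zi == t and di == cause]
    controls = [m for zi, m in zip(z, marker) if zi > t]
    if not cases or not controls:
        return None
    wins = sum(1 for mc in cases for mk in controls if mc > mk)
    return wins / (len(cases) * len(controls))


def cwmr(t, z, delta, marker, cause, h):
    """Equal-weight local mean of the observed concordances over the
    cause-specific event times that lie within h of t."""
    event_times = sorted(
        {zi for zi, di in zip(z, delta) if di == cause and abs(zi - t) < h}
    )
    values = [concordance_at(s, z, delta, marker, cause) for s in event_times]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


# Toy data (hypothetical): observed times, cause indicators (0 = censored), markers.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
delta = [1, 2, 1, 0, 1, 0]
marker = [2.5, 0.3, 1.0, 1.8, 0.9, 0.2]

print(cwmr(3.0, z, delta, marker, cause=1, h=2.5))
```

With a narrow window the estimate reduces to the single-time concordance, which illustrates why some smoothing is needed when only one case is observed at each event time.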

Asymptotic properties of cWMR
In order to evaluate the asymptotic behaviours of cWMR (j) t , we restrict our attention to a neighborhood around t and cause j i.e. N (j) t . Other causes can be treated in a similar way. Let B t denote the number of subjects at the start of the neighborhood i.e. at time t − h n .
Here, b n (t) denotes the bias and the variance is The proof of this theorem is provided in Additional file 1.

Estimation of variance of cWMR
The variance of cWMR In order to compute these components, we propose the following estimators in the spirit of variance calculation of AUC proposed in [15] and [9] var We estimate P 1 (.) etc. using a Normal approximation for the j-th cause-specific case and control markers after a rank-based Z-score transformation and then empirically estimating the parameters of the approximating normal distributions. The detail is provided in Additional file 1.

Fractional polynomial estimator of I/D cause-specific AUC(t) (cFPL)
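The authors in [10] proposed a method for modelling the I/D AUC(t) defined in equation (1) in the case of a single event type. Their method directly models AUC(t) as a function of the event time t through a flexible fractional polynomials model proposed in [16]. We extend it to the competing risks setting as follows.
After transformation with a link function η, $\mathrm{AUC}^{(j)}_t$ can be specified as a parametric function of t using fractional polynomials of degree L:
$$\eta\{\mathrm{AUC}^{(j)}_t\} = \beta_{j0} + \sum_{l=1}^{L} \beta_{jl}\, t^{(p_l)}, \qquad (4)$$
with $p_1 \le p_2 \le \cdots \le p_L$ real-valued powers, where $t^{(p)} = t^{p}$ for $p \ne 0$ and $t^{(0)} = \log t$. As suggested in [10], we consider the power set $p_1, \ldots, p_L$ in {-2, -1, -0.5, 0, 0.5, 1, 2}, which is flexible enough to accommodate most applications. The set of regression parameters $\beta_j = (\beta_{j0}, \beta_{j1}, \ldots, \beta_{j7})$ is then estimated by optimizing a likelihood function. A logit function is considered for η(.), similar to [10]. Suppose there are $h_j$ failures due to cause j in the study. For each event time $t_{(H)}$ with $\delta_H = j$, there are two types of random variables, $n_1(t_{(H)})$ and $n_2(t_{(H)})$, which are the number of concordant and the number of discordant pairs, respectively. Note that, conditional on the riskset $R_{0t_{(H)}}$, the count $n_1(t_{(H)})$ follows a Binomial distribution with $n_1(t_{(H)}) + n_2(t_{(H)})$ trials and probability $\mathrm{AUC}^{(j)}_{t_{(H)}}(\beta_j)$, which leads to the partial-likelihood
$$L(\beta_j) = \prod_{H:\, \delta_H = j} \left\{\mathrm{AUC}^{(j)}_{t_{(H)}}(\beta_j)\right\}^{n_1(t_{(H)})} \left\{1 - \mathrm{AUC}^{(j)}_{t_{(H)}}(\beta_j)\right\}^{n_2(t_{(H)})}.$$
Maximizing this partial-likelihood yields the parameter estimate $\hat\beta_j$ of $\beta_j$. Then, by using (4), we obtain the estimate $\mathrm{AUC}^{(j)}_t(\hat\beta_j)$ as a smooth function of time t and $\hat\beta_j$. For estimation, we evaluate the score equations that correspond to the proposed likelihood. The proposed likelihood is constructed in the spirit of [10]. However, the main difference is that, unlike [10], we have multiple causes of failure. We can write the log-likelihood in the following way:
$$\ell(\beta_j) = \sum_{H:\, \delta_H = j} \left[ n_1(t_{(H)}) \log \mathrm{AUC}^{(j)}_{t_{(H)}}(\beta_j) + n_2(t_{(H)}) \log\left\{1 - \mathrm{AUC}^{(j)}_{t_{(H)}}(\beta_j)\right\} \right].$$
We obtain the estimate $\hat\beta_j$ of $\beta_j$ as the solution of the score equations. Under the logit link, the score for the intercept is
$$\frac{\partial \ell(\beta_j)}{\partial \beta_{j0}} = \sum_{H:\, \delta_H = j} \left[ n_1(t_{(H)}) - \{n_1(t_{(H)}) + n_2(t_{(H)})\}\, \mathrm{AUC}^{(j)}_{t_{(H)}}(\beta_j) \right] = 0.$$
Again, for l = 1, 2, ..., 7, we get
$$\frac{\partial \ell(\beta_j)}{\partial \beta_{jl}} = \sum_{H:\, \delta_H = j} \left[ n_1(t_{(H)}) - \{n_1(t_{(H)}) + n_2(t_{(H)})\}\, \mathrm{AUC}^{(j)}_{t_{(H)}}(\beta_j) \right] t_{(H)}^{(p_l)} = 0.$$
In order to make inferences about the proposed estimators of the cause-specific parameters, the major challenge lies in the fact that the proposed partial-likelihood cannot be treated as a regular likelihood function. Specifically, the asymptotic variance of the estimators is not the inverse of the negative second derivative of the partial-likelihood. We propose a sandwich variance estimator for the proposed global cause-specific AUC(t) estimator below.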
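The fitting step can be sketched as a binomial regression of the concordant counts on fractional-polynomial terms of time. The code below is an illustrative sketch, not the authors' software: it assumes a logit link, takes the concordant and discordant counts at each cause-specific event time as given inputs, and maximizes the binomial partial-likelihood by Newton-Raphson. The names `fp_term`, `fit_cfpl` and `auc_curve` are hypothetical, and the sandwich variance estimator is not included.

```python
import math


def fp_term(t, p):
    """Fractional-polynomial transform of time: t**p, with log(t) when p == 0."""
    return math.log(t) if p == 0 else t ** p


def design_row(t, powers):
    """Intercept followed by one fractional-polynomial term per power."""
    return [1.0] + [fp_term(t, p) for p in powers]


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_cfpl(times, n1, n2, powers, max_iter=50, tol=1e-10):
    """Newton-Raphson for the binomial partial-likelihood under a logit link.

    times: cause-specific event times; n1, n2: concordant and discordant
    pair counts at those times; powers: fractional-polynomial powers.
    """
    rows = [design_row(t, powers) for t in times]
    q = len(powers) + 1
    beta = [0.0] * q
    for _ in range(max_iter):
        grad = [0.0] * q
        info = [[0.0] * q for _ in range(q)]
        for x, a, b in zip(rows, n1, n2):
            mu = expit(sum(bj * xj for bj, xj in zip(beta, x)))
            for r in range(q):
                grad[r] += (a - (a + b) * mu) * x[r]  # score contribution
                for c in range(q):
                    info[r][c] += (a + b) * mu * (1.0 - mu) * x[r] * x[c]
        step = solve(info, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
        if max(abs(sj) for sj in step) < tol:
            break
    return beta


def auc_curve(t, beta, powers):
    """Fitted cause-specific AUC at time t."""
    return expit(sum(b * x for b, x in zip(beta, design_row(t, powers))))


# Toy counts (hypothetical) at four cause-specific event times, log(t) term only.
beta_hat = fit_cfpl([0.5, 1.0, 2.0, 4.0], [9, 8, 6, 5], [1, 2, 4, 5], powers=[0])
print([round(auc_curve(t, beta_hat, [0]), 3) for t in (0.5, 1.0, 2.0, 4.0)])
```

The toy example uses a single power because the full seven-power set needs many more event times than the four used here for the information matrix to be invertible.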

Asymptotic properties of cFPL estimator
In this section, we describe the asymptotic properties of the model parameter estimators. We state some regularity conditions in Additional file 1. We summarize the asymptotic behavior of the regression parameter estimator in the following theorem.
Theorem 2 Under the regularity conditions,β j converges almost surely to β 0j , while √ n(β j − β 0j ) converges to a multivariate Normal distribution with mean vector 0 and covariance matrix −1 i (τ ) counts number of events due to j-th cause of failure occurring over [ 0, τ ] . Theorem 2 can be proven in the spirit of the proof in [10]. However, the main difference is that, unlike [10], our likelihood construction account multiple causes of failure. An additional file shows the proof in more detail [see Additional file 1]. The covariance can be consistently estimated byˆ −1

Simulation study
Extensive simulation studies are conducted to compare the performance of the cWMR, cFPL and semi-parametric [13] estimators for estimating $\mathrm{AUC}^{(j)}_t$. We assume two causes of failure (i.e. j = 1, 2) and a baseline marker M that is correlated with the event time of cause 1, $T^{(1)}$, but not with the event time of cause 2, $T^{(2)}$. We consider several parameter combinations under two major scenarios. For each setting, we generate 500 datasets with a sample size of n = 500, and for each simulated dataset 200 bootstrap resamples are drawn. For each simulation, we estimate $\mathrm{AUC}^{(j)}_t$.

Scenario 1 The pair $(\log(T^{(1)}), M)$ follows a bivariate Normal (BVN) distribution with correlation ρ, where $\mu_1$ and $\sigma_1$ are the mean and SD of $\log(T^{(1)})$ and $\mu_2$ and $\sigma_2$ are the mean and SD of M. We show the results for $\mu_1 = 0$, $\mu_2 = 0$, $\sigma_1 = 1$, $\sigma_2 = 1$, and ρ = −0.7. We consider a negative correlation between the marker and the event time, which implies that a higher marker value is indicative of a poorer survival outcome and hence of a shorter event time. We further assume $\log(T^{(2)}) \sim N(0, 1)$ and $\log(C) \sim N(0, 1)$, such that approximately 20% of subjects are censored. Since $T^{(2)}$ and M are independent, the I/D ROC curve for the competing cause of failure (i.e. cause 2) lies on the diagonal, i.e. the null ROC curve.

Scenario 2 We focus on a heterogeneous population where the marginal relationship between M and $T^{(1)}$ is non-monotone, while M and $T^{(2)}$ are independent. The heterogeneous population comprises two distinct subgroups (G = 0 or 1), and for G = 1, M is also independent of $\log(T^{(1)})$. The distribution of $(\log(T^{(1)}), M)$ follows a mixture of two BVNs. We show the results for two different parameter combinations. We assume G ∼ Bernoulli(0.2). Note that the semi-parametric approach is biased because of the violation of monotonicity in this scenario and is therefore not estimated. The resulting $\mathrm{AUC}^{(j)}_t$ curves mimic the relationship in the liver transplant (LT) data, as will be demonstrated later.
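To make the data-generating mechanism concrete, the sketch below draws one dataset of the kind described for the bivariate Normal scenario, using only the Python standard library. It is an assumption-laden illustration rather than the authors' simulation code: the function name `simulate_scenario1` is hypothetical, the means are fixed at 0 and the SDs at 1 as in the reported setting, and the correlated pair is generated through a conditional Normal draw.

```python
import math
import random


def simulate_scenario1(n, rho=-0.7, seed=None):
    """Return a list of (observed time, cause indicator, marker) triples.

    Cause indicator: 1 or 2 for the cause of failure, 0 for censoring.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        m = rng.gauss(0.0, 1.0)
        # log(T1) given M = m is Normal(rho * m, 1 - rho**2) under the BVN model.
        log_t1 = rho * m + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        t1 = math.exp(log_t1)
        t2 = math.exp(rng.gauss(0.0, 1.0))  # competing cause, independent of M
        c = math.exp(rng.gauss(0.0, 1.0))   # censoring time, independent of M
        z = min(t1, t2, c)
        delta = 1 if z == t1 else (2 if z == t2 else 0)
        data.append((z, delta, m))
    return data


sample = simulate_scenario1(500, seed=2024)
print(sample[:3])
```

Repeating the call with different seeds gives the replicate datasets, and resampling the triples with replacement within each dataset gives the bootstrap replicates.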
Tables 1, 2 and 3 summarize the simulation results for scenarios 1 and 2. Table 1 demonstrates that the estimates of $\mathrm{AUC}^{(j)}_t$ are less biased when derived using cWMR and cFPL than when derived using the semi-parametric method. For instance, for cause 1, the absolute relative bias (ARB) in the estimate of $\mathrm{AUC}^{(j)}_t$ for cWMR is 0.36% at predicted time log(t) = -1.5, whereas for the semi-parametric approach it is 2.16%. For a large predicted time (i.e. log(t) = 0.9), both the cFPL and the semi-parametric estimates show large ARB. The bootstrap standard errors for both the cWMR and cFPL methods are close to their corresponding model-based standard error estimates. For cWMR, the coverage probabilities of the estimated $\mathrm{AUC}^{(j)}_t$ based on 90% confidence intervals are very close to the nominal value of 0.9 across all predicted times. When we have sufficient data around the given predicted times, say -1.5 ≤ log(t) ≤ 0.6, the coverage probabilities of the estimated $\mathrm{AUC}^{(j)}_t$ using cFPL are very close to the nominal value. However, the coverage probability is much lower than the nominal level when the riskset size is 4 at the given time log(t) = 0.9. This is perhaps an issue of oversmoothing. It could be avoided by choosing the largest predicted time as the 90th percentile of the observed survival time points [10]. In Tables 2 and 3, we compare only the cWMR and cFPL estimates, as the semi-parametric estimator is known to be highly biased under scenario 2. Note that, in Table 2, the AUC value for cause 1 increases steadily over -1.5 ≤ log(t) ≤ 0 and then starts decreasing. This kind of non-monotone pattern may be due to the violation of a monotone relationship between marker and event time. The estimates of $\mathrm{AUC}^{(j)}_t$ for both cWMR and cFPL are close to their corresponding true values when the predicted time is small. In these comparisons, cFPL performs slightly better than cWMR. The cFPL method yields substantially greater variances than cWMR for large values of log(t). For both methods, the estimated coverage probabilities are very close to the nominal coverage probability of 90% except at the edges.
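The two summary measures used in the tables can be computed from the replicate estimates as sketched below. The formulas are assumptions made for illustration, since the text does not spell them out: ARB is taken as the absolute difference between the mean estimate and the true value, expressed as a percentage of the true value, and coverage is computed from Wald-type intervals with the Normal quantile 1.645 for a 90% level. The function names are hypothetical.

```python
def absolute_relative_bias(estimates, truth):
    """ARB in percent: 100 * |mean(estimates) - truth| / |truth| (assumed definition)."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * abs(mean_est - truth) / abs(truth)


def coverage(estimates, std_errors, truth, z=1.645):
    """Share of replicates whose interval estimate +/- z * SE contains the truth."""
    hits = sum(
        1 for e, s in zip(estimates, std_errors) if e - z * s <= truth <= e + z * s
    )
    return hits / len(estimates)


# Hypothetical replicate estimates and standard errors for one time point.
est = [0.70, 0.80, 0.90]
se = [0.05, 0.05, 0.05]
print(absolute_relative_bias(est, 0.75), coverage(est, se, 0.75))
```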
Overall, our results demonstrate that both cWMR and cFPL appear to perform well compared to semiparametric approach in terms of bias and standard errors. In addition, in scenario 2 where the monotonicity between marker and event time is violated, the semi-parametric approach is known to be biased. However, both our proposed methods perform adequately well.

Application
We demonstrate the proposed methods for estimating $\mathrm{AUC}^{(j)}_t$ using liver transplant (LT) data from a retrospective study conducted at the McGill University Health Center [3]. The LT study included 547 patients who underwent LT between 1991 and 2012 and who met the following criteria: graft survival >12 months; serum fibrosis biomarkers, including FIB-4 and NAFLD score, available at 1 year after LT; and a minimum follow-up of 1 year. The study found that serum fibrosis markers performed well in predicting death and graft loss in LT recipients. According to the authors, this is the first study to establish the prognostic value of the fibrosis markers in a large cohort of LT recipients over a long-term follow-up period. We further analyzed a subset of the subjects (n = 423) after excluding subjects with missing outcome and/or marker values. During the study period, 64 patients who underwent LT died due to graft-related causes (e.g. graft failure), while 62 patients died of causes unrelated to their transplantation (e.g. sepsis, cardiovascular disease, renal or respiratory failure). The different causes of death lead to a competing risks situation. The research objective is to evaluate the performance of FIB-4 as a marker to discriminate between subjects who died due to graft-related causes and those who died of non-graft-related causes after the LT. The top two subplots in Figure 1 show the estimated AUC obtained using the cWMR and cFPL methods for graft-related, non-graft-related and all-cause death (considering both graft- and non-graft-related death as events of interest). Irrespective of the method, the AUC curve for all-cause mortality is biased compared to the curves for graft- and non-graft-related deaths. The bias is downward relative to graft-related deaths and upward relative to non-graft-related deaths. This indicates that treating all-cause mortality as the event of interest instead of accounting for competing risks will result in biased accuracy estimates. It therefore leads us to analyse the LT data using the methodology proposed here for the presence of competing risks.
Figure 1 also illustrates the estimated $\mathrm{AUC}^{(j)}_t$ curves with 95% CIs for graft- and non-graft-related causes using the cWMR and cFPL methods. During the first 3 years of follow-up, the estimated $\mathrm{AUC}^{(j)}_t$ for graft-related death under the cFPL approach ranges between 0.89 and 0.56. This implies that on any day t during the first 3 years of follow-up, the probability that a subject who dies due to graft-related causes on day t has a FIB-4 value greater than that of a subject who survives beyond day t is at least 0.56. Overall, the $\mathrm{AUC}^{(j)}_t$ curve estimated using cFPL is a smooth function of predicted time, while the curve estimated using cWMR is less smooth. We could not find a definitive reason why the cause-specific AUC(t) for graft-related death increases between years 3.3 and 8. However, with reference to the results from simulation scenario 2, we observed that if there is any latent (or unobserved) heterogeneity in the data, the AUC curve shows a non-monotone trend over time. On the other hand, the two curves for non-graft-related death are almost flat around the horizontal line at AUC(t) = 0.5. Furthermore, the 95% CIs of the AUC estimates contain the null value of 0.5, which implies that FIB-4 is non-informative as a prognostic marker for non-graft-related events. Therefore, FIB-4 as a baseline marker does not discriminate patients with non-graft-related deaths after LT, which is expected. In addition, under the cFPL method the CIs of the estimated $\mathrm{AUC}^{(j)}_t$ curves at the tails of the study period are wider than those under the cWMR method. The cFPL CIs are wider at both small and large predicted times because cFPL may have an oversmoothing issue, especially towards the start and end of the study. Finally, our analysis indicates better discrimination by FIB-4 for graft-related deaths than for non-graft-related deaths after LT for most of the study duration.

Discussion
Measures of calibration and discrimination are integral to evaluating the prognostic accuracy of a marker. Calibration indices provide information on how close the predicted risks are to the observed risks, while discrimination indices measure whether markers correctly discriminate subjects who will subsequently experience the event of interest at or by time t from those who will not experience any event by t. A calibration measure, the expected Brier score for competing risks, and a corresponding estimator were provided in [17]. It measures the closeness of the observed event status and the model-predicted event probabilities in the presence of competing risks. Here, we primarily focus on the discrimination accuracy of a marker. The main goal of this manuscript is to estimate the cause-specific AUC(t) of a baseline marker in the presence of competing risks. During such estimation, analysts often censor subjects when a competing event occurs. For instance, when the outcome in the LT study is time-to-death attributable to LT-related causes, an analyst may consider a subject as censored once that subject dies of causes unrelated to LT. Because subjects who died of non-LT-related causes are no longer at risk of dying due to LT, censoring these competing events (informative censoring) may lead to distorted risk estimates [5] and subsequently to biased accuracy estimators. Alternatively, some may consider a composite event where deaths attributable to LT and non-LT deaths are merged together as any adverse event. In the "Application" section, we mimic this with our LT data to show the drawback of using a composite endpoint and to demonstrate the importance of considering competing risks in prognostic accuracy estimation. It is evident from our results that simply considering a composite event instead of competing events introduces bias in accuracy estimation.
For competing risks analysis, the influence of a covariate can be evaluated in relation to the cause-specific hazard or the sub-distribution hazard of the different causes of failure. For estimating the cause-specific hazard, when a subject experiences any event, the subject is removed from the subsequent risksets. In contrast, for estimating the sub-distribution hazard [18], a subject who experiences a competing event is not removed from the riskset at that time, but rather is censored at the end of the follow-up.
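The difference between the two riskset conventions can be shown with a small sketch. It is a simplified illustration under stated assumptions: the function names are hypothetical, subjects censored before t are dropped from both risksets, and the censoring weights that a full sub-distribution hazard analysis would apply are ignored.

```python
def cause_specific_riskset(z, t):
    """Indices of subjects still event-free and uncensored just before t."""
    return [i for i, zi in enumerate(z) if zi >= t]


def subdistribution_riskset(z, delta, t, cause):
    """Indices at risk under the sub-distribution convention.

    Subjects who already failed from a competing cause before t stay in the
    riskset; subjects censored (delta == 0) or failed from the cause of
    interest before t do not. Censoring weights are ignored (simplification).
    """
    return [
        i
        for i, (zi, di) in enumerate(zip(z, delta))
        if zi >= t or di not in (0, cause)
    ]


# Toy data (hypothetical): subject 1 fails from the competing cause at time 2.
z = [1.0, 2.0, 3.0, 4.0]
delta = [1, 2, 0, 1]
print(cause_specific_riskset(z, 3.0))             # [2, 3]
print(subdistribution_riskset(z, delta, 3.0, 1))  # [1, 2, 3]
```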
Fig. 1 Estimates of incident/dynamic cause-specific AUC(t) curves using weighted mean rank (cWMR) and fractional polynomial (cFPL) for Liver Transplantation data. Plots (a) and (b) illustrate the incident/dynamic AUC(t) curve for all-cause mortality and the incident/dynamic cause-specific AUC(t) curves for graft-related death and non-graft-related death. Plots (a) and (b) are estimated using cWMR and cFPL methods, respectively. Plots (c) and (d) illustrate the incident/dynamic cause-specific AUC(t) curves for graft-related death with pointwise 95% confidence intervals (CI) using cWMR and cFPL methods, respectively. Plots (e) and (f) illustrate the incident/dynamic cause-specific AUC(t) curves for non-graft-related death with pointwise 95% CIs using cWMR and cFPL methods, respectively.
We propose local and global estimators of the cause-specific AUC(t) for a right-censored survival time outcome in the presence of competing risks. In [13], a semi-parametric approach based on the Cox model was suggested for estimating the cause-specific AUC(t). This approach provides biased estimates when the association between event time and marker is non-monotone. Also, it is not a one-step approach for estimating the AUC trajectory over time, and it requires longer computation time. In addition, the method lacks analytical development of the large sample properties needed for statistical inference. These points motivate us to propose new estimators of the cause-specific AUC(t). As pointed out earlier, the observed proportion of controls ranked lower than the (cause-specific) case generally leads to unstable estimates because it is based on a single 'case' subject who had an event of interest at the specific time. Hence, estimation of the cause-specific AUC based on observed proportions requires some degree of smoothing. Our proposed estimators implement the smoothing in different ways. The local estimator - cause-specific weighted mean rank (cWMR) - is a local average of unsmoothed time-specific observed cause-specific AUCs within a neighborhood of a given time t. cWMR is sensitive to the neighborhood span. The width of the neighborhood directly influences the smoothness of the curves, especially towards the end of the study when the size of the riskset gets very small. We have considered a cross-validation approach to choose the optimal neighborhood span. Use of adaptive smoothing techniques, for example, fixing the number of neighbors instead of using a fixed bandwidth, may be useful.
Instead of using a local average of concordance within a neighborhood of time, we propose an alternative method based on a global curve-fitting approach, cFPL, which estimates the cause-specific AUC as a function of time through a flexible fractional polynomial function. It expresses the unsmoothed intermediate AUCs as a function of fractional polynomials of time and then estimates the coefficients of the polynomials through a partial-likelihood optimization. cFPL overcomes the sensitivity of cWMR to the neighborhood span. However, oversmoothing is an issue with cFPL at both small and large time points. In terms of computation time, cWMR is more efficient than cFPL. We also develop the large sample properties of both estimators as well as their corresponding analytical variances. Our simulation study suggests that these two estimators perform very well compared to the existing semi-parametric method for measuring the cause-specific AUC. The performance has been evaluated in terms of absolute relative bias and coverage probability. Between our two proposed approaches, both estimators show a similar level of relative bias, particularly for small predicted times in the simulation studies. However, for large times the cFPL estimator shows larger bias than the cWMR estimator. In addition, the coverage probability of the cFPL estimator is very different from the nominal coverage probability, particularly at the edges.
In the LT study, our goal is to evaluate the performance of FIB-4 as a baseline biomarker to discriminate between subjects who died due to graft-related causes and those who died of non-graft-related causes after liver transplantation. Our analysis indicates better discrimination of graft-related deaths than of non-graft-related deaths after LT for most of the study duration.
Finally, which of the two estimators is better, cWMR or cFPL? There is no general answer. The former is more adaptive to local changes, while the latter is good for an overall description. Our advice is to perform a sensitivity analysis in which the choice of estimation method varies. Also, for all methods the estimates at the higher time range are unstable, emphasizing the fact that one should have a sufficient number of events for estimation of the cause-specific AUC. This problem may be mitigated by choosing a sufficiently wide neighborhood (e.g. fixing the number of neighbors) when using cWMR. For cFPL, we can choose a relevant short future time horizon over which we have sufficient cause-specific cases for numerically stable results.
Our study has some limitations. The first limitation is related to bandwidth selection in the cWMR approach. To implement the proposed cWMR estimator in practice, an appropriate bandwidth must be chosen. In this manuscript, we have used the leave-one-out cross-validation approach [9]. It would be interesting to perform a sensitivity analysis using different bandwidth selectors. Next, our proposed approaches are not applicable when the data have missing information. However, following a typical imputation method, our approach could be applied to the imputed data. Details of the inference and its sensitivity to the imputation method are yet to be explored. Furthermore, the variance calculation of the cFPL estimator is computationally intensive. Future research to explore alternative methods for efficient computation may be worthwhile. Moreover, the estimation methods assume that the censoring time is independent of the survival time. Additional research to allow covariate-dependent censoring is warranted. In addition, in our settings, we do not consider time-varying markers. However, the proposed methodology is equally applicable to settings where time-varying markers exist. This is to be explored in the future.

Conclusions
We developed procedures for estimating time-dependent prognostic accuracy measures for a right-censored time-to-event outcome in the presence of competing risks. The proposed methods are non-parametric, direct and computationally simple, and they overcome the shortcomings of the existing approach. The methods provide computationally efficient options for assessing the prognostic accuracy of markers for time-to-event outcomes.