First, the general definitions underlying the simulation analysis shall be explained. Thereafter, the three simulation parameters that are varied to generate different scenarios are introduced. This is followed by an overview of the three suggested approaches for designing risk-based (and time-dependent) screening schemes, which are compared in each of these scenarios. Lastly, these three approaches are explored in a real-world clinical case study.
General definitions
Let D be a disease that occurs at time tocc in individuals of an initially disease-free population (N = 5000). D usually does not cause symptoms in an early, curable stage but can be detected by a hypothetical examination E, which is performed at time ti. The chance that D is fully curable decreases with the length of the time interval between its occurrence and the subsequent screening examination/detection. In principle, E could be performed at any time ti, but for simplicity and to ensure congruency with the real-world clinical case study described later, it may only be done at five fixed time points after the initialization of screening, yielding a maximum of five screening examinations per individual: ti = i, i ∈ {1, 2, 3, 4, 5 years}.
It shall be assumed further that there exists a prediction algorithm PA which estimates interval-specific risks for the occurrence of D in a single individual k, k ∈ {1, …, 5000}, conditional on the fact that this individual did not experience the event in the previous interval. PA gives five separate risk estimates pk,i(D), i ∈ {1, 2, 3, 4, 5}; each of these covers the risk for the occurrence of D within a 1-year interval between two potential time points for screening examinations. The accuracy of PA is assumed to be perfect on individual and population level. Therefore, it is not necessary to additionally simulate “real” occurrences of D at specified times tocc: if a simulated individual is, for example, assigned a risk pk,1(D) = 1% for the first year, this is equivalent to 0.01 theoretical disease occurrences in this interval. The sum of every individual’s conditional risks/occurrences aggregated over all 5 years yields the expected number of disease occurrences on population level. Since an annual risk prediction does not provide further information concerning the exact onset time of D, it is assumed that D occurs on average exactly in the middle of an annual interval: tocc ∈ {0.5, 1.5, 2.5, 3.5, 4.5 years}.
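This bookkeeping of fractional occurrences can be sketched in a few lines (Python for illustration; the original analyses used R, and the risk values below are purely hypothetical):

```python
# Hypothetical annual risk predictions pk,i(D) for one individual; under
# the perfect-accuracy assumption each risk equals the expected
# (fractional) number of disease occurrences in that interval.
annual_risks = [0.01, 0.02, 0.015, 0.03, 0.025]
onset_times = [0.5, 1.5, 2.5, 3.5, 4.5]  # assumed mid-interval onsets tocc

# Summing over all five intervals yields the individual's expected
# total number of occurrences, i.e. the cumulative 5-year risk.
expected_occurrences = sum(annual_risks)
print(expected_occurrences)  # 0.1, i.e. a cumulative 5-year risk of 10%
```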
Simulation parameters
The performance of risk-based screening approaches likely depends on the scenario they are applied to, which is characterised by features of the target population or the disease. To reflect certain key aspects of these features, three parameters are employed to simulate five individual consecutive annual risk predictions for three kinds of disease progression patterns and N = 5000 patients.
Parameter 1: coefficient of variation of mean risk (cvmr) across five years
At population level, the average cumulative 5-year risk per individual is set to 10%, since many diseases show comparable incidence rates in clinical practice [17, 18]. If cvmr were zero, meaning no variation of the mean risk, the mean annual risk would be perfectly stable at 2% in each year. Setting cvmr to a value larger than zero induces variation in the distribution of the five mean annual risks over time; for example, the mean annual risks might be > 2% in years 1 and 2, and < 2% in years 3–5, while the average cumulative risk still sums to 10% over 5 years. For the different simulation scenarios, cvmr was set to 0.5 (equivalent to a standard deviation of the mean annual risk, sdmr, of 1%) representing high, to 0.25 (sdmr of 0.5%) representing intermediate, and to 0.05 (sdmr of 0.1%) representing low variation of the mean annual risks over time (Fig. 1a, top line). Having generated the mean annual risks in this way, the individual risk estimates per year were assumed to be normally distributed around the population average risk for the same year. The corresponding standard deviations were assumed to be inversely correlated with the logarithm of a year's mean risk on population level.
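One way to generate five mean annual risks with a target cvmr that still sum to the 10% cumulative risk is to standardise random deviations and rescale them; this standardise-and-rescale construction is an illustrative assumption, as the text does not specify the exact generator:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_annual_risks(cvmr, mean=0.02, n_years=5):
    """Illustrative generator: five mean annual risks with coefficient of
    variation cvmr around the 2% mean, summing to the 10% 5-year risk."""
    raw = rng.normal(0.0, 1.0, n_years)
    raw = (raw - raw.mean()) / raw.std()  # exactly zero mean, unit sd
    return mean + raw * (cvmr * mean)     # sd = cvmr * mean, sum = 5 * mean

risks = mean_annual_risks(0.5)            # high variation: sdmr = 1%
print(risks.sum())                        # 0.10 by construction
print(risks.std() / risks.mean())         # 0.5, the requested cvmr
```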
Parameter 2: Pearson correlation coefficient r
A patient with a relatively high value for pk,1(D) in the first year likely features relatively high values pk,i(D), i ∈ [2, 5], for the subsequent years as well, since their risk profile always relies on the same fixed predictors. However, the influence of one or more predictors may change over time, causing shifts in the relative size of consecutive risk estimations. In this simulation, the correlation parameter r was set to 0.2 to simulate low, to 0.5 to simulate intermediate, and to 0.8 to simulate high correlation between the five annual risk predictions on individual level. To maintain simplicity, it was assumed that r is constant between all five annual predictions. Figure 1a, lines two to four, illustrates the influence of r on the simulated annual risks of a randomly chosen sample patient, given the underlying population-level risk distribution, which depends on cvmr. Since r also determines whether changes over time in an individual's risk relative to other individuals are common (low correlation) or rare (high correlation), the population-based quartiles for the annual risks are displayed as a reference.
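One possible way to draw individual annual risks with a constant pairwise correlation r is a multivariate normal model; the construction below, including the mean and standard-deviation values, is an illustrative assumption rather than the paper's exact generator:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_individual_risks(mean_risks, sds, r, n=5000):
    """Sketch: draw annual risks for n individuals, normally distributed
    around each year's population mean, with constant correlation r
    between all five annual predictions."""
    k = len(mean_risks)
    corr = np.full((k, k), r)
    np.fill_diagonal(corr, 1.0)
    cov = corr * np.outer(sds, sds)
    risks = rng.multivariate_normal(mean_risks, cov, size=n)
    return np.clip(risks, 0.0, 1.0)  # risks must stay within [0, 1]

mean_risks = np.full(5, 0.02)        # stable 2% mean annual risk
sds = np.full(5, 0.005)              # illustrative standard deviations
risks = simulate_individual_risks(mean_risks, sds, r=0.8)
print(np.corrcoef(risks[:, 0], risks[:, 1])[0, 1])  # close to 0.8
```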
Parameter 3: disease progression dp
Screening aims at early detection of disease manifestations to avoid severe or permanent damage, or even premature death. While time to detection is always a crucial factor in this context, progression rates to an irreversible state differ between diseases. Sometimes there is still a fair chance to cure a disease several years after its first occurrence; sometimes the time window closes considerably faster, after 1 year or even earlier. In this simulation, it was assumed that progression to an irreversible disease state follows an exponential function returning an individual's chance to be fully cured if her or his disease D, which occurred at time tocc, is detected at time ti: \(p\left({D}_{curable}\right)={dp}^{\left({t}_i-{t}_{occ}\right)}\). The function's base dp was set to 0.6 representing slow, to 0.3 representing intermediate, and to 0.1 representing fast disease progression.
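The progression model translates directly into code; the example values below follow the text (intermediate progression, occurrence mid-year, detection at the next annual examination):

```python
def p_curable(dp, t_detect, t_occ):
    """Chance of full cure when a disease occurring at t_occ is detected
    at t_detect, per the exponential progression model dp**(ti - tocc)."""
    return dp ** (t_detect - t_occ)

# Intermediate progression (dp = 0.3) with the minimal half-year delay:
print(p_curable(0.3, 1.0, 0.5))  # 0.3**0.5, roughly 0.55
```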
In total, we defined 27 scenarios in which the screening approaches will be compared, representing all possible combinations of the values of the three parameters cvmr, r, and dp.
Screening approaches
Performing a screening examination at time ti is assumed to result in the detection of an individual's accumulated theoretical occurrences of D. It detects every new occurrence of the disease since the last screening examination or, in the case of the first screening examination, since the beginning of the screening programme. If the maximum of five screening examinations were performed in all individuals, every occurrence of D, pk,i(D), would be detected after half a year on average. In this study's setup, such a full screening approach would, by definition, yield the highest possible number of disease detections in a still curable state (nmax(Dcurable)) on individual and population level.
Due to economic or psychological constraints, a full screening approach might not be desired or feasible. However, reducing the number of screening examinations on individual and population level inevitably leads to delayed detection of D in some individuals, which is associated with a lower chance of curability. If a screening examination is supposed to detect not only occurrences of D from the previous year but also from earlier years, the actual number of diseases detected in a still curable state (nactual(Dcurable)) decreases for earlier occurrences, depending on the parameter dp and the actual detection delay ti − tocc.
Therefore, before different risk-based screening approaches are proposed and compared, it is important to define the desired target detection rate (tdr), which is the number of detected occurrences of D in a curable state on population level, divided by the maximum number of disease detections: \(tdr=\frac{n_{actual}\left({D}_{curable}\right)}{n_{max}\left({D}_{curable}\right)}\). All screening approaches shall be compared regarding the number of screening examinations necessary to achieve exactly the predefined value of tdr. Since, in view of ethical considerations, it seems unlikely that clinicians would accept an overly high “missing rate”, it was decided to assess three settings with a tdr of 80, 90, and 95%. Based on this, the following three approaches to develop individualised screening schemes are proposed and will be compared:
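The tdr bookkeeping can be sketched as follows: each fractional occurrence pk,i contributes pk,i · dp^delay curable detections, and nmax corresponds to every occurrence being found with the minimal half-year delay. The occurrence values below are purely illustrative:

```python
def curable_detections(occurrences_and_delays, dp):
    """Expected number of curable detections: each fractional occurrence
    contributes its probability weighted by dp**(detection delay)."""
    return sum(p * dp ** delay for p, delay in occurrences_and_delays)

dp = 0.3  # intermediate disease progression

# Two theoretical occurrences; under a reduced scheme the second one is
# detected a full year later than under full screening.
actual = curable_detections([(0.02, 0.5), (0.03, 1.5)], dp)
n_max = curable_detections([(0.02, 0.5), (0.03, 0.5)], dp)
tdr = actual / n_max
print(tdr)  # 0.58
```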
Cumulative approach (CA)
All five annual risk predictions of PA for a patient are summed. A risk threshold corresponding to the pre-defined target detection rate tdr is determined and applied to the cumulative risk to discern between high-risk and low-risk patients. High-risk patients are assigned a screening examination after each year, while screening is completely omitted in low-risk patients. (Fig. 1b-1, lower-left panel).
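A minimal sketch of the CA allocation rule (the threshold value is hypothetical; in the actual analysis it would be calibrated so that the population-level detection rate equals the predefined tdr):

```python
def cumulative_approach(annual_risks, threshold):
    """CA sketch: sum all five annual risks once; high-risk patients are
    screened every year, low-risk patients not at all."""
    if sum(annual_risks) >= threshold:
        return [1, 2, 3, 4, 5]  # screening examination after each year
    return []                   # screening completely omitted

print(cumulative_approach([0.03, 0.03, 0.02, 0.02, 0.02], 0.10))  # [1, 2, 3, 4, 5]
print(cumulative_approach([0.01] * 5, 0.10))                      # []
```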
Cumulative approach with interval-wise reevaluation (CAIR)
A patient’s annual risk predictions are summed up until a threshold corresponding to the pre-defined target detection rate tdr is reached. Consequently, this patient is assigned a screening examination which will be performed at the end of the last year contained in the sum. If a screening examination was performed, the cumulation process starts anew, beginning with the risk of the following year and continuing until the threshold is reached again. (Fig. 1b-2, lower-middle panel).
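The interval-wise reevaluation can be sketched as a running sum that resets after each assigned examination (again with a hypothetical threshold standing in for the tdr-calibrated one):

```python
def cair(annual_risks, threshold):
    """CAIR sketch: cumulate annual risks until the threshold is reached,
    assign a screening examination at the end of that year, then restart
    the cumulation with the following year."""
    exams, cum = [], 0.0
    for year, risk in enumerate(annual_risks, start=1):
        cum += risk
        if cum >= threshold:
            exams.append(year)
            cum = 0.0
    return exams

print(cair([0.02, 0.04, 0.01, 0.02, 0.03], 0.05))  # [2, 5]
```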
Interval-specific approach (ISA)
A patient is assigned a screening examination after every year in which her or his annual risk exceeds a threshold resulting in the pre-defined target detection rate tdr. (Fig. 1b-3, lower-right panel).
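The ISA rule is the simplest of the three; a sketch with a hypothetical threshold:

```python
def isa(annual_risks, threshold):
    """ISA sketch: a screening examination is assigned after every year
    whose annual risk exceeds the threshold."""
    return [year for year, risk in enumerate(annual_risks, start=1)
            if risk > threshold]

print(isa([0.03, 0.01, 0.025, 0.015, 0.04], 0.02))  # [1, 3, 5]
```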
To compare these three approaches, a reference is required. While it seems unrealistic that anyone would rely on chance to allocate screening examinations, a random approach (RA) is the most appropriate reference for comparing approaches that are less extensive than full screening. The three proposed approaches will only provide added value in daily clinical practice if they require fewer screening examinations to achieve the predefined target detection rate tdr than screening allocated in the most uninformed, random way. The percentage of inefficient screening examinations potentially avoided by the approaches CA, CAIR, and ISA relative to the reference approach RA is calculated for all 27 simulated scenarios and three target detection rates. To ensure stability of the results and reduce stochastic uncertainty, each scenario was evaluated 100 times and results were averaged.
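The reference allocation and the savings metric can be sketched as follows (the examination counts are purely illustrative, not simulation results):

```python
import random

def random_approach(n_exams_total, n_patients=5000, n_years=5, seed=0):
    """RA sketch: allocate a fixed budget of screening examinations
    uniformly at random over all patient-year slots."""
    rng = random.Random(seed)
    slots = [(k, i) for k in range(n_patients) for i in range(1, n_years + 1)]
    return rng.sample(slots, n_exams_total)

# Percentage of avoided examinations relative to RA, given the number of
# examinations an informed approach and RA each need to reach the same
# tdr (illustrative numbers only).
n_exams_informed, n_exams_ra = 12000, 20000
saved_pct = 100 * (1 - n_exams_informed / n_exams_ra)
print(saved_pct)  # 40.0
```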
Clinical case study
Currently, few prediction tools exist that facilitate the calculation of time-dependent risks. One of them is the INFLUENCE-nomogram [13]. Based on a patient's age, tumour size, nodal involvement, grade, estrogen/progesterone status, multifocality, radiotherapy, chemotherapy, and endocrine therapy, it estimates conditional annual risks of developing a locoregional breast cancer recurrence (defined as reappearance of the tumour in the ipsilateral breast, chest wall or regional lymph nodes) within 5 years after diagnosis. The INFLUENCE-nomogram's algorithm is based on data of more than 37,000 Dutch patients diagnosed with early breast cancer between 2003 and 2006. Its external validity was recently demonstrated by applying it to a cohort of 6520 breast cancer patients diagnosed between 2000 and 2012 obtained from Tumorzentrum Regensburg (Institute for Quality Control and Health Services Research of University of Regensburg), a clinical cancer registry in Germany [19]. The same cohort is used in this study to examine what follow-up patterns and rates of missed locoregional recurrence (LRR) events might look like if the approaches CA, CAIR, and ISA were applied in clinical practice.
Following current guideline recommendations, the potential time points for screening/follow-up examinations were set to 1, 2, 3, 4, and 5 years, as in the previously described simulation analyses. While the parameters cvmr and r are fixed given the real individual risk predictions of INFLUENCE, an assumption concerning disease progression of locoregional breast cancer recurrences had to be made. Based on clinical evidence that early detection yields significant advantages in survival [14], progression was assumed to be at least intermediate and the parameter dp was again set to 0.3. Additionally, the target detection rate tdr was set to 90%. In contrast to the simulation analyses, the assumption of perfectly accurate predictions is not valid in a real-world example. Moreover, theoretical fractional disease occurrences do not exist in a single individual. Instead, the real recurrence events observed in the German population were counted as detected by the next follow-up examination that would have been assigned to the patient. Of course, there is no perfect examination procedure in breast cancer follow-up. Since the sensitivity of the usually employed (and sometimes combined) diagnostic procedures varies between 65% for mammography, around 90% for ultrasound, and up to 100% for MRI, an overall examination sensitivity of 80% was incorporated in the analyses [20]. To quantify the impact of such a screening examination, the real detection delays were determined and used to calculate a patient's chance to be fully cured.
For all analyses, R version 3.5.1 (R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org/) was used.