In this paper we have developed and implemented methods for estimating and projecting incidence, as well as for estimating the lifetime risk of a disease, based on observation of incident events in an observation window, i.e., what we termed cohort-of-cases data. The methodology yields non-parametric estimates comparable to those of a standard Nelson-Aalen analysis based on independent delayed entry, but it gives slightly better projections of incidence because it implicitly accounts for the unobserved past mortality among the untreated.

In its simplest form, i.e., assuming that both the birth process and the incidence are stationary, a simple non-parametric estimate of the age-of-onset distribution is obtained. When the birth process is instead considered known, this is taken into account by a weighted non-parametric estimate with weights based on the relative sizes of the relevant birth cohorts. Both approaches directly provide estimates of age-specific incidence as well as of lifetime risk, which are of considerable public health interest. Because the computational procedures developed are relatively fast, confidence intervals for the lifetime risk could be obtained by direct application of bootstrap methodology. We were, however, unable to provide confidence intervals for the projection of incidence.
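To make the combination of birth-cohort weighting and bootstrapping concrete, the following is a minimal sketch and not the estimator derived in this paper. It assumes, purely for illustration, that each incident case is weighted by the inverse of the size of its birth cohort and that the weighted counts are averaged over the length of the observation window. The function names `lifetime_risk` and `bootstrap_ci`, the data layout, and the toy numbers are all hypothetical. The percentile bootstrap resamples cases with the number of cases held fixed, which is a simplification.

```python
import random


def lifetime_risk(cases, births, window_years):
    """Illustrative inverse-birth-cohort-weighted lifetime risk (an assumption).

    cases: list of (onset_year, onset_age) pairs observed in the window.
    births: dict mapping birth year -> number of births in that year.
    window_years: length of the observation window in years.
    """
    total = 0.0
    for onset_year, onset_age in cases:
        birth_year = onset_year - onset_age
        total += 1.0 / births[birth_year]  # weight by relative cohort size
    return total / window_years


def bootstrap_ci(cases, births, window_years, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(cases)
    stats = []
    for _ in range(n_boot):
        resample = [cases[rng.randrange(n)] for _ in range(n)]
        stats.append(lifetime_risk(resample, births, window_years))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data (hypothetical): two birth cohorts of different sizes.
toy_births = {1950: 1000, 1941: 2000}
toy_cases = [(2000, 50), (2001, 60)] * 20  # 40 cases in a 2-year window
estimate = lifetime_risk(toy_cases, toy_births, window_years=2)
ci = bootstrap_ci(toy_cases, toy_births, window_years=2)
```

Because each bootstrap replicate only requires one pass over the cases, the cost is linear in the number of cases times the number of replicates, which is the kind of computational speed that makes direct bootstrapping feasible.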

As stated by Narayan *et al.* in 2003, lifetime risk of diabetes appears not to have been estimated prior to their paper [8], and only one subsequent paper has reported comparable estimates of lifetime risk [24]. The directly comparable estimates for the US population found in [8] are substantially higher (39% for females, 33% for males) than ours (14% for females, 16% for males). The two major reasons for the difference are a generally lower diabetes incidence in Denmark [4] and the fact that our estimates pertain only to pharmacologically treated diabetes. It would, however, be interesting to explore whether part of the difference is due to their use of the traditional method, as the traditional method applied to our material leads to an elevated estimate of lifetime risk of 16% for females and 18% for males. It is also interesting that the gender differences are in opposite directions in the two countries.

Several papers have used estimates of incidence to project the future burden of diabetes, most prominently [2, 5, 6]. For all three, it would be interesting to re-analyze their data, if possible, using our method for cohort-of-cases data, to see whether the discrepancy between the two analytical methods is similar to the one we have found, where the traditional method led to an inflated projection of the number of incident events of diabetes compared to the observed count.

For the theoretical developments, assumptions (S1) and (S2) have been crucial, but from an applied perspective the assumptions are very restrictive. In our application concerning diabetes, the assumptions are likely not satisfied, as it is questionable whether both age-specific incidence and age-specific mortality among diabetics have been constant since 1900. Rather, changes in incidence due to altered lifestyle, and changes in mortality due to improved treatment and general health, are plausible. Indeed, it is known that within the observation window between 1993 and 2003, statistically significant trends exist for both quantities [4]. Yet the predictions based on the developed model are at least as good as those based on the ordinary non-parametric method, showing the potential of the developed model. More work on relaxing the assumptions is, however, needed before the model can be used more generally.

Although we showed in principle how the stationarity assumption could be relaxed by formulating a full parametric likelihood, we did not give a detailed analysis of this situation due to its complexity. Also, the data considered in this paper are rather limited since, first, the observation window is short compared to typical disease duration, and second, no information is available on age of onset outside the observation window. As a result, we have been unable to allow for trends in incidence and mortality, the absence of which must be considered unrealistic. In some epidemiological settings it will, however, be possible to obtain data on age of onset for subjects prevalent at the start of the time window or for diseased subjects dying in the observation window [25]. While such information is valuable and needs to be incorporated in the analysis to allow relaxation of assumptions, it requires knowledge about the past mortality among diabetics. In contrast, we have tried to develop a methodology that relies only on observation of incident events and past birth rates, which are often easier to obtain. There is, however, a need for further research on the applicability and extensions of the method before its potential can be more clearly appreciated.