Analysing detection of chronic diseases with prolonged subclinical periods: modelling and application to hypertension in the U.S.
BMC Medical Research Methodology volume 19, Article number: 213 (2019)
Abstract
Background
We recently introduced a system of partial differential equations (PDEs) to model the prevalence of chronic diseases with a possibly prolonged state of asymptomatic, undiagnosed disease preceding a diagnosis. Common examples of such diseases include coronary heart disease, type 2 diabetes and cancer. Widespread application of the new method depends upon mathematical treatment of the system of PDEs.
Methods
In this article, we study the existence and uniqueness of the solution of the system of PDEs. To demonstrate the usefulness and importance of the system, we model the age-specific prevalence of hypertension in the US from 1999 to 2010.
Results
The examination of the mathematical properties provides a way to solve the system of PDEs by the method of characteristics. In the application to hypertension, we obtain good agreement between modelled and surveyed age-specific prevalences.
Conclusions
The described system of PDEs provides a practical way to examine the epidemiology of chronic diseases with a state of undiagnosed disease preceding a diagnosis.
Background
Chronic noncommunicable diseases (NCDs) have emerged as a major global burden, accounting for 40 million of the 56 million global deaths in 2016. About 18 million of those deaths were due to cardiovascular disease [1]. Although hypertension is an NCD in itself, it is also an important risk factor for cardiovascular disease, stroke and other chronic diseases such as kidney disease [2]. As hypertension is without symptoms at an early stage of the disease, an enormous number of people suffer from undiagnosed hypertension, which delays effective preventive treatment. For example, in a nationally representative survey in China, Gao et al. reported that more than 70 percent of men and women aged 20–44 years with hypertension did not have a diagnosis [3]. With a view to hypertension as a risk factor for chronic diseases, the World Health Organization has identified “detection, treatment and control of hypertension” as one of the objectives in the Global Action Plan for the Prevention and Control of NCDs [4].
Recently, we developed a four-state model to systematically examine how incidence and prevalence of undiagnosed chronic diseases and possible subsequent diagnosis are related [5]. The four-state model is an extended illness-death model, which additionally comprises a state of being undiagnosed before possibly transiting to the diagnosed state. The four-state model is related to a two-dimensional system of partial differential equations (PDEs). However, no rigorous analysis of the mathematical properties of the system of PDEs (e.g., classification of its type, and existence and uniqueness of its solution) has been published to facilitate its application.
In this paper, we prove the existence and uniqueness of the solution of the two-dimensional system of PDEs and then apply the system of PDEs to model the age-specific prevalence of undiagnosed and diagnosed hypertension in the US.
Methods
After a short derivation of the system of PDEs based on the four-state model, we use the method of characteristics to prove existence and uniqueness of the solution of the PDE. The method of characteristics is a classical tool for proving well-posedness of PDEs. It also opens a way to calculate this unique solution. Readers who are not familiar with PDEs may consult the introductory texts by Zachmanoglou & Thoe [6] and DuChateau & Zachmann [7].
We demonstrate the usefulness of the four-state model in modelling the prevalence of undiagnosed (p_{1}) and diagnosed (p_{2}) hypertension for different age groups in the period from 1999 to 2010. With reasonable assumptions about the incidence of hypertension and mortality data from the US, we show that the four-state model can achieve good agreement with the observed prevalence data on hypertension from the nationally representative National Health and Nutrition Examination Survey (NHANES) in the US. The assumptions about the incidence and mortality rates are detailed in the next section. We have to make (reasonable) assumptions instead of using published data because the required data are not available. In particular, the mortality of people with undiagnosed hypertension is difficult to survey.
In NHANES, hypertension was defined as systolic blood pressure ≥ 140 mm Hg or diastolic blood pressure ≥ 90 mm Hg, or being on antihypertensive medication. Age-specific prevalence of hypertension (p_{1}+p_{2}) has been reported for the years 1999 to 2010. Awareness of hypertension has also been surveyed. Awareness was defined as the fraction of the population with hypertension who have been informed of a hypertension diagnosis. Thus, awareness corresponds to the fraction \(\tfrac {p_{2}}{p_{1}+p_{2}}\). This information allows calculation of the age-specific prevalence of undiagnosed and diagnosed hypertension. It is not our aim to achieve the best possible fit between the modelled and the observed prevalences, but rather to show that a reasonable fit is easily possible. As we do not aim for the best fit, which could indeed be the subject of a paper of its own, readers should not be tempted to make inferences about the underlying epidemiological rates.
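The split of total prevalence into undiagnosed and diagnosed parts follows directly from the awareness fraction. A minimal Python sketch is given here; the function name `split_prevalence` and the numbers in the usage line are illustrative assumptions, not NHANES values.

```python
def split_prevalence(total_prevalence, awareness):
    """Split total prevalence p1 + p2 into (p1, p2).

    awareness is the fraction p2 / (p1 + p2), so
    p2 = awareness * (p1 + p2) and p1 is the remainder.
    """
    if not 0.0 <= total_prevalence <= 1.0:
        raise ValueError("total_prevalence must lie in [0, 1]")
    if not 0.0 <= awareness <= 1.0:
        raise ValueError("awareness must lie in [0, 1]")
    p2 = awareness * total_prevalence   # diagnosed
    p1 = total_prevalence - p2          # undiagnosed
    return p1, p2


# Illustrative numbers only: 30% total prevalence, 80% awareness.
p1, p2 = split_prevalence(0.30, 0.80)
```

Applying this function to each age group and survey year yields the two prevalence surfaces that the model is compared against.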
Results
The chronic disease model with four states
To analyse a population with respect to the chronic disease, we consider the compartment model from our previous work [5] as shown in Fig. 1. The model is the well-known illness-death model [8] with an additional state that comprises people with undiagnosed disease. The numbers N_{j}, j=0,1,2, as well as the transition rates shown in Fig. 1 depend on the calendar time \(t \in \mathbb {R}\) and on the age a, a∈[0,∞). N_{j}(t,a) denotes the number of people in state j, j=0,1,2, aged a at time t.
With the assumption that there is no migration, we have shown in [5] that the numbers N_{j} are solutions of the following system of partial differential equations (PDEs):
For brevity, we have written ∂_{t} for \(\tfrac {\partial }{\partial t}\) and ∂_{a} for \(\tfrac {\partial }{\partial a}\). In addition, we set N(t,a):=N_{0}(t,a)+N_{1}(t,a)+N_{2}(t,a) for the overall number of people aged a at time t.
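For orientation, a form of this system that is consistent with the definition of the overall mortality μ and with the prevalence equations can be written out as follows. It assumes that Fig. 1 has transitions from state 0 to state 1 at rate λ_{0} and from state 1 to state 2 at rate λ_{1}, with mortality rate μ_{j} out of state j; this structure is inferred from the surrounding text and should be checked against [5].

```latex
\begin{align}
(\partial_t + \partial_a)\, N_0 &= -(\lambda_0 + \mu_0)\, N_0, \tag{1}\\
(\partial_t + \partial_a)\, N_1 &= \lambda_0\, N_0 - (\lambda_1 + \mu_1)\, N_1, \tag{2}\\
(\partial_t + \partial_a)\, N_2 &= \lambda_1\, N_1 - \mu_2\, N_2. \tag{3}
\end{align}
% Summing the three equations: the transition terms cancel, leaving only mortality,
% (\partial_t + \partial_a) N = -(\mu_0 N_0 + \mu_1 N_1 + \mu_2 N_2).
```

Under these assumptions the transition terms cancel in the sum, so the total population N changes along t and a only through mortality.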
System of PDEs for the age-specific prevalence
In chronic disease epidemiology, it is common to consider the fractions of people who are in the disease states instead of their absolute numbers N_{j}. For this, set \(p_{j}(t,a):=\tfrac {N_{j}(t,a)}{N(t,a)}\) for j=0,1,2. By using
and defining the overall mortality μ=μ_{0}p_{0}+μ_{1}p_{1}+μ_{2}p_{2}=μ_{0}(1−p_{1}−p_{2})+μ_{1}p_{1}+μ_{2}p_{2}, we can deduce the following PDEs from Eqs. (2) and (3).
Instead of the three Eqs. (1) – (3), only two equations are necessary to describe the model in Fig. 1. The fraction p_{0} can be obtained from the equation p_{0}+p_{1}+p_{2}=1. Eqs. (4) and (5) define a two-dimensional system of linear PDEs.
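As a sketch of the intermediate step, the quotient rule applied to p_{j}=N_{j}/N gives the identity below, and substituting balance equations for N_{1} and N_{2} yields two equations for p_{1} and p_{2}. The derivation assumes transitions from state 0 to state 1 at rate λ_{0} and from state 1 to state 2 at rate λ_{1}, with mortality μ_{j} out of state j; the labels (4) and (5) are attached on that assumption.

```latex
% Quotient rule, using (\partial_t + \partial_a) N = -\mu N:
\[
(\partial_t + \partial_a)\, p_j
  = \frac{(\partial_t + \partial_a) N_j}{N} - p_j\, \frac{(\partial_t + \partial_a) N}{N}
  = \frac{(\partial_t + \partial_a) N_j}{N} + \mu\, p_j .
\]
% Assumed balance equations:
%   (\partial_t + \partial_a) N_1 = \lambda_0 N_0 - (\lambda_1 + \mu_1) N_1,
%   (\partial_t + \partial_a) N_2 = \lambda_1 N_1 - \mu_2 N_2.
% Dividing by N and replacing p_0 by 1 - p_1 - p_2:
\begin{align}
(\partial_t + \partial_a)\, p_1 &= -(\lambda_0 + \lambda_1 + \mu_1 - \mu)\, p_1 - \lambda_0\, p_2 + \lambda_0, \tag{4}\\
(\partial_t + \partial_a)\, p_2 &= \lambda_1\, p_1 - (\mu_2 - \mu)\, p_2. \tag{5}
\end{align}
```

Note that μ_{0} no longer appears explicitly: it enters only through the overall mortality μ.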
Remarks 1
We note that Eqs. (4) and (5) actually represent a nonlinear system, because μ=μ_{0}p_{0}+μ_{1}p_{1}+μ_{2}p_{2} itself depends on p_{1} and p_{2}. However, in practice the mortality rate μ of a population can be deduced from empirical data. Thus, we can treat μ as a given function and assume (4) and (5) to be linear.
With the definitions
Since I is the identity matrix, system (6) is hyperbolic [7].
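One way to write the definitions and the resulting system (6) explicitly is sketched below. The symbols A and c are assumed names, and the matrix entries presuppose that Eqs. (4) and (5) read (∂_{t}+∂_{a})p_{1} = −(λ_{0}+λ_{1}+μ_{1}−μ)p_{1} − λ_{0}p_{2} + λ_{0} and (∂_{t}+∂_{a})p_{2} = λ_{1}p_{1} − (μ_{2}−μ)p_{2}.

```latex
\[
\boldsymbol{p} := \begin{pmatrix} p_1 \\ p_2 \end{pmatrix}, \qquad
A := \begin{pmatrix}
  -(\lambda_0 + \lambda_1 + \mu_1 - \mu) & -\lambda_0 \\
  \lambda_1 & -(\mu_2 - \mu)
\end{pmatrix}, \qquad
\boldsymbol{c} := \begin{pmatrix} \lambda_0 \\ 0 \end{pmatrix},
\]
\[
I\, \partial_t \boldsymbol{p} + I\, \partial_a \boldsymbol{p} = A\, \boldsymbol{p} + \boldsymbol{c}. \tag{6}
\]
```

In this form both coefficient matrices of the derivatives equal the identity, which is the property the hyperbolicity statement relies on.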
Now we mimic the method of characteristics for systems of PDEs [6,9]. We consider the initial curve \(\mathbb {R} \ni t\mapsto \gamma (t) := (\gamma ^{(1)}(t), \gamma ^{(2)}(t)) := (t,0) \in \mathbb {R}^{2}\) and assume for the time being that λ_{j}, μ_{j}, and μ are sufficiently smooth. Then, we have
which shows that γ is not characteristic.
The characteristic curves y=(y^{(1)},y^{(2)}) in the ta-plane along γ are determined by [6,9]
This system is solved by \(y_{t_{0}}(s)=(t_{0}+s,s)\). Setting \(\psi (t_{0},s):=y_{t_{0}}(s)\), we obtain for its inverse ψ^{−1}(t,a)=(t−a,a). The solution of Eqs. (4) and (5) hence is given as
where V(t_{0},·) represents the solution of the ordinary differential equation (ODE)
where \(\boldsymbol {p}_{0}(t_{0})=\boldsymbol {p}(t_{0},0) \in \mathbb {R}^{2}\) is a given initial value.
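The chain of formulas in this passage can be sketched as follows. Here F(t,a,p) is an assumed shorthand for the right-hand side of Eqs. (4) and (5), and the labels (8) and (9) follow the references made in the proof of Theorem 1; the exact typeset form in the published article may differ.

```latex
% Characteristic curves through the initial curve \gamma(t_0) = (t_0, 0):
\[
\frac{\mathrm{d}}{\mathrm{d}s}\, y_{t_0}(s) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
y_{t_0}(0) = \begin{pmatrix} t_0 \\ 0 \end{pmatrix}
\quad\Longrightarrow\quad y_{t_0}(s) = (t_0 + s,\; s).
\]
% Solution of the PDE system in terms of V, using \psi^{-1}(t,a) = (t-a, a):
\[
\boldsymbol{p}(t, a) = V\bigl(\psi^{-1}(t, a)\bigr) = V(t - a,\; a). \tag{8}
\]
% ODE along each characteristic:
\[
\partial_s V(t_0, s) = F\bigl(t_0 + s,\; s,\; V(t_0, s)\bigr), \qquad
V(t_0, 0) = \boldsymbol{p}_0(t_0). \tag{9}
\]
```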
Before we utilize the outcome on existence and uniqueness of the solution of the PDE system (6) obtained so far, we describe the geometrical meaning of (9). Note that the calculation above motivates identifying a=s. Then, for \(t_{0} \in \mathbb {R}\) the line segment given by \(y_{t_{0}}(a) = (t_{0} + a, a), ~a\in [0, \infty)\), is a characteristic curve for (6). One of these line segments in the ta-plane is shown in Fig. 2. The line segment starts at (t,a)=(t_{0},0) and has slope 1. The line segment can be seen as the trajectory of a group of persons born at the same point in time t_{0} (birth cohort) which gradually grows older. In demography and less frequently also in epidemiology, such a representation of the ta-plane is called a Lexis diagram [10]. Line segments with slope 1 starting on the abscissa like the one depicted in Fig. 2 are called life lines [11]. Now, we notice that the system of ODEs (9) can be written in terms of p as
With this terminology, we see that system (10) describes the change of the prevalence p along the life lines in the Lexis diagram.
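The step from the PDE to an ODE along a life line is a direct consequence of the chain rule, sketched below. F(t,a,p) is an assumed shorthand for the right-hand side of Eqs. (4) and (5), and the label (10) is attached on the assumption that this is the system referred to above.

```latex
% Chain rule along the life line a \mapsto (t_0 + a, a):
\[
\frac{\mathrm{d}}{\mathrm{d}a}\, \boldsymbol{p}(t_0 + a,\; a)
  = (\partial_t \boldsymbol{p})(t_0 + a,\; a) + (\partial_a \boldsymbol{p})(t_0 + a,\; a),
\]
% so the PDE system reduces to an ODE in the single variable a:
\[
\frac{\mathrm{d}}{\mathrm{d}a}\, \boldsymbol{p}(t_0 + a,\; a)
  = F\bigl(t_0 + a,\; a,\; \boldsymbol{p}(t_0 + a,\; a)\bigr). \tag{10}
\]
```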
Next, we state the existence and uniqueness of the solution of the PDE system (6) as a theorem. We consider two kinds of initial curves depending on the domain S where the right-hand side of Eq. (6) is defined.
1. The domain S is the upper half-plane, i.e., \(S = \mathbb {R} \times [0, \infty)\). Then, the initial condition is given on the real line defined by a=0, i.e., for all \((t, a) \in \delta S : = \mathbb {R} \times \{0\}\). Here the initial condition reads
$$ \boldsymbol{p}(t,0)=\boldsymbol{p}_{0}(t)\quad (t\in\mathbb{R}). $$(11)
2. The domain S is the first quadrant, i.e., S=[0,∞)^{2}. Then, the initial condition is given on the union of two orthogonal half-lines, i.e., for all \((t, a) \in \delta S : = \bigl ([0, \infty) \times \{0\} \bigr) \cup \bigl (\{0\} \times [0, \infty) \bigr)\). In this case the initial conditions are given as
$$ \boldsymbol{p}(t, 0) = \boldsymbol{p}_{0}(t)\quad (t>0)\quad\text{and}\quad \boldsymbol{p}(0, a) = \boldsymbol{p}_{1}(a)\quad (a>0). $$(12)
Theorem 1
Let the rates \(\lambda _{i}, \mu _{j}, \mu : \bar S \rightarrow [0, \infty)\) be continuously differentiable for i=0,1 and j=1,2, where \(\bar S\) denotes the closure of S. Furthermore, let p_{0}:[0,∞)→[0,1]^{2} be continuously differentiable. In case that S is the first quadrant, also assume that p_{1}:[0,∞)→[0,1]^{2} is continuously differentiable and that the compatibility conditions p_{0}(0)=p_{1}(0) and p_{0}′(0)=p_{1}′(0) are satisfied. Then the system (6) with initial condition (11), if S is the upper half-plane, or with initial condition (12), if S is the first quadrant, has a unique continuously differentiable solution \(\boldsymbol {p}:\bar S\to \mathbb {R}^{2}\).
Proof
In the same way as in (7), it can be seen that the initial hyperplane δS is also not characteristic in the case that S is the first quadrant. Due to the given assumptions on the data, the system (9) is uniquely solvable by the Picard–Lindelöf theorem. Thus, the solution of (6) can be constructed as demonstrated above, that is, it is given by (8) and has the claimed regularity. □
The equivalence between the systems (6) and (10) points to a possible way to calculate the unique solution of system (6) with initial condition (11) or (12), respectively. For this purpose, classical numerical methods for systems of ODEs such as the Runge-Kutta method can be used [12]. This will be demonstrated in the next section.
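A minimal Python sketch of this idea integrates the prevalence along a single life line with the classical fourth-order Runge-Kutta scheme. It is not the authors' R code from Additional file 1. The right-hand side assumes that Eqs. (4) and (5) read (∂_{t}+∂_{a})p_{1} = −(λ_{0}+λ_{1}+μ_{1}−μ)p_{1} − λ_{0}p_{2} + λ_{0} and (∂_{t}+∂_{a})p_{2} = λ_{1}p_{1} − (μ_{2}−μ)p_{2}; the function names and the constant rates in the usage lines are hypothetical.

```python
import math


def prevalence_rhs(t, a, p, rates):
    """Assumed right-hand side of Eqs. (4) and (5) at the point (t, a)."""
    lam0, lam1, mu1, mu2, mu = (rate(t, a) for rate in rates)
    p1, p2 = p
    dp1 = -(lam0 + lam1 + mu1 - mu) * p1 - lam0 * p2 + lam0
    dp2 = lam1 * p1 - (mu2 - mu) * p2
    return (dp1, dp2)


def _shift(p, k, s):
    """Return p + s * k for 2-tuples."""
    return (p[0] + s * k[0], p[1] + s * k[1])


def rk4_along_life_line(t0, a_end, p_start, rates, n_steps=200):
    """Integrate p along the life line a -> (t0 + a, a) from a = 0 to a_end."""
    h = a_end / n_steps
    p = tuple(p_start)
    for i in range(n_steps):
        a = i * h
        k1 = prevalence_rhs(t0 + a, a, p, rates)
        k2 = prevalence_rhs(t0 + a + h / 2, a + h / 2, _shift(p, k1, h / 2), rates)
        k3 = prevalence_rhs(t0 + a + h / 2, a + h / 2, _shift(p, k2, h / 2), rates)
        k4 = prevalence_rhs(t0 + a + h, a + h, _shift(p, k3, h), rates)
        p = (p[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             p[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return p


def constant(value):
    """Hypothetical helper: a rate that does not depend on t or a."""
    return lambda t, a: value


# Hypothetical constant rates (lam0, lam1, mu1, mu2, mu); no excess mortality.
rates = (constant(0.01), constant(0.05), constant(0.012),
         constant(0.012), constant(0.012))
p1, p2 = rk4_along_life_line(t0=1950.0, a_end=50.0, p_start=(0.0, 0.0), rates=rates)
```

With equal mortality in all states, the total prevalence p_{1}+p_{2} obeys a scalar ODE with the closed-form solution 1 − exp(−λ_{0}a) when it starts from zero, which gives a simple check of the integrator.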
Undiagnosed and diagnosed hypertension
Figure 3 shows the prevalence of undiagnosed (left) and diagnosed hypertension (right) in the age range 18–70 years during the years 1999–2010 as surveyed in NHANES [13].
As in the Lexis diagram, the abscissa and ordinate represent the calendar time (t) and the age (a), respectively. The colour and the contour lines indicate the prevalence (in percent). For instance, the prevalence of undiagnosed hypertension for 60-year-old people in the year 2000 was about 14%. In 2006, the prevalence of undiagnosed hypertension had decreased to about 10% for people aged 60. During the same period, the prevalence of diagnosed hypertension for 60-year-old people increased from slightly less than 35% to about 40%.
Now, we calculate the unique solution of the system (6) for (t,a)∈[1999,2010]×[0,70]. As initial condition, we chose p(t,0)=(0,0) for all t≥1999 and p(1999,a)=p_{0}(a) for all a≥0. Here, p_{0}(a) is the age-specific prevalence as surveyed in 1999 [13]. Thus, the initial condition is given on two half-lines. The mortality rate μ of the US general population for the period 1999–2010 has been taken from the Human Mortality Database [14]. For the mortality rates in the states Undiagnosed Hypertension (μ_{1}) and Diagnosed Hypertension (μ_{2}), we assume μ_{j}=R_{j} μ, j=1,2, where R_{1}=1.1 and R_{2}=1.2, respectively. Currently, there are no data about mortality of people with undiagnosed and diagnosed hypertension compared to the general population. Based on NHANES data, values between 1.09 and 1.49 have been reported for untreated hypertension compared to controlled hypertension [15]. Thus, the magnitude of our choice seems reasonable. However, the exposure states in [15] are defined differently from the states in our model (see Fig. 1). Moreover, we believe that these values are slightly overestimated because the study design of [15] cannot take into account possible changes from Untreated Hypertension to Controlled Hypertension after baseline. Hence, people untreated at baseline may later be treated and may thus have a reduced mortality with this treatment. The incidence rates λ_{0} and λ_{1} have been determined by decomposing these rates into a time-dependent factor \(\lambda ^{(T)}_{j}\) and an age-dependent factor \(\lambda ^{(A)}_{j}\) [16]:
Although there are systematic ways to estimate the rates λ_{j}, j=0,1, as described in [5], we only made coarse guesses for \(\lambda ^{(A)}_{j}\) and \(\lambda ^{(T)}_{j}, ~j=0,1,\) such that the modelled prevalence approximates the surveyed prevalence (see Fig. 3). The source code for the calculations, which runs in the freely available statistical software R (The R Foundation for Statistical Computing), is given as Additional file 1.
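The construction of the rates can be sketched in Python as follows. The sketch assumes that the decomposition is multiplicative, λ_{j}(t,a)=λ^{(T)}_{j}(t)·λ^{(A)}_{j}(a), and uses μ_{j}=R_{j}μ with R_{1}=1.1 and R_{2}=1.2 as stated in the text. The shapes of the time factor, the age factor and the general mortality below are hypothetical placeholders, not the guesses used in Additional file 1 and not Human Mortality Database values.

```python
import math


def product_incidence(time_factor, age_factor):
    """Build lambda(t, a) = time_factor(t) * age_factor(a) (assumed product form)."""
    def rate(t, a):
        return time_factor(t) * age_factor(a)
    return rate


def scaled_mortality(general_mortality, ratio):
    """Build mu_j(t, a) = R_j * mu(t, a) for a fixed mortality ratio R_j."""
    def rate(t, a):
        return ratio * general_mortality(t, a)
    return rate


# Hypothetical placeholder shapes, for illustration only.
def general_mortality(t, a):
    return math.exp(-10.0 + 0.085 * a)      # Gompertz-type increase with age


def lam0_time(t):
    return 1.0 - 0.01 * (t - 1999.0)        # slow decline over calendar time


def lam0_age(a):
    return 0.0 if a < 18.0 else 0.0005 * (a - 18.0)   # linear increase from age 18


lam0 = product_incidence(lam0_time, lam0_age)
mu1 = scaled_mortality(general_mortality, 1.1)   # R_1 = 1.1 as in the text
mu2 = scaled_mortality(general_mortality, 1.2)   # R_2 = 1.2 as in the text
```

Keeping each rate as a function of (t, a) lets the same objects be passed unchanged to an ODE solver that steps along life lines.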
Figure 4 shows the modelled prevalence that has been obtained by solving the initial value problem described above. After transforming the two-dimensional PDE (6) with initial condition into the corresponding initial value problem of the ODE (10), the classical Runge-Kutta method of fourth order has been applied to calculate p(t,a) for (t,a)∈[1999,2010]×[0,70].
Overall we see a good agreement between the surveyed and the modelled prevalence. For a direct comparison we plot the surveyed and the modelled age-specific prevalence for the year t=2010 in Fig. 5.
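Because the initial condition is given on two half-lines, each target point (t, a) must first be traced back along its life line to the boundary. A small Python sketch of that bookkeeping is shown here; the function names and the constant baseline in the usage lines are hypothetical.

```python
def life_line_start(t, a, t_min=1999.0):
    """Trace (t, a) back along its life line (slope 1) to the boundary.

    Returns (t_start, a_start): either a point on the half-line a = 0
    (cohort born at or after t_min) or a point on the half-line t = t_min.
    """
    if a < 0.0 or t < t_min:
        raise ValueError("(t, a) must satisfy t >= t_min and a >= 0")
    elapsed = t - t_min
    if a <= elapsed:
        return (t - a, 0.0)          # life line starts at birth
    return (t_min, a - elapsed)      # life line starts in the baseline year


def initial_value(t, a, baseline_prevalence, t_min=1999.0):
    """Initial prevalence (p1, p2) at the start of the life line through (t, a)."""
    _, a_start = life_line_start(t, a, t_min)
    if a_start == 0.0:
        return (0.0, 0.0)            # p(t, 0) = (0, 0): disease-free at birth
    return baseline_prevalence(a_start)


# Hypothetical baseline: the same prevalence at every age, for illustration only.
start_young = life_line_start(2010.0, 5.0)
start_old = life_line_start(2005.0, 60.0)
p_init = initial_value(2005.0, 60.0, lambda age: (0.1, 0.3))
```

For the domain considered here, every person older than t − 1999 at time t starts on the baseline half-line t=1999, so most of the surface in [1999,2010]×[0,70] is driven by the surveyed 1999 prevalence.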
Discussion
In this article, we have proven the existence and uniqueness of the solution of a recently published system of PDEs that describes the prevalence of undiagnosed and diagnosed chronic diseases. The proof uses the method of characteristics to transform the initial value problem of the PDE into an associated initial value problem of an ODE. Apart from the theoretical considerations, the method of characteristics provides a practical way to calculate the unique solution of the initial value problem. We have demonstrated this method in an example about hypertension in the US. The solution of the initial value problem agrees well with the observed prevalence data of hypertension obtained from a representative sample of the US population. Undiagnosed hypertension is a problem in the US and many other populations, because it is a risk factor for several severe health conditions such as stroke, cardiovascular disease and kidney disease.
In epidemiological applications of the proposed framework, input data are usually subject to statistical uncertainties, e.g., due to possible sampling errors. In order to solve the system of PDEs in the presence of uncertainty, we suggest using a multidimensional probabilistic approach, which randomly samples from the probability distributions of the input parameters, solves the PDEs (4) and (5) based on these samples, and then assesses the distribution of the results. The underlying ideas are detailed by Oakley and O’Hagan [17] and have been successfully applied in a public health setting [18].
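The sampling idea can be sketched as a plain Monte Carlo loop in Python. This is a simplified illustration, not the method of [17]: the input distributions, the constant rates and the function names are all hypothetical, and the right-hand side assumes that Eqs. (4) and (5) read (∂_{t}+∂_{a})p_{1} = −(λ_{0}+λ_{1}+μ_{1}−μ)p_{1} − λ_{0}p_{2} + λ_{0} and (∂_{t}+∂_{a})p_{2} = λ_{1}p_{1} − (μ_{2}−μ)p_{2} with μ_{j}=R_{j}μ.

```python
import random
import statistics


def solve_life_line(lam0, lam1, mu, r1, r2, age_end, n_steps=200):
    """RK4 for the prevalence ODE along one life line with constant rates."""
    def f(p):
        p1, p2 = p
        return (-(lam0 + lam1 + (r1 - 1.0) * mu) * p1 - lam0 * p2 + lam0,
                lam1 * p1 - (r2 - 1.0) * mu * p2)

    def shift(p, k, s):
        return (p[0] + s * k[0], p[1] + s * k[1])

    h = age_end / n_steps
    p = (0.0, 0.0)                      # disease-free at birth
    for _ in range(n_steps):
        k1 = f(p)
        k2 = f(shift(p, k1, h / 2))
        k3 = f(shift(p, k2, h / 2))
        k4 = f(shift(p, k3, h))
        p = (p[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             p[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return p


def sample_prevalence(n_samples=500, seed=1):
    """Draw inputs from hypothetical distributions and solve once per draw."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        lam0 = rng.uniform(0.008, 0.012)   # hypothetical incidence range
        r1 = rng.uniform(1.0, 1.2)         # hypothetical range around R_1
        r2 = rng.uniform(1.1, 1.3)         # hypothetical range around R_2
        draws.append(solve_life_line(lam0, 0.05, 0.01, r1, r2, age_end=60.0))
    return draws


draws = sample_prevalence()
p2_sorted = sorted(p2 for _, p2 in draws)
summary = {
    "mean_p2": statistics.mean(p2_sorted),
    "lower_p2": p2_sorted[int(0.025 * len(p2_sorted))],
    "upper_p2": p2_sorted[int(0.975 * len(p2_sorted)) - 1],
}
```

The spread between `lower_p2` and `upper_p2` then indicates how strongly the modelled diagnosed prevalence at age 60 reacts to the assumed input uncertainty.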
Our work has several advantages and disadvantages. On the one hand, the disease model is relatively generic and can be applied to any chronic disease with a considerable state of undiagnosed disease. No assumptions about the form of the involved transition rates in the model have been made. In this way, the model is nonparametric.
In its current form, the model assumes that there is no migration from or into the considered population, which might be seen as a drawback. However, additional rates representing immigration or emigration can be added to Eqs. (4) and (5) following the corresponding considerations as in the normal illness-death model (without the undiagnosed state) [19]. Another drawback is that some of the epidemiological figures of the disease model are difficult to estimate in practice. While the age-specific prevalence of undiagnosed and diagnosed hypertension can easily be surveyed by cross-sectional studies, estimation of the mortality rates for undiagnosed and diagnosed hypertension is difficult. The study design of NHANES includes a linkage with the US mortality register. However, changes of the hypertension status between the NHANES examination and death (from no hypertension to undiagnosed hypertension, from undiagnosed to diagnosed hypertension) cannot be taken into account. This possibly leads to a misclassification error where death cases are attributed to the wrong disease state. A theoretical alternative might be a cohort study to assess the mortality of undiagnosed hypertension (μ_{1}). However, keeping the information of survey-detected hypertension secret from a study subject without previous diagnosis of hypertension would be unethical. For our purpose of giving a demonstration about a possible application, we have made reasonable assumptions about the mortality rates μ_{1} and μ_{2} from the hypertension states.
The aim of our application to hypertension was to demonstrate usefulness of the disease model and the associated PDEs. Obtaining the highest degree of consistency between our modelled prevalence and the surveyed prevalence was not intended. Hence, the results should be used carefully for drawing conclusions about public health relevant questions.
The four-state model and the associated PDEs have a variety of possible applications. For example, the model may help to understand which age groups should be taken special care of with respect to detection. When the model is stratified by subgroups of the considered population, e.g., by ethnicity, education, socioeconomic position etc., decision makers may obtain information about especially vulnerable parts of the population. This may form the basis for potential screening and intervention programmes. The impact of a potential screening programme for hypertension and other chronic diseases with prolonged states of undiagnosed disease such as coronary heart disease or cancer may be analysed in advance.
Another straightforward application of the four-state model and the associated PDE would be a prediction of future prevalence of undiagnosed and diagnosed hypertension using what-if scenarios. For example, it is possible to predict the consequences of different future time trends of the incidence of hypertension.
Finally, the model may help to analyse temporal trends of transition rates λ_{0} and λ_{1} between the states, which has been demonstrated in [5]. This question is important for assessing the quality of case-finding in the epidemiology of chronic diseases. Usually, prevalence-based measures have been used for assessing case-finding [20]. However, we have shown recently that measures based on transition rates are more reliable [21].
Conclusions
In this article we have shown the existence and uniqueness of the solution of a system of partial differential equations that describes an extended illness-death model. Based on the usual illness-death model, a state of undiagnosed disease has been added, which can be used to model chronic diseases with a (possibly) prolonged undiagnosed state preceding a diagnosis. As an example, we applied the model to hypertension in the US.
Availability of data and materials
The data used in this article were published in [13]. No further collection of individual persons’ data has been accomplished. All results can be reproduced from the source files provided as Additional file 1. The source files can be run with the freely available statistical software R (The R Foundation for Statistical Computing).
Abbreviations
NCD: noncommunicable disease
NHANES: National Health and Nutrition Examination Survey
ODE: ordinary differential equation
PDE: partial differential equation
References
1. Steel N. Global, regional, and national age-sex specific mortality for 264 causes of death, 1980–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2017; 390(10100):1151–210.
2. Bakris GL, Ritz E. The message for World Kidney Day 2009: hypertension and kidney disease: a marriage that should be prevented. J Clin Hypertens. 2009; 11(3):144–7.
3. Gao Y, Chen G, Tian H, Lin L, Lu J, Weng J, Jia W, Ji L, Xiao J, Zhou Z, Ran X, Ren Y, Chen T, Yang W, for the China National Diabetes and Metabolic Disorders Study Group. Prevalence of hypertension in China: a cross-sectional study. PLoS ONE. 2013; 8:65938. https://doi.org/10.1371/journal.pone.0065938.
4. World Health Organization. Global action plan for the prevention and control of noncommunicable diseases 2013–2020. 2013. http://apps.who.int/iris/bitstream/10665/94384/1/9789241506236_eng.pdf. Accessed 19 June 2018.
5. Brinks R, Bardenheier BH, Hoyer A, Lin J, Landwehr S, Gregg EW. Development and demonstration of a state model for the estimation of incidence of partly undetected chronic diseases. BMC Med Res Methodol. 2015; 15(1):98. https://doi.org/10.1186/s12874-015-0094-y.
6. Zachmanoglou EC, Thoe DW. Introduction to Partial Differential Equations with Applications, Dover Books on Mathematics. Mineola: Dover Publications; 1986.
7. DuChateau P, Zachmann D. Applied Partial Differential Equations, Dover Books on Mathematics. Mineola: Dover Publications; 2012.
8. Kalbfleisch J, Prentice R. The Statistical Analysis of Failure Time Data, 2nd edn. Hoboken: Wiley; 2002.
9. Evans LC. Partial Differential Equations. Providence: American Mathematical Society; 2002.
10. Keiding N. Statistical inference in the Lexis diagram. Philos Trans R Soc Lond A Math Phys Eng Sci. 1990; 332(1627):487–509.
11. Keiding N. Event history analysis and the cross-section. Stat Med. 2006; 25(14):2343–64.
12. Dahlquist G, Björck A. Numerical Methods. Englewood Cliffs: Prentice-Hall; 1974.
13. Guo F, He D, Zhang W, Walton RG. Trends in prevalence, awareness, management, and control of hypertension among United States adults, 1999 to 2010. J Am Coll Cardiol. 2012; 60(7):599–606.
14. University of California, Max Planck Institute for Demographic Research. Human Mortality Database. 2017. www.mortality.org. Accessed 19 June 2018.
15. Gu Q, Dillon CF, Burt VL, Gillum RF. Association of hypertension treatment and control with all-cause and cardiovascular disease mortality among US adults with hypertension. Am J Hypertens. 2010; 23(1):38–45.
16. Ades A, Nokes D. Modeling age- and time-specific incidence from seroprevalence: toxoplasmosis. Am J Epidemiol. 1993; 137(9):1022–34.
17. Oakley JE, O’Hagan A. Probabilistic sensitivity analysis of complex models: a Bayesian approach. J R Stat Soc Ser B Stat Methodol. 2004; 66(3):751–69.
18. Brinks R, Hoyer A, Kuss O, Rathmann W. Projected effect of increased active travel in German urban regions on the risk of type 2 diabetes. PLoS ONE. 2015; 10(4):0122145.
19. Brinks R, Landwehr S. Age- and time-dependent model of the prevalence of non-communicable diseases and application to dementia in Germany. Theor Popul Biol. 2014; 92:62–8.
20. Gregg EW, Cadwell BL, Cheng YJ, Cowie CC, Williams DE, Geiss L, Engelgau MM, Vinicor F. Trends in the prevalence and ratio of diagnosed to undiagnosed diabetes according to obesity levels in the US. Diabetes Care. 2004; 27(12):2806–12.
21. Brinks R, Hoyer A, Rolka DB, Kuss O, Gregg EW. Comparison of surveillance-based metrics for the assessment and monitoring of disease detection: simulation study about type 2 diabetes. BMC Med Res Methodol. 2017; 17(1):54. https://doi.org/10.1186/s12874-017-0328-2.
Acknowledgements
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Funding
The authors have not received any funding for any aspect of this work.
Author information
Contributions
RB developed the differential equations, set up and analysed the simulation, and drafted the manuscript according to an initial idea of EWG. JS critically revised the mathematical foundations of the article based on a thesis of SK. AH critically revised the manuscript. All authors critically revised the text, gave important intellectual contributions and final approval of the version to be published.
Corresponding author
Correspondence to Ralph Brinks.
Ethics declarations
Ethics approval and consent to participate
The data used in this article were published in [13]. No further collection of data or material has been accomplished. Thus, ethics approval and consent to participate are not necessary for this work.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Brinks, R., Kaufmann, S., Hoyer, A. et al. Analysing detection of chronic diseases with prolonged subclinical periods: modelling and application to hypertension in the U.S. BMC Med Res Methodol 19, 213 (2019). doi:10.1186/s12874-019-0845-2
Keywords
 Compartment model
 Incidence
 Prevalence
 NHANES