 Research article
 Open access
 Published:
Monte Carlo simulationbased estimation for the minimum mortality temperature in temperaturemortality association study
BMC Medical Research Methodology volumeÂ 17, ArticleÂ number:Â 137 (2017)
Abstract
Background
Rich literature has reported that there exists a nonlinear association between temperature and mortality. One important feature in the temperaturemortality association is the minimum mortality temperature (MMT). The commonly used approach for estimating the MMT is to determine the MMT as the temperature at which mortality is minimized in the estimated temperaturemortality association curve. Also, an approximate bootstrap approach was proposed to calculate the standard errors and the confidence interval for the MMT. However, the statistical properties of these methods were not fully studied.
Methods
Our research assessed the statistical properties of the previously proposed methods in various types of the temperaturemortality association. We also suggested an alternative approach to provide a point and an interval estimates for the MMT, which improve upon the previous approach if some prior knowledge is available on the MMT. We compare the previous and alternative methods through a simulation study and an application. In addition, as the MMT is often used as a reference temperature to calculate the cold and heatrelated relative risk (RR), we examined how the uncertainty in the MMT affects the estimation of the RRs.
Results
The previously proposed method of estimating the MMT as a point (indicated as Argmin2) may increase bias or mean squared error in some types of temperaturemortality association. The approximate bootstrap method to calculate the confidence interval (indicated as Empirical1) performs properly achieving near 95% coverage but the length can be unnecessarily extremely large in some types of the association. We showed that an alternative approach (indicated as Empirical2), which can be applied if some prior knowledge is available on the MMT, works better reducing the bias and the mean squared error in point estimation and achieving near 95% coverage while shortening the length of the interval estimates.
Conclusions
The Monte Carlo simulationbased approach to estimate the MMT either as a point or as an interval was shown to perform well particularly when some prior knowledge is incorporated to reduce the uncertainty. The MMT uncertainty also can affect the estimation for the MMTreferenced RR and ignoring the MMT uncertainty in the RR estimation may lead to invalid results with respect to the bias in point estimation and the coverage in interval estimation.
Background
Ambient temperature has been shown to be a risk factor for mortality in numerous epidemiological studies [1,2,3,4,5]. Researches have reported that there exists a nonlinear association between temperature and mortality, characterized by U or J shaped association [1,2,3]. One important feature in the temperaturemortality association is the minimum mortality temperature (MMT), which is defined as the temperature at which the lowest mortality is achieved. The MMT has been regarded as a threshold point in describing the population susceptibility to temperature [5] as mortality increases with temperature increasing or decreasing from the MMT. Therefore, the MMT is often used as a reference temperature to quantify the relative risk (RR) related to cold or hot temperatures in many previous studies [1, 5].
Despite the importance of the MMT, little research has been conducted on statistical inference on the MMT. A recently proposed approach for estimating the MMT in a nonlinear temperaturemortality association is to determine the MMT as the temperature at which mortality is minimized in the estimated temperaturemortality association curve [1, 5]. This approach provides a point estimate but the corresponding uncertainty is not quantified. Another study [6] proposed an approximate bootstrap approach to calculate the standard errors and the confidence interval for the MMT. The study applied the method to the data for 52 cities in Spain and showed that the uncertainty can be small or large depending on the estimated association pattern which varies among cities.
The statistical properties of the previously proposed methods were not fully studied. Our research aims to assess these methods in various types of the temperaturemortality association via a simulation study. Then, we suggest an alternative approach to provide a point and an interval estimates for the MMT, which may improve upon the previous approach if some prior knowledge is incorporated for the potential range of the MMT. We compare the previous and alternative methods through a simulation study and an application. Additionally, as the MMT is often used as a reference temperature to calculate the cold and heatrelated RRs [1], we assess how the uncertainty in the MMT affects the estimation of the RRs.
In Methods section, we describe (1) how we model the temperaturemortality association, (2) the previous and alternative methods to calculate a point and an interval estimates for the MMT, (3) the design of the simulation study, and (4) the modeling details of the US data analysis. In Results section, we report the results from the simulation study and the data analysis for the 135 US cities. We included discussions and conclusions in the two final sections.
Methods
Modeling the temperaturemortality association
Let Y _{ t } be the daily death count on day t, with tâ€‰=â€‰1â€‰,â€‰â€¦â€‰,â€‰N, and x _{ t }â€‰=â€‰(x _{ t },â€‰x _{ tâ€‰âˆ’â€‰1},â€‰â€¦,â€‰x _{ tâ€‰âˆ’â€‰L })^{â€²} be the vector of daily mean temperatures on day t and over the previous L days. We model the association between Y _{ t } and x _{ t } using a generalized linear model (GLM) with a quasiPoisson family.
where Î¼ _{ t } is the expected death count on day t, s(Â·) is a flexible function characterized by parameter Î· to depict the effects of temperature, u _{ jt } is the jth confounding variable measured on day t, h _{ j }(Â·) is a flexible function to represent the effects of jth confounding variable, and Î³ _{ j } is the corresponding parameter. We assume a quasiPoisson family to allow for overdispersion (meaning that the variance of the outcome counts is higher than predicted under a Poisson distribution) which is a wellknown feature observed in the timeseries analysis for temperaturemortality association [7]. For s(), we use the distributed lag nonlinear model (DLNM) [7] to describe the nonlinear and lagged dependency as has been used in many of the previous studies [1, 2, 5,6,7,8]. In DLNM, a crossbasis is specified for temperature and lag. Let \( {\phi}_1\left(\cdotp \right),\cdots, {\phi}_{v_x}\left(\cdotp \right) \) be the basis to describe the temperaturemortality association and \( {\psi}_1\left(\cdotp \right),\cdots, {\psi}_{v_l}\left(\cdotp \right) \) be the basis to depict the lagmortality association. The DLNM is expressed as.
where r _{ tj }â€‰=â€‰(Ï• _{ j }(x _{ t }),â€‰â€‰â‹¯â€‰,â€‰Ï• _{ j }(x _{ tâ€‰âˆ’â€‰L }))^{â€²} is the vector of x _{ t } transformed through the jth basis Ï• _{ j }(Â·) in the temperature dimension and c _{ k }â€‰=â€‰(Ïˆ _{ k }(0),â€‰â€‰â‹¯â€‰,â€‰Ïˆ _{ k }(L))^{â€²} is the vector derived by applying the kth basis Ïˆ _{ k }(Â·) for lag dimension to the vector (0,â€‰â€‰â‹¯â€‰,â€‰L)^{â€²}. Then, \( \boldsymbol{\eta} ={\left({\eta}_{11},\dots, {\eta}_{v_x{v}_l}\right)}^{\prime } \) is the vector of coefficients for the crossbasis with the dimension v _{ x }â€‰Ã—â€‰v _{ l }. Different choices of the basis can be considered in DLNM and model selection criteria such as QAIC or QBIC (Akaike and Bayesian information criteria for models with overdispersed outcomes fitted through quasilikelihood) can be used to determine an optimal choice [8].
In order to estimate the lagcumulated temperaturemortality association, Î· is reduced through the following transformation [9].
where \( \mathbf{M}={1}_{\left(L+1\right)}^{\prime}\mathbf{C}\bigotimes {\mathbf{I}}_{\left({v}_x\right)} \) is a reducing matrix, \( \boldsymbol{\beta} ={\left({\beta}_1,{\beta}_2,\dots, {\beta}_{v_x}\right)}^{\prime } \) is the reduced parameter, and V(Î²) is the associated error (co)variance matrix. In M,
and â¨‚ is the notation of the Kronecker product. Then, Î² is the parameter to describe the temperaturemortality association cumulated over the lags.
Estimating the minimum mortality temperature (MMT)
First, we describe the previously proposed approach to estimate the MMT [6]. Let \( \widehat{\ \boldsymbol{\beta}} \) be the maximum likelihood estimate obtained from model (1) through (3). Given \( \widehat{\ \boldsymbol{\beta}} \), the previously proposed point estimate for the MMT is a solution of \( \underset{x}{\mathrm{argmin}\Big(}{\boldsymbol{Q}}_x\widehat{\boldsymbol{\beta}}\Big) \) where \( {\boldsymbol{Q}}_x=\left({\phi}_1(x),\cdots, {\phi}_{v_x}(x)\right) \)is the vector of basis variables by applying the basis for temperature to a particular temperature value x, and x ranges from the minimum to maximum temperatures observed in the data. The solution can be the minimum or maximum temperature, in which case it has been suggested to constrain the solution within the 1st â€“ 99th percentiles of the temperature. To quantify the uncertainty, an approximate bootstrap method was proposed to derive the empirical distribution of the MMT. Based on the maximum likelihood principle [10], if the sample size is sufficiently large, it can be assumed that the true Î² follows a multivariate normal distribution with the mean as the estimate (\( \widehat{\boldsymbol{\beta}} \)) and the variance as the corresponding error (co)variance (\( \mathrm{V}\left(\widehat{\boldsymbol{\beta}}\right) \)) [11,12,13]. Then, one can simulate the true Î² and the true MMT through the following procedure.
where (i) indicates ith simulated sample, Î² _{(i)} are independent and identically distributed sample, and Î¸ _{(i)} are the samples to approximate the empirical distribution of the true MMT. Then, based on the empirical distribution of the MMT, it was proposed to use the empirical percentiles (i.2., 2.5th  97.5th) as an interval estimate for the MMT (i.e., 95% confidence interval (CI)).
Now, we describe an alternative procedure to estimate the MMT, which may improve upon the previous method when a prior knowledge is available on the MMT. In the previous approach, the empirical distribution for the MMT is determined by the multivariate normal distribution with mean (\( \widehat{\boldsymbol{\beta}} \)) and (co)variance (\( \mathrm{V}\left(\widehat{\boldsymbol{\beta}}\right) \)), and thus the uncertainty for the MMT tends to be large if \( \mathrm{V}\left(\widehat{\boldsymbol{\beta}}\right) \)is large. In such case, adding some restrictions for the MMT distribution based on a prior knowledge may reduce the uncertainty. Applying a Bayesian inferential framework, we specify a prior distribution for the MMT and combine it with the sampling procedure (5). That way, a posterior distribution for the MMT is derived as a tradeoff between the prior knowledge and the information in the data. In the context of the MMT, a realistic prior would be a Uniform distribution with a support (Î± _{1}â€‰,â€‰Î± _{2}) representing a plausible range of the MMT. The support can vary depending on the level of informativity of prior knowledge (e.g., minimally informative prior range: 1st â€“ 99th percentiles of observed temperature distribution or strongly informative range: 50th 70th percentiles). With such prior assumption, the posterior distribution can be obtained through the sampling procedure (5) by discarding the samples of Î¸ _{(i)} which do not fall within the range of (Î± _{1},Î± _{2}). That is,
Then, the empirical mean (or median) and percentiles (e.g., 2.5th  97.5th) can serve as a point and an interval estimates for the MMT. The empirical distribution of the MMT is often not symmetric but skewed, and in such case, the choice of percentiles may be adjusted depending on the shape of the empirical distribution (e.g., 0th â€“ 95th percentiles for a highly rightskewed case).
Estimating the relative risk (RR) accounting for the uncertainty of MMT
Here, we describe how we estimate an RR with the MMT used as a reference temperature accounting for the uncertainty in the MMT. Given the Monte Carlo samples of Î² _{(i)} and Î¸ _{(i)} obtained through procedure (5) or (6), one can calculate an RR comparing an arbitrary temperature value x and the MMT as
where Î¶ _{(i)} indicates the log of RR calculated using ith sample of Î² _{(i)} and Î¸ _{(i)} and RR _{(i)} is the ith sample of the true RR. Then, a point and an interval estimates for the RR can be derived from the empirical distribution of the RR in the same way as the MMT estimates. Often, scientific interest is on the cold and the heat related RRs which are defined as the RRs comparing the 1st percentile of temperature distribution and the MMT and comparing the 99th percentile and the MMT, respectively. Hereafter, we call these RRs as the cold and heat related RRs.
Simulation study
Simulation study was carried out to compare different methods in estimating the MMT and the cold and heatrelated RRs. We considered six methods. The first one (named as Argmin1) is to use the solution of the \( \underset{x}{\mathrm{argmin}\Big(}{\boldsymbol{Q}}_x\widehat{\boldsymbol{\beta}}\Big) \) without any constraint as a point estimate for the MMT and to use the MMT estimate for calculating the RRs. The second method (named as Argmin2) is the same as the first one except that the solution is constrained within the 1st  99th temperature percentiles. The third method (named as Empirical1) is to use the empirical mean and percentiles (2.5th â€“ 97.5th) as a point and an interval estimates for the MMT without any prior knowledge combined, and to calculate the RR accouting for the MMT uncertainty. The fourth, fifth, and sixth methods (named as Empirical2_{strong}, Empirical2_{moderate}, and Empirical2_{minimal}) are the same as the third one except that the empirical distribution of MMT is derived with prior knowledge. Empirical2_{strong}, Empirical2_{moderate}, and Empirical2_{minimal} incorporate strongly, moderatly, and minimally informative priors, respectively.
To generate the data, four different scenarios were considered for the temperaturemortality association: Ushape (Scenario 1), reverse Jshape (Scenario 2), rotated Sshape (Scenario 3) and sector shape (Scenario 4). Additional file 1: Figure S1 displays the shape of the true RR curve and the true MMT. To obtain the model parameters for each scenario, we used part of the US data analyzed in the application section. For scenarios 1, 2, and 4, we fit eqation (1) for the data of New York with temperature metric as 0â€“2 day moving average, 0â€“1 day moving average, and the current day value, respectively. For scenario 3, the same model was fit with 0â€“3 day moving average for the data of Ockland. For all scenarios, we controlled for the day of week using indicator variables and for the longterm and seasonal pattern using natural cubic spline with 8 degree of freedom for each year. For s(Â·), as we use moving average as temperature metric, we used onedimensional basis (quadratic Bspline with the knots placed at the 10th, 75th, and 90th percentiles). Once the parameters are estimated, the mortality data were generated from the fitted model using the covariates in the data for each scenario. For the distribution for mortality, we considered QuasiPoisson distribution with the overdispersion parameter set to be equal to the model fit.
For each scenario, we generated 1000 replicates of dataset. For each dataset, we fitted eq. (1) with the same specifications used to generate the data and obtained the coefficient estimates. Because we use moving avearage as temperature metric, which is a special case of distributed lag nonlineear model, the coefficients in eq. (1) can be considered as the reduced coefficients (Î²) in eq. (3). Using the coefficient estimates, we estimated the MMT and the cold and heatrelated RRs by the six different methods. For Empirical2, we incorporated prior knowledge with different levels of informativity using Uniform prior with different supports. Empirical2_{strong} uses, as the prior support, the 70th  95th temperature percentiles for scenarios 1 and 3, the 40th  65th for scenario 2, and the 1st  10th for scenario 4. Empirical2_{moderate} uses the 50th 99th percentiles for scenario 1 and 3, the 30th â€“ 80th for scenario 2, and the 1st â€“ 50th for scenario 3. Empirical2_{minimal} uses the 1st 99th percentiles for all scenarios. These prior ranges are indicated in Additional file 1: Figure S1. To compare different methods, we calculated mean bias (Bias) and root mean squared error (RMSE) for the point estimate and coverage probability (%CP) and mean length (Length) of the interval estimate for the MMT and the cold and heatrelated RRs using the 1000 replicates of dataset.
Additionally, we conducted a series of sensitivity analysis to evaluate the robustness of different methods varying the sample size and the specification of the splines and knots in modeling the temperaturemortality association. We considered five methods excluding Empirical2_{moderate} as its performance is between Empirical2_{strong} and Empirical2_{minimal}. First, we varied the sample size, 5 and 10 years of data, and compare with the full period (22 years) of data. Second, we varied the splines, natural cubic Bsplines and quadratic Bsplines, in the true and fitted models. Finally, we varied the locations of the knots, a set of 25th, 50th, and 75th temperature percentiles and another set of 10th, 75th, and 95th percentiles, in the true and fitted models.
Application
We applied three methods (Argmin2, Empirical1, and Empirical2_{minimal}) to estimate the MMT and the cold and heatrelated RRs in the temperaturemortality association for 135 cities in the US for the period of January 1, 1985 to December 31, 2006. Daily mortality counts were obtained from the National Center for Health Statistics and nonexternal cause mortality counts were used (ICD9: 0â€“799; ICD10: A00â€“R99). Daily mean temperatures (24h mean) were obtained from the National Climate Data Center of the National Oceanic and Atmospheric Administration. These data were analyzed in a previous study [1] and the cityspecific descriptive statistics are reported in Additional file 1: Table S1.
For each city, we fit eqs. (1) and (2) with the following modeling choices. For crossbasis, the quadratic Bspline was used with the knots placed at the 10th, 75th, and 90th percentiles of the cityspecific temperature distributions. For the lagged dependency, we used the natural cubic Bspline with an intercept and three internal knots (equally spaced values in the log scale) with 21 lag days. We controlled for the day of week using indicator variables and for the seasonal and longterm trends via a natural cubic Bspline of time with 8 degrees of freedom per year. These choices were based on the results in a previous study [1]. Because the cityspecific modeling accompanies relatively large estimation error, we combined evidence across all cities using multivariate metaregression [14] with cityspecific average temperature and temperate range as metapredictors and obtained the best linear unbiased predictor (BLUP) \( \widehat{\boldsymbol{\beta}} \) and the corresponding standard error for each city. Then, using the BLUPs, we applied the three methods for estimating the MMT and the cold and heat related RRs. For Empirical2_{minimal}, we assumed Uniform prior with the support as the 1st  99th percentiles of cityspecific temperature.
Results
Simulation study
Table 1 reports the results in estimating MMT by six different methods. Because Argmin1 and Argmin2 do not provide an interval estimate, the %CP and Length are not reported. In scenario 1, all six methods show small Bias and small RMSE. In scenarios 2â€“4, Argmin1 and Empirical1 show relatively large Bias and large RMSE (8.896 and 6.979 in scenario 2, 28.011 and 21.092 in scenario 3, and 8.686 and 6.259 in scenario 4) while Argmin2 and Empirical2â€™s show small Bias and small RMSE (mostly less than 3 and 3.592 at maximum). The %CP is near or greater than 95% for Empirical1 and Empirical2â€™s in all scenarios except that the %CP was relatively lower for Empirical2_{moderate} and Empirical2_{minimal} in scenario 4. In scenario 4, the empirical distribution of the MMT was observed as highly rightskewed with the two empirical methods and an adjustment to percentiles in the CI as the 0th â€“ 95th recovered over 95% coverage (98.6% and 98.3%, respectively). The Length was similar between Empirical1 and Empirical2_{minimal} and smaller for Empirical2_{moderate} and Empirical2_{strong} in all scenarios. In summary, our results suggest that Argmin2 (previously proposed in [6]) is a reasonable point estimator, though RMSE may increase depending on the association shape, and Empirical1 (previously proposed in [6]) provides an interval estimate with near 95% coverage while the length can be too large in some scenarios. If prior knowledge is available even at a minimal level, the RMSE can be reduced with Empirical2â€™s compared with Argmin2, and the Length of the interval estimate can be shorter with Empirical2â€™s compared with Empirical1 still achieving the similar level (near 95%) of coverage probability.
Table 2 reports the results in estimating the cold and heatrelated RRs by six different methods. For coldrelated RR, both Bias and RMSE were small and the %CP was near 95% in scenario 1 and 2. In scenario 3, RMSE was relatively large and the %CP was low as 86.8% with Empirical1 but near 95% with other methods. In scenario 4, both Bias and RMSE was small but the %CP was low as 58.4%, 70.3%, 76.5%, and 75.6% with Argmin2, Empirical1, Empirical2_{moderate} and Empirical2_{minimal}. For heatrelated RR, both Bias and RMSE were small and the %CP was near 95% in scenario 1 and 2. In scenario 3, RMSE was relatively large and the %CP was low as 85.8% and 89.2% with Argmin2 and Empirical1 but near 95% with Argmin1 and Empirical2. In scenario 4, RMSE was somewhat large and the %CP was low as 90% with Empirical1 but near 95% with other methods. In summary, in estimating the RR, Argmin1 seems to result in an appropriate coverage but may lead to large RMSE while Argmin2 may result in low coverage but small RMSE in various scenarios. Empirical1 can result in low coverage and large RMSE. However, Empirical2â€™s were shown to perform well generally in various scenarios in both aspects of RMSE and coverage even with minimally informative priors.
Additional file 1 Figures S2S7 report the results of the sensitivity analysis. Additional file 1: Figures S2 & S3 show the RMSE and %CP in estimating the MMT with varying sample size, Additional file 1: Figures S4 & S5 with different specifications of the splines, and Additional file 1: Figures S6 & S7 with different specifications of the knots. Additional file 1: Figure S2 shows that Empirical2_{minimal} and Empirical2_{stong} show lowest RMSE among other methods and stable RMSE over different sample sizes in all scenarios. Additional file 1: Figure S3 shows the %CP was comparable among Empirical1, Empirical2_{minimal}, and Empirical2_{strong} in all scenarios with an exception of scenario 4 where Empirical2_{minimal} shows slightly lower coverage for all sample sizes. As mentioned earlier, the coverage was recovered with an adjustment to the percentiles to obtain the CI. Additional file 1: Figure S4 indicates that the RMSE with Empirical2â€™s was lowest and stable over different choices of splines in the true and fitted models. Additional file 1: Figure S5 shows that the %CP was generally comparable although the coverage reduced significantly with Empirical2â€™s when the splines are missspecified (case 3 in scenario 1 and case 2 in scenario 3). In those cases, the CIs tended to miss the true MMT right above the upper bound due to the very short length. Finally, Additional file 1: Figure S6 displays that the RMSE was lowest and stable over different specifications of the knots in the true and fitted models. However, Additional file 1: Figure S7 shows that the missspecifications of the knots resulted in reducing the coverage in case 2 for scenario 1 and 3. Similarly, the CIs tended to miss the true MMT right above the lower bound
Application
Figure 1 displays the point estimates for the MMT obtained by Argmin2, Empirical1, and Empirical2_{minimal} and the interval estimates by the two empirical methods. The cities are ordered based on the length of the interval estimate calculated by Empirical1. Based on the MMT uncertainty patterns, it seems reasonable that we divide 135 cities into four categories; category 1 (from New York to Knoxville), category 2 (from Oakland to San Jose), category 3 (from Austin to Milwaukee), and category 4 (from Western Palm BeachBoca Raton to Seattle). Such categorization also consists with the shapes of the RR curve for each city presented in Additional file 1: Figure S8 (The cities are in the same order as in Fig. 1). In category 1, cities show clear Ushape RR curve and small uncertainty in the MMT. Accordingly, the point estimates were close among the three methods and the interval estimates were similar between the two empirical methods. In category 2, cities show Ushape RR curve with a short right arm and a short bottom and small uncertainty in the MMT. Argmin2 tends to suggest a smaller value for the MMT estimate, Empirical1 suggests larger point estimate, and Empirical2_{minimal} compromises between the Argmin2 and Empirical1 estimates. In category 3, the RR curve is mostly reverse Jshape or Ushape with a wide bottom and the MMT uncertainty become large. For the reverse Jshape curve, Empirical1 suggests the largest value for the MMT while Argmin2 and Empirical2_{minimal} suggests somewhat smaller values. In category 4, cities show rotated Sshape or widely opened Ushape with large uncertainty on both arms. Empirical1 suggests extremely large uncertainty in the MMT covering almost the whole range of the temperature distribution. However, the uncertainty reduced largely by Empirical2_{minimal} with the prior restriction, within the 1st â€“ 99th percentiles.
Figure 2 shows the point and interval estimates for the coldrelated and heatrelated RRs calculated by the three methods. Differently from the MMT estimates, the point estimates were mostly consistent among the three methods. However, the intervals estimates calculated by Argmin2 and Empirical1 tend to stretch to the right while those computed by Empirical2_{minimal} tend to do to the left. Such result is consistent with simulation results in that the coverage for the RR were low with Argmin2 and Empirical1 as those intervals tend to exclude the true RR because of the rightskewness of the empirical distribution.
Discussion
In this research, we assessed the statistical properties of the previously proposed statistical approach [6] to estimate the MMT in various types of association via a simulation study and an application. The method of using the solution of argmin function with some ad hoc restriction (i.e., within the 1st â€“ 99th percentiles of the observed temperature distribution) (Argmin2) turns out to be a reasonable point estimator for the MMT, though Bias or RMSE may be large in some scenarios. Also, the approximate bootstrap method to calculate the confidence interval (Empirical1) performs properly achieving near 95% coverage, though the length can be extremely large depending on the scenarios.
To improve upon the previous method, we suggested an alternative approach (Empirical2), which can be applied if some prior knowledge is available on the MMT. We suggested to combine a prior knowledge with the procedure of deriving the empirical distribution of the MMT and to use the empirical mean and percentiles (e.g., 2.5th  97.5th in general and 0th â€“ 95th for highly rightskewed case) as a point and an interval estimates for the MMT. Simulation study showed that our proposed method performs better even with a minimal level of prior knowledge reducing the Bias and RMSE in point estimation and achieving near 95% coverage while shortening the length in interval estimation.
We also examined how the uncertainty in the MMT would affect the RR estimation using the MMT as a reference temperature. We derived the empirical distribution of the MMTreferenced RR through a sampling procedure similar to deriving the MMT distribution. Then, the empirical mean and percentiles were used as alternative point and interval estimates for the RR with the uncertainty in the MMT accounted for. Compared with the current approach (using only a single point estimate for the MMT as a reference value in quantifying the RR and calculating the confidence interval based on the normal approximation), the empirical RR estimates, when prior knowledge is combined, were less biased with reduced RMSE and achieves appropriate level of coverage probability in most of the scenarios.
Our proposed approach conceptually relies on a Bayesian inferential framework but is not a fully Bayesian hierarchical model, which one may consider as a more natural way to incorporate a prior knowledge in the inference for MMT. However, our approach has several advantages compared with constructing a fully Bayesian model. When modeling a nonlinear association between temperature and mortality using splines, MMT is not a specific parameter but a complex function of parameters (i.e., \( \underset{x}{\mathrm{argmin}\Big(}{\boldsymbol{Q}}_x\boldsymbol{\beta} \Big) \) where Î² is reduced coefficients from Î·, which is the original coefficients for the crossbasis, and Q _{ x } is the vector of basis variables as in formula (3)) and, thus a prior cannot be directly assigned on the MMT in a fully Bayesian model. Although indirect specification through a prior on Î· may be possible, a common choice of prior on Î· (e.g., multivariate normal) does not yield a known or closed form of prior on the MMT, which makes it difficult to incorporate prior knowledge on the MMT straightforwardly. In addition, in a timeseries analysis for the temperaturemortality association, there are many other terms in the model to adjust for longterm trend, seasonality, and potential confounders. A fully Bayesian model would require a prior assumption and a posterior sampling on the whole parameter space including these parameters, which is often highdimensional. In contrast, our approach focuses only on the crossbasis terms for temperature, which does not only facilitate the procedure of inserting a prior knowledge on the MMT but also avoids such an unnecessary highdimensional posterior sampling for the nuisance parameters.
While our proposed approach has several benefits, some limitations should be acknowledged. First, if prior distribution is incorrectly specified, the whole inference can seriously be biased. To avoid such prior misspecification, one may use a prior information minimally by setting a potential range as the 1st â€“ 99th percentiles. Our simulation showed that even with the minimally informative prior, the proposed method tends to perform better than the previous approach. When analyzing the US data, adding such minimal prior information reduced the uncertainty in the MMT by large amount particularly when the estimated association curves are unstable in terms of suggesting an MMT (e.g., category 4). Second, our approach limits the prior choice to be a Uniform distribution with different supports reflecting different prior knowledge. Such prior cannot accommodate a case where a prior range is the whole temperature range with different prior probabilities across the range, which would be more plausible in practice. Although a Uniform with a truncated support would be a reasonable approximation, further research is merited to improve the method to encompass a broader range of prior knowledge.
Applying the methods to the US data, we found four categories in terms of the MMT uncertainty. For categories 1 and 2, the estimated temperaturemortality association is mostly Ushape with short or long arms on either side and with a short bottom. In these categories, the MMT uncertainty is small and estimated between the 75th through the 95th percentiles of the observed temperature distribution. For category 3, the association is mostly reverse Jshape with relatively long bottom on the right side and the MMT uncertainty was relatively large. The large uncertainty is induced by the long bottom of the association curve and it may be more appropriate to describe the MMT as a range, not a single point. The previous study [6] also suggested to introduce this new concept of the minimum mortality temperature range. In category 4, the association is the rotated Sshape with the left arm curving down at the lowest temperature, for which the uncertainty is very large. Such uncertainty on the left arm is induced by the sparse data and causes the MMT uncertainty to unnecessarily cover the whole range of the temperature. In this case, it is suggested that adding restrictions on the range would lead to more reasonable inference on the MMT.
Finally, an important note should be made about the MMT estimation in a DLNM modeling framework. When the temperaturemortality association is incorrectly modeled (e.g., misspecifications of the splines and knots), either the previous or alternative approaches may provide interval estimates with significantly low coverage as they can miss the true MMT just below/above the lower/upper bounds. Since the interval estimation for the MMT can be sensitive to the modeling choices in DLNM depending on the temperaturemortality association pattern, one should carefully conduct a model selection procedure to identify the correct association and, thus the true MMT. Additionally, missspecifying the outcome distribution such as a simple adjustment for overdispersion with QuasiPoisson family may also affect the inference, and more flexible statistical methods may be considered to account for it in further research [15].
Conclusions
In summary, Monte Carlo simulationbased approach to estimate the MMT either as a point or as an interval was shown to be a reasonable approach, particularly when some prior restrictions are added to reduce the uncertainty. The MMT uncertainty can affect the estimation for the MMTreferenced RR and ignoring the MMT uncertainty in the RR estimation may lead to invalid results with respect to bias in point estimation and the nominal coverage in interval estimation.
Abbreviations
 %CP:

Coverage probability
 DLNM:

Distributed lag nonlinear model
 MMT:

Minimum mortality temperature
 RMSE:

Root mean squared error
 RR:

Relative risk
References
Gasparrini A, Guo Y, Hashizume M, Lavigne E, Zanobetti A, Schwartz J, Tobias A, Tong S, RocklÃ¶v J, Forsberg B. Mortality risk attributable to high and low ambient temperature: a multicountry observational study. Lancet. 2015;386(9991):369â€“75.
Armstrong B. Models for the relationship between ambient temperature and daily mortality. Epidemiology. 2006;17(6):624â€“31.
Chung Y, Lim YH, Honda Y, Guo YL, Hashizume M, Bell ML, Chen BY, Kim H. Mortality related to extreme temperature for 15 cities in northeast Asia. Epidemiology. 2015;26(2):255â€“62.
Curriero FC, Heiner KS, Samet JM, Zeger SL, Strug L, Patz JA. Temperature and mortality in 11 cities of the eastern United States. Am J Epidemiol. 2002;155(1):80â€“7.
Guo Y, Gasparrini A, Armstrong B, Li S, Tawatsupa B, Tobias A, Lavigne E, de Sousa Zanotti Stagliorio Coelho M, Leone M, Pan X, et al. Global variation in the effects of ambient temperature on mortality: a systematic evaluation. Epidemiology. 2014;25(6):781â€“9.
TobÃas A, Armstrong B, Gasparrini A. Investigating uncertainty in the minimum mortality temperature: methods and application to 52 Spanish cities. Epidemiology. 2016;
Bhaskaran K, Gasparrini A, Hajat S, Smeeth L, Armstrong B. Time series regression studies in environmental epidemiology. Int J Epidemiol. 2013;42:1187â€“95.
Gasparrini A, Armstrong B, Kenward MG. Distributed lag nonlinear models. Stat Med. 2010;29(21):2224â€“34.
Gasparrini A, Armstrong B. Reducing and metaanalysing estimates from distributed lag nonlinear models. BMC Med Res Methodol. 2013;13(1):1.
Gasparrini A. Modelling lagged associations in environmental time series data. A Simulation Study Epidemiology. 2016;27(6):835â€“42.
Casella G, Berger RL: Statistical inference, vol. 2: Duxbury Pacific Grove, CA; 2002.
Hoff PD. A first course in Bayesian statistical methods. New York: SpringerVerlag; 2009. doi:10.1007/9780387924076
Gelman A, Hill J. Data analysis using regression and multilevel/hierarchical models. Cambridge University press; 2006. doi:10.1017/CBO9780511790942
Gasparrini A, Armstrong B. Multivariate metaanalysis: a method to summarize nonlinear associations. Stat Med. 2011;30(20):2504â€“6.
Chen Y, Hanson T. Flexible parametrization of variance functions for quantal response data derived from counts. J Biopharm Stat. 2017:1â€“11. doi:10.1080/10543406.2017.1293084
Acknowledgements
Sources of funding for each author.
Yeonseung Chung was supported by the Senior Research grant (2016R1A2B1007082) from the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT (Information and Communication Technologies). Whanhee Lee and Ho kim were supported by the Global Research Lab (K2100400000110A050000710) from the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT (Information and Communication Technologies) and from the Climate Change Correspondence Program (2014001310007) funded by the Korea Ministry of Environment.
Funding
This research is supported by the Senior Research grant (2016R1A2B1007082) and the Global Research Lab (K2100400000110A050000710) from the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT (Information and Communication Technologies) and from Climate Change Correspondence Program (2014001310007) by the Korea Ministry of Environment.
Availability of data and materials
The datasets generated and the codes in the simulation study are available from the first author, Whanhee Lee upon request. The datasets analyzed in the application are available from the fourth and fifth authors, Antonella Zanobetti and Joel Schwartz, upon request.
Author information
Authors and Affiliations
Contributions
WL and YC designed the study. WL implemented the simulation study and the analysisÂ of the US data. SH helped computation. HK helped designing the study and writing the manuscript. AZ and JS provided the US data. WL and YC wrote the manuscript. All authors read and approved the final version of the manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interest
The authors declare that they have no competing interests.
Publisherâ€™s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file
Additional file 1:
Supplemental Materials. (PDF 3700 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Lee, W., Kim, H., Hwang, S. et al. Monte Carlo simulationbased estimation for the minimum mortality temperature in temperaturemortality association study. BMC Med Res Methodol 17, 137 (2017). https://doi.org/10.1186/s1287401704127
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1287401704127