 Research article
 Open Access
Measurement error in time-series analysis: a simulation study comparing modelled and monitored data
BMC Medical Research Methodology volume 13, Article number: 136 (2013)
Abstract
Background
Assessing health effects from background exposure to air pollution is often hampered by the sparseness of pollution monitoring networks. However, regional atmospheric chemistry-transport models (CTMs) can provide pollution data with national coverage at fine geographical and temporal resolution. We used statistical simulation to compare the impact on epidemiological time-series analysis of additive measurement error in sparse monitor data as opposed to geographically and temporally complete model data.
Methods
Statistical simulations were based on a theoretical area of 4 regions each consisting of twenty-five 5 km × 5 km grid-squares. In the context of a 3-year Poisson regression time-series analysis of the association between mortality and a single pollutant, we compared the error impact of using daily grid-specific model data as opposed to daily regional average monitor data. We investigated how this comparison was affected if we changed the number of grids per region containing a monitor. To inform simulations, estimates (e.g. of pollutant means) were obtained from observed monitor data for 2003–2006 for national network sites across the UK and corresponding model data that were generated by the EMEP-WRF CTM. Average within-site correlations between observed monitor and model data were 0.73 and 0.76 for rural and urban daily maximum 8-hour ozone respectively, and 0.67 and 0.61 for rural and urban log_{e}(daily 1-hour maximum NO_{2}).
Results
When regional averages were based on 5 or 10 monitors per region, health effect estimates exhibited little bias. However, with only 1 monitor per region, the regression coefficient in our time-series analysis was attenuated by an estimated 6% for urban background ozone, 13% for rural ozone, 29% for urban background log_{e}(NO_{2}) and 38% for rural log_{e}(NO_{2}). For grid-specific model data the corresponding figures were 19%, 22%, 54% and 44% respectively, i.e. similar for rural log_{e}(NO_{2}) but more marked for urban log_{e}(NO_{2}).
Conclusion
Even if correlations between model and monitor data appear reasonably strong, additive classical measurement error in model data may lead to appreciable bias in health effect estimates. As process-based air pollution models become more widely used in epidemiological time-series analysis, assessments of error impact that include statistical simulation may be useful.
Background
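Bias in estimation due to measurement error has received much attention in medical research, including epidemiology. In its simplest form, i.e. pure additive classical measurement error, the relationship between the observed variable or surrogate measure Z and the “true” variable X ^{*} can be expressed as:
$$Z={X}^{*}+\epsilon ,\phantom{\rule{1em}{0ex}}\mathit{cov}\left({X}^{*},\epsilon \right)=0$$
where the random error ϵ has mean zero and is independent of X ^{*}.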
It is well documented that replacing X ^{*} by Z as the explanatory variable in a simple linear regression analysis leads to attenuation in the estimation of both the Pearson correlation coefficient and the gradient of the regression line with the extent of the attenuation depending on the reliability ratio ρ _{ ZX*} where ρ _{ ZX *} = var(X ^{∗})/var(Z) [1]. Similarly in simple Poisson regression pure additive classical error in the explanatory variable leads to attenuation in the estimation of the relative risk [2].
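The attenuation described above can be checked numerically. The following is a minimal sketch in Python (standard library only), not the authors' code: the function names, the sample size and the variances are illustrative assumptions. It regresses an outcome on a surrogate Z = X* + error and compares the fitted slope with the “true” slope multiplied by the reliability ratio var(X*)/var(Z).

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_attenuation(n=20000, beta=2.0, sd_true=1.0, sd_err=1.0, seed=1):
    """Return (slope using X*, slope using Z, reliability ratio).

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, sd_true) for _ in range(n)]
    # Surrogate measure with pure additive classical error
    z = [x + rng.gauss(0.0, sd_err) for x in x_true]
    y = [beta * x + rng.gauss(0.0, 1.0) for x in x_true]
    reliability = sd_true ** 2 / (sd_true ** 2 + sd_err ** 2)
    return ols_slope(x_true, y), ols_slope(z, y), reliability

slope_true, slope_surrogate, reliability = simulate_attenuation()
print(slope_true, slope_surrogate, 2.0 * reliability)
```

With equal “true” and error variances the reliability ratio is 0.5, so the slope estimated from the surrogate should sit close to half the “true” slope.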
However, not all measurement error is classical. Reeves et al. [3] considered the impact of measurement error in a situation where individual radon exposure was measured with additive classical error but where subjects with missing radon data were assigned an area average. If the variability of “true” individual radon exposure is the same within each area and the area averages are exact (i.e. measured without error) their use as surrogate measures introduces pure additive Berkson error. This type of measurement error has no biasing effect on the regression coefficient in simple linear regression [4] and little if any such effect on the regression coefficient in simple Poisson regression [2, 5]. However if the averages are not exact they introduce a combination of Berkson error and classical error and the presence of additive classical error biases the gradient estimate or relative risk estimate towards the null.
The consequences of using an area average as a surrogate explanatory variable have been investigated in simulations by Lee et al. [6]. Based on a time-series analysis of daily mortality counts and average daily air pollution (average of readings from available monitors in the study region), they found that increasing the probability of siting monitors in high pollution areas led to attenuation in health effect estimates and poor coverage intervals. They also found that within a separate scenario of high classical instrument error (assumed to be additive on a log scale) and low spatial variation, reducing the total number of monitors in the study region from 30 to 5 enhanced any attenuation in the health effect estimate.
As indicated above, in some circumstances measurement error is proportional (i.e. additive on a log scale) and the relationship of interest is with the untransformed explanatory variable. In the context of using overdispersed Poisson regression to investigate the effects of air pollution on daily emergency department visits, a recent simulation study by Goldman et al. [5] concluded that while pure proportional classical error in the daily air pollution data led to an attenuated estimate of relative risk, pure proportional Berkson error in the pollution data actually led to an inflated estimate of relative risk, i.e. bias away from the null. This is in line with findings for logistic regression from the simulation study of Steenland et al. [7] which suggested that if the Berkson error variance increases as values of the surrogate measure increase, bias in the regression coefficient away from the null may result.
For statistical models containing more than one explanatory variable, the effect of measurement error depends not only on the error type (i.e. Berkson, classical, proportional, additive) but also on the correlation between the explanatory variables, which explanatory variables are causal and which are measured with error. In Poisson regression with two explanatory variables, one causal and measured with pure additive classical error and the other non-causal and measured without error, Fung et al. [8] demonstrated through simulation that the estimated relative risk of the causal variable will be attenuated and that if the correlation between the two explanatory variables is high (i.e. multicollinearity) the predictive effect of the causal variable may be transferred to the non-causal variable.
In air pollution epidemiology short-term associations between outdoor air pollution and health are assessed using an ecological time-series design. Many such studies have been published [9] and inform public health policy [10]. These studies correlate daily counts of health events in a specific location (usually a city) with daily pollution concentrations derived from static monitoring sites. Regional air pollution chemistry-transport models (CTMs) that are capable of simulating hourly and daily concentrations of a wide range of pollutants at fine-scale resolutions (i.e. ≤ 10 km) have recently been developed. These provide new opportunities to investigate pollution metrics (e.g. individual particulate matter components or source-resolved pollutant metrics) which either cannot be currently measured or can be measured only at a limited number of locations due to their measurement complexity and/or a sparse monitoring network. In this paper we compare, using statistical simulation methods, the performance of geographically resolved model data at 5 km × 5 km resolution with area-wide average concentrations derived from a number of air pollution monitors.
We compare performance in terms of additive and proportional measurement error and its effect on a time-series analysis of the relationship between daily ambient background pollution levels and daily counts of a health event (using all-cause mortality as an example) at the small area level, i.e. the 5 km × 5 km grid. The analysis is conducted using Poisson regression. Measures from background air pollution monitors may be available for some 5 km × 5 km grids but not for others. Our primary aim is therefore to demonstrate how simulation techniques can be employed to investigate when and if it might be better to use data from a CTM rather than, as often happens in practice, aggregating over grids and using average pollution values based on monitor data. Our simulations are based on a theoretical study area divided into 4 regions each consisting of twenty-five 5 km × 5 km grids and within this construct we consider the effect of varying the number of grids per region containing a monitor. The parameter estimates used in our simulations are taken from observed monitor versus model comparisons.
For the purposes of this investigation we assume that it is the association of ambient pollution with mortality at the small-area level that is important (because of the link to regulation [11]) rather than exposure at the level of the individual and leave consideration of disparities between background monitoring networks and personal exposure to others [4, 11, 12]. There is also a literature on impacts of measurement error in air pollution for study designs other than time-series [13, 14].
Methods
Simulating a “true” time-series
Simulations were performed using DRAWNORM in STATA 10 [15] and relate to a theoretical square study area measuring 50 km North by 50 km East which can be divided into 4 regions each consisting of twenty-five 5 km × 5 km grid-squares. We assume that:

For a given pollutant its “true” background concentration (i.e. devoid of bias or measurement error) in grid-square i on day t of a 3-year time-series is:
$${x}_{i,t}^{*}\left(i=1,\dots ,100;t=1,\dots ,1095\right)$$ 
For each grid i, the “true” 3-year time-series, represented by the vector ${X}_{i}^{*}$, exhibits no trend or seasonal variation.

$${X}_{i}^{*}=\left[\begin{array}{c}{x}_{i,1}^{*}\\ \vdots \\ {x}_{i,1095}^{*}\end{array}\right]$$

Grid-specific means μ _{ i } (i = 1, …, 100) are Normally distributed around an overall mean μ with variance ${\sigma}_{b}^{2}$.
$${\mu}_{i}=\mu +{e}_{i},\phantom{\rule{0.12em}{0ex}}{e}_{i}\sim N\left(0,{\sigma}_{b}^{2}\right)$$ 
Each row of the 1095 × 100 time-series matrix, ${\mathit{X}}^{*}=\left[{X}_{1}^{*},\dots ,{X}_{100}^{*}\right]$, consists of a sample drawn from a Multivariate Normal distribution, MVN(U, Ω), with grid-specific means μ _{ i } (i = 1, …, 100), common within-grid variance ${\sigma}_{w}^{2}$, between-grid covariances σ _{ i,k } (i = 1, …, 100; k = 1, …, 100) and between-grid correlation coefficients ρ _{ i,k } (i = 1, …, 100; k = 1, …, 100), such that:
$$\mathrm{U}=\left[\begin{array}{c}{\mu}_{1}\\ \vdots \\ {\mu}_{100}\end{array}\right],\phantom{\rule{1em}{0ex}}\mathbf{\Omega}=\left[\begin{array}{ccc}{\sigma}_{w}^{2}& \cdots & {\sigma}_{1,100}\\ \vdots & \ddots & \vdots \\ {\sigma}_{100,1}& \cdots & {\sigma}_{w}^{2}\end{array}\right]$$
For each grid i the number of deaths on day t, y _{ i,t }, is sampled from a Poisson distribution with mean dependent on the “true” background concentration of the pollutant in that grid on that day, according to the following formula:
$$E\left({Y}_{i,t}\right)={\phi}_{i,t}=\alpha \times \exp \left(\beta \times {x}_{i,t}^{*}\right)$$
$${Y}_{i,t}\sim \mathit{Poisson}\left({\phi}_{i,t}\right)$$
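The data-generating steps listed above can be sketched for a single grid as follows. This is a simplified illustration in Python (standard library only) and not the STATA code used for the study: it ignores the between-grid correlation that the full simulation introduces through Ω, and the function names and the Poisson sampler are assumptions made for the example. The numerical inputs are the urban background ozone values given later in the Methods.

```python
import math
import random

def poisson_draw(rng, mean):
    """Draw one Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_grid(mu_i, sd_within, alpha, beta, n_days=1095, seed=0):
    """Simulate a 'true' pollutant series and daily deaths for ONE grid.

    Simplifying assumption: days are independent Normal draws and the
    correlation between grids is ignored (the paper draws all 100 grids
    jointly from a Multivariate Normal distribution).
    """
    rng = random.Random(seed)
    x_true = [rng.gauss(mu_i, sd_within) for _ in range(n_days)]
    # Poisson mean depends on the 'true' concentration on each day
    deaths = [poisson_draw(rng, alpha * math.exp(beta * x)) for x in x_true]
    return x_true, deaths

x_true, deaths = simulate_grid(mu_i=61.73, sd_within=24.35,
                               alpha=0.32, beta=0.0003992)
print(sum(deaths) / len(deaths))  # mean daily deaths in this grid
```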
We consider two pollutant metrics, daily maximum 8-hour ozone and log_{e}(daily 1-hour maximum NO_{2}). NO_{2} concentrations are log transformed to take account of a positively skewed distribution.
For ozone we set α = 0.32 (i.e. mean daily deaths with 0 μg/m^{3} ozone = 0.32) and β = 0.0003992 (i.e. e ^{β × 10} = 1.0040). While the values assigned to α and β are somewhat arbitrary, a 0.4% increase in mortality per 10 μg/m^{3} increase in ozone (i.e. β = 0.0003992) is the size of effect that might be observed in a real epidemiological study [9]. For log_{e}(NO_{2}), to aid the comparison of findings in tables, we set α = 0.32 (i.e. mean daily deaths with 1 μg/m^{3} NO_{2} = 0.32) and β = 0.0418845 (i.e. 1.10^{β} = 1.0040, indicating a 0.4% increase in mortality per 10% increase in NO_{2}).
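The two values of β follow directly from the stated effect sizes. As a short check in Python (the variable names are illustrative), solving e^{β × 10} = 1.0040 and 1.10^{β} = 1.0040 for β gives:

```python
import math

# 0.4% increase in mortality per 10 ug/m3 increase in ozone:
# exp(beta * 10) = 1.0040
beta_ozone = math.log(1.0040) / 10

# 0.4% increase in mortality per 10% increase in NO2, with log_e(NO2)
# as the explanatory variable: 1.10 ** beta = 1.0040
beta_log_no2 = math.log(1.0040) / math.log(1.10)

print(beta_ozone, beta_log_no2)
```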
Simulating observed monitor data
Pollution concentrations obtained from monitors will include measurement error due to instrument imprecision and monitor location. Given the small size of grids (i.e. 5 km × 5 km) and that instrument error for an unbiased monitor is generally considered to be classical [16], for each grid i we simulate a 3-year time-series of monitor data, X _{ i }, by adding classical measurement error to our “true” time-series ${X}_{i}^{*}$ as follows:
$${X}_{i}={X}_{i}^{*}+{\mathrm{E}}_{i}$$
where for each element ϵ _{ i,t } of the error vector E _{ i }
$${\epsilon}_{i,t}\sim N\left(0,{\sigma}_{\mathit{err}}^{2}\right)$$
such that,
$$\mathit{var}\left({X}_{i}\right)=\mathit{var}\left({X}_{i}^{*}\right)+{\sigma}_{\mathit{err}}^{2}$$
Simulating model data
For each grid i we simulate a 3-year time-series of model data, Z _{ i }, from ${X}_{i}^{*}$. However, in contrast to the above we allow for a grid-specific bias (i.e. $E\left({X}_{i}^{*}\right)={\mu}_{i},\phantom{\rule{0.5em}{0ex}}E\left({Z}_{i}\right)={\mu}_{i}+{c}_{i}$, where μ _{ i } and c _{ i } are grid-specific constants) and for the presence of Berkson-like error as well as classical-like error (i.e. we allow for the possibility that $\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)\ne \mathit{var}\left({X}_{i}^{*}\right)$). We do this by using the approach of Reeves et al. [3]. This approach exploits the fact that if we express Z _{ i } as a linear function of ${X}_{i}^{*}$ then using standard theory as outlined in Cox and Hinkley [17]:
$${Z}_{i}={\mu}_{i}+{c}_{i}+\frac{\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)}{\mathit{var}\left({X}_{i}^{*}\right)}\left({X}_{i}^{*}-{\mu}_{i}\right)+{\Delta}_{i}\phantom{\rule{2em}{0ex}}\left(1.2\right)$$
where,
$E\left({\Delta}_{i}\right)=0$, $\mathit{var}\left({\Delta}_{i}\right)={\sigma}_{i,z.{x}^{*}}^{2}$, $\mathit{cov}\left({\Delta}_{i},{X}_{i}^{*}\right)=0$ and ${\sigma}_{i,z.{x}^{*}}^{2}=\mathit{var}\left({Z}_{i}\right)-{\left\{\frac{\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)}{\mathit{var}\left({X}_{i}^{*}\right)}\right\}}^{2}\mathit{var}\left({X}_{i}^{*}\right)$.
If there is no Berkson-like error (i.e. $\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)=\mathit{var}\left({X}_{i}^{*}\right)$) then with the exception of the grid-specific bias term (c _{ i }) formula 1.2 reduces to a classical error model.
In populating 1.2, we assume that model data are uncorrelated with instrument and location error (i.e. cov(Ε_{ i }, Z _{ i }) = 0). From this it follows that $\mathit{cov}\left({X}_{i},{Z}_{i}\right)=\mathit{cov}\left({X}_{i}^{*}+{\mathrm{E}}_{i},{Z}_{i}\right)=\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)+\mathit{cov}\left({\mathrm{E}}_{i},{Z}_{i}\right)=\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)$. In addition, provided our focus is on the effects of additive measurement error (not the case for proportional measurement error) and our time-series analysis adjusts for grid, we can simplify calculations by setting the grid-specific constant terms c _{ i } = c for all i = 1, …, 100.
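To illustrate how monitor and model series can be generated from a “true” series, the sketch below (Python, standard library only) adds classical error to obtain monitor data and builds model data as a linear function of the “true” series plus independent residual noise, with slope cov(X*, Z)/var(X*) and residual variance var(Z) minus slope² × var(X*). It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, Normal residuals are assumed, and a series much longer than 1095 days is used purely so that the sample moments are stable. The parameter values are the urban background ozone figures reported later in the Methods.

```python
import math
import random

def simulate_monitor(x_true, sd_err, rng):
    """Monitor series: 'true' series plus classical error."""
    return [x + rng.gauss(0.0, sd_err) for x in x_true]

def simulate_model(x_true, mu_i, c, cov_xz, var_x, var_z, rng):
    """Model series: linear function of the 'true' series plus residual noise."""
    slope = cov_xz / var_x
    resid_sd = math.sqrt(var_z - slope ** 2 * var_x)  # Normal residual assumed
    return [mu_i + c + slope * (x - mu_i) + rng.gauss(0.0, resid_sd)
            for x in x_true]

def sample_cov(a, b):
    """Sample covariance (the variance when a and b are the same series)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)

rng = random.Random(42)
mu_i, var_x = 61.73, 24.35 ** 2
x_true = [rng.gauss(mu_i, math.sqrt(var_x)) for _ in range(20000)]
x_monitor = simulate_monitor(x_true, 6.77, rng)
z_model = simulate_model(x_true, mu_i, 10.25, 455.78, var_x, 23.41 ** 2, rng)

print(sample_cov(x_true, z_model), sample_cov(z_model, z_model))
```

The sample covariance between the “true” and model series and the sample variance of the model series should land close to the inputs 455.78 and 23.41² respectively.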
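For the purposes of our simulations involving proportional error we ignore any dependence between $E\left({Z}_{i}\right)-E\left({X}_{i}^{*}\right)$ and $E\left({X}_{i}^{*}\right)$ and assume that:
$${c}_{i}\sim N\left(c,{\sigma}_{\mathit{diff}}^{2}\right)\phantom{\rule{2em}{0ex}}\left(1.2a\right)$$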
Simulating regional averages
We simulate the use of regional averages in situations where pollution monitor coverage is less than 1 monitor per 5 km × 5 km grid by first sampling a subset of l grid-squares (R _{ jl } (j = 1, …, 4)) from each of the 4 regional sets of 25 grid-squares (R _{ j } (j = 1, …, 4)) such that R _{ jl } ⊂ R _{ j }. Next we replace each 3-year time-series, X _{ i } (i ∊ R _{ j }), with a 3-year time-series of averages W _{ j } based on the formula:
$${W}_{j}=\frac{1}{l}\sum _{i\in {R}_{jl}}{X}_{i}$$
Simulated regional average time-series are produced in this way for l = 5, l = 10, l = 15, l = 20, l = 25.
We also consider the single monitor scenario i.e. l = 1.
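The regional averaging step can be sketched as follows in Python (standard library only). The row-major numbering of the 100 grids on a 10 × 10 lattice, the function names and the toy input are assumptions made for illustration; the averaging itself takes, for each region, the daily mean over the l randomly selected monitored grids.

```python
import random

def region_of(grid_index, side=10, block=5):
    """Map grid index 0..99 (row-major on a 10 x 10 lattice) to region 0..3."""
    row, col = divmod(grid_index, side)
    return (row // block) * (side // block) + (col // block)

def regional_averages(series_by_grid, n_monitors, rng):
    """Return {region: daily average over n_monitors randomly chosen grids}."""
    grids_by_region = {}
    for g in series_by_grid:
        grids_by_region.setdefault(region_of(g), []).append(g)
    averages = {}
    for region, grids in grids_by_region.items():
        monitored = rng.sample(grids, n_monitors)  # grids that hold a monitor
        n_days = len(series_by_grid[monitored[0]])
        averages[region] = [
            sum(series_by_grid[g][t] for g in monitored) / n_monitors
            for t in range(n_days)
        ]
    return averages

# Toy input: each grid's 3-day "series" is simply its own index
toy = {g: [float(g)] * 3 for g in range(100)}
all_monitors = regional_averages(toy, 25, random.Random(0))
one_monitor = regional_averages(toy, 1, random.Random(0))
print(all_monitors[0], one_monitor[0])
```

In the analysis every grid in a region would then be assigned that region's average series in place of its own monitor series.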
Comparison of observed monitor and CTM data
Realistic estimates for the above as yet unset parameters (e.g. ${\sigma}_{b}^{2},\mathit{var}\left({Z}_{i}\right)$) were obtained by reference to observed monitor and chemistry-transport model (CTM) data. The monitor data came from the UK’s Automatic Urban and Rural Network (AURN) and were obtained via the UK national air information resource [18].
The modelled data used were daily outputs from the EMEP-WRF v3.7 grid-based (Eulerian) 3D CTM which provides a detailed simulation of the evolving physical and chemical state of the atmosphere over the UK. The underlying CTM is the EMEP Unified Model [19] which has been modified to enable application at 5 km horizontal spatial resolution over the British Isles [20]. A nested approach is used whereby EMEP simulations of atmospheric composition across a coarser European domain are used to drive fine-scale EMEP-WRF simulations of air quality at 5 km horizontal resolution across the UK. The EMEP and EMEP-WRF models have been extensively validated and used for numerous policy applications [21, 22].
Daily concentrations of monitored ozone (μg/m^{3}) and their corresponding EMEP-WRF CTM estimates, covering a total of at least 364 days over the period 2003–2006, were obtained for 35 urban background and 21 rural monitoring sites across England, Wales, Scotland and Northern Ireland. Similarly paired daily concentrations of NO_{2} (μg/m^{3}), again covering at least 364 days over the period 2003–2006, were obtained for 43 urban background and 14 rural monitoring sites across England, Wales, Scotland and Northern Ireland. Ozone concentrations were daily maximum running 8-hour mean and NO_{2} concentrations were log_{e}-transformed (daily 1-hour maximum). Summary statistics comparing monitor and CTM data for rural and urban sites are presented in Table 1.
The distance between each pair of monitoring sites of the same type was calculated. Then having first standardised monitored pollution concentrations within site by subtracting the site mean and dividing by the site standard deviation, Pearson correlations across time between site pairs were calculated for rural ozone, urban background ozone, rural log_{e}(NO_{2}) and urban background log_{e}(NO_{2}) and plotted against distance (Figure 1). Correlations based on <364 paired observations were set to missing. The relationships between Pearson correlation and distance were then investigated using simple linear regression.
Parameter estimates
To simulate “true” urban background ozone concentrations for our theoretical study area we set μ = 61.73 and ${\sigma}_{b}^{2}={7.38}^{2}$ (Table 1), and constructed a correlation matrix ρ(100,100) using the regression equation based on Pearson correlation as a function of distance between monitors (Figure 1(a)):
Each off-diagonal element of ρ was calculated by setting D equal to an estimate of the average distance in km between any two points, one in each of the two 5 km × 5 km grid-squares being compared (using simulation: $D\approx d+2.13\times \left(\frac{1}{d}\right)$ where d is the straight line distance between the centre points of the two grid-squares). The diagonal elements were calculated by setting D equal to an estimate of the average distance between any two points within a 5 km × 5 km grid-square (using simulation: D ≈ 2.6). The variance/covariance matrix Ω(100, 100) was then obtained by multiplying each element of ρ by the average observed within-site variance (i.e. 25.28^{2} in Table 1). This produced a symmetrical matrix with diagonal elements equal to 24.35^{2}, the estimated average “true” within-site variance having removed any variation due to instrument error and monitor-site location error (i.e. ${\sigma}_{\mathit{err}}^{2}$).
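The two distance approximations quoted above can be checked with a small Monte Carlo experiment. The sketch below (Python, standard library only) is an assumption about how such a simulation might be set up, not the procedure actually used: it draws pairs of points uniformly at random, one in each of two 5 km × 5 km squares, and averages the straight line distance between them. The function name, number of pairs and seed are illustrative.

```python
import math
import random

def mean_distance(dx_km, dy_km, side=5.0, n_pairs=100000, seed=0):
    """Monte Carlo mean distance between a random point in one square and a
    random point in a second square whose centre is offset by (dx_km, dy_km)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x1, y1 = rng.uniform(0, side), rng.uniform(0, side)
        x2, y2 = dx_km + rng.uniform(0, side), dy_km + rng.uniform(0, side)
        total += math.hypot(x2 - x1, y2 - y1)
    return total / n_pairs

within = mean_distance(0.0, 0.0)     # two points in the same grid-square
adjacent = mean_distance(5.0, 0.0)   # neighbouring grid-squares, d = 5 km
print(within, adjacent, 5.0 + 2.13 / 5.0)
```

The first value should be close to 2.6 km and the second close to the approximation d + 2.13/d evaluated at d = 5 km.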
For simulating observed monitor data we set σ _{ err } = 6.77 (see Additional file 1) and for simulating model data within each grid i we set: $\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)=455.78,\mathit{var}\left({X}_{i}^{*}\right)={24.35}^{2},$ var(Z _{ i }) = 23.41^{2}, and c = 10.25 (see Table 1). Parameter estimates for rural ozone, urban background log_{e}(NO_{2}) and rural log_{e}(NO_{2}) were obtained in the same fashion.
Proportional measurement error
For NO_{2} we have assumed that measurement error is additive on a log scale and that the relationship of interest is with log_{e}(NO_{2}). If, however, the relationship of interest is with NO_{2} (untransformed) then measurement error in the explanatory variable is proportional rather than additive. In order to simulate monitor NO_{2} data with proportional error, we first simulate log_{e}(NO_{2}) as before but then back-transform (i.e. NO_{2} = exp(log_{e}(NO_{2}))) prior to calculating regional averages. For model data, we first simulate log_{e}(NO_{2}) as in Equation (1.2) but instead of setting c _{ i } = c, we use Equation (1.2a) and set σ _{ diff } = 0.268 for urban background log_{e}(NO_{2}) and σ _{ diff } = 0.210 for rural log_{e}(NO_{2}) (see Table 1). The data are then back-transformed. With NO_{2} rather than log_{e}(NO_{2}) as the explanatory variable in our epidemiological time-series analysis we set: α = 0.32 and β = 0.0003992 (i.e. e ^{β × 10} = 1.0040, indicating a 0.4% increase in mortality per 10 μg/m^{3} increase in NO_{2}).
Statistical analysis of simulated time series
For each of the 7 time-series scenarios considered in each of Tables 2, 3 and 4, 1000 simulated data sets were produced and each analysed separately using Poisson regression with grid as a fixed effect. As a result, 1000 separate estimates of both the health effect ($\widehat{\beta}$) and its standard error, $\mathit{SE}\left(\widehat{\beta}\right)$, were obtained. Statistics presented in Tables 2, 3 and 4 include estimate averages and estimates of the coverage probability and power. An estimate of coverage probability records the percentage of simulations where the 95% confidence interval contains the “true” value of β and an estimate of power records the percentage of simulations that would have detected the health effect estimate as statistically significant at the 5% significance level.
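The summary statistics described above can be computed from the simulated estimates along the following lines. This is a minimal Python sketch with a hypothetical function name and made-up inputs; it assumes Wald-type 95% confidence intervals and a two-sided Wald test at the 5% level (critical value 1.96), which may differ in detail from the intervals and tests reported by the regression software used in the study.

```python
def summarise_simulations(estimates, std_errors, beta_true, z_crit=1.96):
    """Average estimate, coverage probability (%) and power (%) across
    simulated data sets, using Wald intervals and tests (an assumption)."""
    n = len(estimates)
    covered = 0
    significant = 0
    for b, se in zip(estimates, std_errors):
        # Does the 95% confidence interval contain the 'true' beta?
        if b - z_crit * se <= beta_true <= b + z_crit * se:
            covered += 1
        # Is the estimate significantly different from zero at the 5% level?
        if abs(b) > z_crit * se:
            significant += 1
    return {
        "mean_estimate": sum(estimates) / n,
        "coverage_pct": 100.0 * covered / n,
        "power_pct": 100.0 * significant / n,
    }

# Three made-up simulation results, for illustration only
summary = summarise_simulations([1.0, 2.0, 0.1], [0.5, 0.2, 0.2], beta_true=1.5)
print(summary)
```

In this toy input only the first interval contains the “true” value, and the first two estimates are significant, giving coverage of one in three and power of two in three.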
Finally using established theory (See Additional file 2) we obtained predictions of the attenuation in β that we might expect from using CTM data or data from a single monitor per region. These predictions were then compared to the corresponding results obtained from our simulations.
Error decomposition
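In order to aid interpretation of our simulation results for the CTM data, we decomposed the grid-specific error variance $\mathit{var}\left({Z}_{i}-{X}_{i}^{*}\right)$ into two components, a classical-like component (CC), and a Berkson-like component (BC) as follows:
$$\mathit{var}\left({Z}_{i}-{X}_{i}^{*}\right)=\mathit{CC}+\mathit{BC}$$
where
$$\mathit{CC}=\mathit{var}\left({Z}_{i}\right)-\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)$$
and
$$\mathit{BC}=\mathit{var}\left({X}_{i}^{*}\right)-\mathit{cov}\left({X}_{i}^{*},{Z}_{i}\right)$$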
Estimates of CC and BC were then obtained using the observed data (See Additional file 3 for further details and calculations).
Results
Comparing “true” values of the regression coefficient, β, (e.g. β × 10 = 0.00399 for urban background ozone) with those based on simulated data, $\left(\widehat{\beta}\right)$, Tables 2 and 3 suggest that the use of regional average monitor data as a surrogate for grid-specific “true” ambient concentrations has limited impact on health effect estimates unless the number of monitors per 25 km × 25 km region falls below 3 (or possibly 5 in the case of rural log_{e}(NO_{2})). The monitoring scenario which produced the largest bias in the health effect for all four pollutants was that of a single monitor per 25 km × 25 km region. The regression coefficient was attenuated by an estimated 6% for urban ozone, 13% for rural ozone, 29% for urban log_{e}(NO_{2}) and 38% for rural log_{e}(NO_{2}). By contrast when we used grid-specific model data, the regression coefficient was attenuated by an estimated 19% for urban ozone, 22% for rural ozone, 54% for urban log_{e}(NO_{2}) and 44% for rural log_{e}(NO_{2}). Thus, although for rural log_{e}(NO_{2}) results were similar to those of the 1 monitor per region scenario, for urban and rural ozone, urban log_{e}(NO_{2}) and for less sparse monitoring networks the use of model rather than monitor data appeared to produce a more marked level of bias in the health effect estimate. Comparison of the “true” values of the regression coefficient with those based on simulated “true” data (Tables 2 and 3) suggests that our findings are not simply due to an inadequate number of simulations.
Of particular note are the small coverage probabilities for log_{e}(NO_{2}), especially when using the grid-specific model data, but also evident when using measured rural data from a single monitor within each 25 km × 25 km region. These suggest that not only is there marked attenuation in the health effect estimate but that bias extends to the standard errors, such that few simulations produced a 95% confidence interval containing the “true” value of β (only 15% for urban background modelled log_{e}(NO_{2}) and 11% for rural modelled log_{e}(NO_{2}); Tables 2 and 3). As expected, statistical power for log_{e}(NO_{2}) is consistently higher than for ozone as the magnitude of the “true” effect to be detected is larger (i.e. a 0.4% increase in mortality per 10% increase in NO_{2} versus a 0.4% increase in mortality per 10 μg/m^{3} in ozone). Nevertheless, the use of grid-specific model data for urban and rural ozone and the use of either model or 1 monitor per region data for urban log_{e}(NO_{2}) appears to have a slightly adverse effect on power.
Table 4 presents results for NO_{2} assuming proportional measurement error (i.e. additive on a log scale) but where the relationship of interest is with the untransformed variable. Overall, compared to log_{e}(NO_{2}), power-loss due to measurement error was similar but coverage probabilities, particularly for model data, improved. Model data and the single monitor scenario registered the largest attenuation in the regression coefficient, but there was noticeable attenuation even when using regional averages based on 5 monitors per 25 km × 25 km region.
Predictions from theory
For model data and for the 1 monitor scenario, established theory (see Additional file 2) allows us to predict the effects of additive measurement error on the health effect estimate. Table 5 illustrates that estimates of attenuation in β obtained by simulation are not that dissimilar from those obtained using standard theory in this simple case.
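The calculations in Additional file 2 are not reproduced here. As a rough illustration only, one standard result for simple linear regression is that replacing X* by a surrogate Z multiplies the slope by cov(X*, Z)/var(Z). Assuming this also approximates the behaviour of the Poisson model when β is small, the urban background ozone parameters given in the Methods imply the following predicted attenuation for model data (Python; variable names are illustrative):

```python
cov_xz = 455.78       # cov(X*, Z) for urban background ozone
var_z = 23.41 ** 2    # var(Z) for urban background ozone

# Expected ratio of the estimated to the 'true' regression coefficient
attenuation_factor = cov_xz / var_z
percent_attenuation = 100.0 * (1.0 - attenuation_factor)

print(round(percent_attenuation, 1))
```

Under these assumptions the predicted attenuation is a little under 17%, compared with the 19% attenuation estimated by simulation for urban ozone model data.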
Discussion
In the context of a time-series analysis of the association between daily concentration of air pollution and mortality, our study used simulation as a technique to contrast the effects on the estimation of that association of using grid-specific pollution data derived from a 3D chemistry-transport model as opposed to regional average air pollution concentrations derived from monitors. Pollution concentrations were simulated both with (i.e. monitor data), and without (i.e. “true” data) classical “instrument and monitor-location” error. The “true” data were then used in the statistical simulation of model data with the inclusion of both classical and Berkson-like error. The parameter estimates driving our simulations were based on both monitor and CTM daily maximum 8-hour mean ozone data for 35 urban background and 21 rural monitoring sites across the UK and on both monitor and CTM log_{e}(daily maximum 1-hour NO_{2}) data for 43 urban background and 14 rural monitoring sites across the UK. Within-grid correlations between observed monitor and CTM data were relatively strong with average correlation coefficients of 0.73 for rural ozone, 0.76 for urban ozone, 0.67 for rural log_{e}(NO_{2}) and 0.61 for urban log_{e}(NO_{2}). The lower correlations for log_{e}(NO_{2}) were likely a consequence of the shorter averaging time of the NO_{2} metric (i.e. 1-hour rather than 8-hour for ozone).
For both pollutants (i.e. ozone and log_{e}(NO_{2})), the use of a single monitor to provide estimated pollution concentrations for every 5 km × 5 km grid within a 25 km × 25 km region produced attenuated health effect estimates. This attenuation was less marked for the more spatially homogeneous long-lived pollutant ozone, for which the short distance correlations in Figure 1 were strong, than for the short-lived pollutant log_{e}(NO_{2}) for which the short distance correlations were considerably weaker. However for other scenarios, particularly those based on 5 or 10 monitors, the use of regional averages with additive rather than proportional error had little effect on health effect estimates. This concurs with the simulation findings of Sheppard et al. [12] who reported a “small but noticeable” attenuation in the health effect estimate when ambient area exposure to PM_{2.5} was based on a single pollution monitor, but little if any attenuation when area exposure was based on the average across 3 or 10 monitors.
Goldman et al. [16] recognized that a large proportion of the measurement error introduced by the use of average monitor concentrations is due to spatial variation and suggested that such error is predominantly Berkson, which, while reducing statistical power, will not on its own lead to bias in health effect estimates. However, as classical error is introduced, occurring as we introduce instrument error and monitor-site location error into our simulations and reduce the number of monitors on which averages are based, attenuation in the health effect estimate is observed. This is more pronounced for log_{e}(NO_{2}), particularly rural log_{e}(NO_{2}), than for ozone. This suggests, in line with the findings of others, that attenuation of the relative risk depends not only on instrument error but on the number and placement of monitors [6, 16, 23] and on the level of spatial variation [6, 23, 24]. As suggested by Goldman et al. [16], it may be the combination of these sources that determines the ultimate effect on relative risk estimates.
The combined effects of different error sources may also help to explain why contrary to expectation we found no evidence in Tables 2 and 3 (i.e. additive measurement error) of any reduction in statistical power from the use of regional average monitor data based on 2, 3, 5 or 10 monitors per region, with any loss of power most noticeable for the 1 monitor scenario in particular in relation to urban log_{e}(NO_{2}).
The use of simulated model data produced attenuation in the health effect estimate, which for rural log_{e}(NO_{2}) was similar to that associated with the scenario of a single regional monitor. However for urban and rural ozone and particularly urban log_{e}(NO_{2}) regression coefficients were more biased towards the null than for the single monitor case. According to Sheppard et al. [25] classical error can result not only in an attenuated health effect estimate but also lead to a downward bias in the estimation of standard errors and thus to inaccuracy in the coverage of 95% confidence intervals. The appreciable bias in health effect estimates and coverage intervals based on simulated model data for log_{e}(NO_{2}) therefore implies the presence of predominantly classical rather than Berksonlike error in EMEPWRF CTM estimates of this pollution metric. In order to investigate this further we attempted using our comparison dataset to decompose random measurement error into its classicallike and Berksonlike components (Additional file 3). Our results suggested that indeed classical error predominates overwhelmingly in the log_{e}(NO_{2}) CTM data.
The use of NO_{2} rather than log_{e}(NO_{2}) (i.e. proportional rather than additive measurement error) appeared to lead to a marked improvement in the previously poor coverage probabilities of the model data but further attenuation in health effect estimates based on regional averages. However, these regional averages still tended to outperform model data, with the possible exception of the 1 monitor per 25 km × 25 km grid square scenario for rural NO_{2}, where monitor and model findings were comparable. The biasing effect of additive measurement error on grid means is effectively adjusted for by including grid as a fixed effect in our timeseries analyses, but this is not the case when measurement error is proportional. For model data with proportional error it is therefore important to note that our findings may depend to some extent on grid-specific mean pollution levels and the validity of the assumptions we make in simulating them (see Equation 1.2a).
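The point about grid fixed effects can be illustrated with a simplified sketch. It assumes a linear outcome model and a constant grid-level bias, either an additive offset or a proportional factor, rather than the random errors simulated in this study. All names and numerical values are hypothetical.

```python
import numpy as np

def within_grid_slope(x, y, grid):
    """Slope of y on x after subtracting grid means (a grid fixed-effects fit)."""
    xd = np.asarray(x, dtype=float).copy()
    yd = np.asarray(y, dtype=float).copy()
    for g in np.unique(grid):
        mask = grid == g
        xd[mask] -= xd[mask].mean()
        yd[mask] -= yd[mask].mean()
    return float(np.dot(xd, yd) / np.dot(xd, xd))

rng = np.random.default_rng(42)
n_grids, n_days, beta = 20, 1000, 0.5            # hypothetical settings
grid = np.repeat(np.arange(n_grids), n_days)
grid_mean = rng.uniform(20.0, 60.0, n_grids)      # hypothetical grid-mean pollution levels
x_true = grid_mean[grid] + rng.normal(0.0, 1.0, grid.size)
y = beta * x_true + rng.normal(0.0, 1.0, grid.size)

offset = rng.normal(0.0, 5.0, n_grids)            # additive grid-level bias
factor = np.linspace(0.5, 1.5, n_grids)           # proportional grid-level bias
z_additive = x_true + offset[grid]
z_proportional = x_true * factor[grid]

slope_true = within_grid_slope(x_true, y, grid)
slope_additive = within_grid_slope(z_additive, y, grid)
slope_proportional = within_grid_slope(z_proportional, y, grid)
```

In this sketch the additive offsets vanish when grid means are subtracted, so `slope_additive` matches `slope_true` up to floating-point error. The proportional factors rescale the day-to-day deviations within each grid, so `slope_proportional` differs from `slope_true`.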
One of the strengths of our simulation approach is that it allows the correlation between timeseries in different grids to vary according to the distance between those grids. However, in so doing we make the assumption that spatial dependence is characterised by a single linear function. In our regression analysis of the association between correlation and distance (Figure 1), the addition of a quadratic term was statistically significant for urban and rural ozone and for urban log_{e}(NO_{2}), although in all three cases the incorporation of this nonlinearity had a relatively small impact on the percentage of variance explained (explaining an additional 0.2, 1.6 and 1.6 percentage points respectively). We also assume that spatial dependence is independent of direction (i.e. isotropic) and geography (other than a distinction between urban and rural) and does not vary over time. This may not be the case if the study area contains point sources, the outflow from which may vary in direction, with direction itself varying over time due to changing weather conditions. Nevertheless, this is an assumption employed by other authors [5, 23] in this field, possibly because data sufficient to incorporate such features into simulation studies are not readily available or generalizable.
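The comparison of linear and quadratic fits described above can be sketched as follows. The data here are synthetic and the coefficients are hypothetical. They are not the monitor-pair correlations analysed in Figure 1, and serve only to show how the gain in percentage of variance explained might be computed.

```python
import numpy as np

def r_squared(distance, corr, degree):
    """Proportion of variance in corr explained by a polynomial in distance."""
    coeffs = np.polyfit(distance, corr, degree)
    fitted = np.polyval(coeffs, distance)
    ss_res = np.sum((corr - fitted) ** 2)
    ss_tot = np.sum((corr - corr.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic between-site correlations that decline with distance (hypothetical values).
rng = np.random.default_rng(0)
distance_km = rng.uniform(5.0, 500.0, 300)
corr = 0.9 - 0.001 * distance_km + 2e-7 * distance_km**2 + rng.normal(0.0, 0.05, 300)

r2_linear = r_squared(distance_km, corr, 1)
r2_quadratic = r_squared(distance_km, corr, 2)
gain_percentage_points = 100.0 * (r2_quadratic - r2_linear)
```

Because the linear model is nested within the quadratic one, the gain cannot be negative; the question of interest is whether it is large enough to matter for the simulation.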
Our simulations allow mean pollution concentrations to vary between grids, although we assume that they vary at random and do not take account of the fact that mean pollution concentrations in nearest-neighbour grids may be more similar than those at a distance. This could have implications for our results involving proportional measurement error. However, when we plotted the absolute difference in site means against distance for each pair of monitoring stations in our observed monitor data set, there was no evidence of a linear relationship, whether for log_{e}(NO_{2}) or ozone, urban or rural. Though in some ways reassuring, these findings may nevertheless be insensitive to differences in grid-mean pollution concentrations within urban areas, where for example background levels of NO_{2} tend to increase as one approaches the urban core [26], whilst background levels of O_{3} tend to decrease.
A further limitation is that we use the same variance to generate each within-grid timeseries and that timeseries, both modelled and monitored, are simulated without seasonal pattern or trend. Hence we do not consider the influence of time-dependent confounding variables, nor of other confounders or pollutants. However, the effects of measurement error in multi-pollutant models [4, 27] and in the presence of confounders have been considered by others [25, 28].
Although quantitatively the simulation parameters we used (and hence our results) only apply to the EMEPWRF model v3.7 for the British Isles, the simulation approach is generalizable and may be used in the evaluation of other chemistry-transport models in other areas.
Eulerian CTMs similar to the EMEP model discretize the real world using a fixed horizontal and vertical grid with no explicit information on within-grid variability of emissions. Linear emissions such as roads and/or point sources are averaged to the CTM horizontal resolution. This approximation may limit the model's ability to resolve the near-source chemistry and transport which is likely to occur near urban monitor sites. Moreover, the EMEP model was not designed to replicate the complex urban environment. Local dispersion models which can represent the fine-scale complexity of an urban environment are currently available (ADMS, ERG models). However, they are very computationally expensive, are limited to specific areas and rely on CTMs for boundary conditions in order to capture the regional import/export of pollutants.
The benefit of full temporal and UK coverage and the self-consistency of predicted chemical parameters should not be underestimated, and perhaps this benefit outweighs the failure to properly represent surface urban chemistry.
Our present findings suggest that there may be an appreciable penalty in using CTM data in spatially-resolved epidemiological timeseries studies, which for some pollutants in part weighs against the substantial benefits of such modelled data. These benefits include the opportunity to investigate pollutants (e.g. different particle measures) with sparse or zero monitor coverage, or pollutants from specific sources with direct relevance to policy formulation and evaluation, or the potential consequences of alternative future scenarios. For the simulations incorporating additive measurement error (Tables 2 and 3) and the input data used in this work, we found that monitor data outperformed model data in urban areas and in areas with at least 2 monitors per 25 km × 25 km grid square, but that the performance of monitor and model data for log_{e}(NO_{2}), at least in terms of power and attenuation in the regression coefficient, was similar in rural areas with only 1 monitor per 25 km × 25 km grid square. However, it is important to be clear that the impact of ‘measurement’ error as assessed in this paper is only one aspect of data performance relevant to the use of modelled versus monitored data in epidemiological studies. Monitored data themselves, typically characterised by sparse data from preferential (similar type) locations with measurement errors and often missing values, also have limitations which are often ignored. High resolution CTMs are continually being developed and our study suggests that further assessment of model error impact, including statistical simulation, as well as improved understanding of the performance of monitored data, would be useful.
Conclusions
Even if correlations between model and monitor data appear reasonably strong, additive classical measurement error in model data may lead to appreciable bias in health effect estimates. As process-based air pollution models become more widely used in epidemiological timeseries analysis because of their advantages in terms of geographical coverage and their potential to provide complete timeseries for all pollutant species of interest, assessments of error impact which include statistical simulation may be useful.
Abbreviations
CTM: Chemistry-transport model.
References
 1.
Liu K, Stamler J, Dyer A, McKeever J, McKeever P: Statistical methods to assess and minimise the role of intraindividual variability in obscuring the relationship between dietary lipids and serum cholesterol. J Chron Dis. 1978, 31: 399418. 10.1016/00219681(78)900048.
 2.
Armstrong B: Effect of measurement error on epidemiological studies of environmental and occupational exposures. Occup Environ Med. 1998, 55: 651656. 10.1136/oem.55.10.651.
 3.
Reeves GK, Cox DR, Darby SC, Whitley E: Some aspects of measurement error in explanatory variables for continuous and binary regression models. Statist Med. 1998, 17: 21572177. 10.1002/(SICI)10970258(19981015)17:19<2157::AIDSIM916>3.0.CO;2F.
 4.
Zeger SL, Thomas D, Dominici F, Samet JM, Schwartz J, Dockery D, Cohen A: Exposure measurement error in timeseries studies of air pollution: concepts and consequences. Environ Health Perspect. 2000, 108: 419426. 10.1289/ehp.00108419.
 5.
Goldman GT, Mulholland JA, Russell AG, Strickland MJ, Klein M, Waller LA, Tolbert PE: Impact of exposure measurement error in air pollution epidemiology: effect of error type in timeseries studies. Environ Health. 2011, 10: 6171. 10.1186/1476069X1061.
 6.
Lee D, Shaddick G: Spatial modelling of air pollution in studies of its shortterm health effects. Biometrics. 2010, 66: 12381246. 10.1111/j.15410420.2009.01376.x.
 7.
Steenland K, Deddens JA, Zhao S: Biases in estimating the effect of cumulative exposure in loglinear models when estimated exposure levels are assigned. Scand J Work Environ Health. 2000, 26: 3743. 10.5271/sjweh.508.
 8.
Fung KY, Krewski D: On measurement error adjustment methods in Poisson regression. Environmetrics. 1999, 10: 213224. 10.1002/(SICI)1099095X(199903/04)10:2<213::AIDENV349>3.0.CO;2B.
 9.
Anderson HR, Atkinson RW, Bremner SA, Carrington J, Peacock J: Quantitative systematic review of short term associations between ambient air pollution (particulate matter, ozone, nitrogen dioxide, sulphur dioxide and carbon monoxide), and mortality and morbidity. 2007, Report to Department of Health revised following first review, http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_121200,
 10.
World Health Organisation: Air quality guidelines: global update 2005. Particulate matter, ozone, nitrogen dioxide and sulphur dioxide. 2006, Copenhagen: WHO Regional Office for Europe, http://www.euro.who.int/en/whatwedo/healthtopics/environmentandhealth/airquality/publications/pre2009/airqualityguidelines.globalupdate2005.particulatematter,ozone,nitrogendioxideandsulfurdioxide,
 11.
Dominici F, Zeger SL, Samet JM: A measurement error model for timeseries studies of air pollution and mortality. Biostatistics. 2000, 1: 157175. 10.1093/biostatistics/1.2.157.
 12.
Sheppard L, Slaughter JC, Schildcrout J, Liu LJS, Lumley T: Exposure and measurement contributions to estimates of acute air pollution effects. J Expos Anal Environ Epidemiol. 2005, 15: 366376. 10.1038/sj.jea.7500413.
 13.
Szpiro AA, Paciorek CJ, Sheppard L: Does more accurate exposure prediction necessarily improve health effect estimates?. Epidemiology. 2011, 22: 680685. 10.1097/EDE.0b013e3182254cc6.
 14.
Szpiro AA, Sheppard L, Lumley T: Efficient measurement error correction with spatially misaligned data. Biostatistics. 2011, 12: 610623. 10.1093/biostatistics/kxq083.
 15.
StataCorp: Stata Statistical Software: Release 10. 2007, College Station, TX: StataCorp LP
 16.
Goldman GT, Mulholland JA, Russell AG, Gass K, Strickland MJ, Tolbert PE: Characterisation of ambient air pollution measurement error in a timeseries health study using a geostatistical simulation approach. Atmos Environ. 2012, 57: 101108.
 17.
Cox DR, Hinkley DV: Appendix 3 Secondorder regression for arbitrary random variables. Theoretical Statistics. 1974, London: Chapman and Hall, 475477.
 18.
Automatic Urban and Rural Monitoring Network (AURN) Data Archive: Automatic Urban and Rural Monitoring Network (AURN) Data Archive. http://ukair.defra.gov.uk,
 19.
Simpson D, Benedictow A, Berge H, Bergström R, Emberson LD, Fagerli H, Flechard CR, Hayman GD, Gauss M, Jonson JE, Jenkin ME, Nyíri A, Richter C, Semeena VS, Tsyro S, Tuovinen JP, Valdebenito Á, Wind P: The EMEP MSCW chemical transport model  technical description. Atmos Chem Phys. 2012, 12: 78257865. 10.5194/acp1278252012.
 20.
Vieno M, Dore AJ, Stevenson DS, Doherty R, Heal MR, Reis S, Hallsworth S, Tarrason L, Wind P, Fowler D, Simpson D, Sutton MA: Modelling surface ozone during the 2003 heatwave in the UK. Atmos Chem Phys. 2010, 10: 79637978. 10.5194/acp1079632010.
 21.
Carslaw D: Defra regional and transboundary model evaluation analysis  phase 1, a report for Defra and the Devolved Administrations. 2011, http://ukair.defra.gov.uk/reports/cat20/1105091514_RegionalFinal.pdf,
 22.
Fagerli H, Gauss M, Benedictow A, Griesfeller J, Jonson JE, Nyíri Á, Schulz M, Simpson D, Steensen BM, Tsyro S, Valdebenito Á, Wind P, Aas W, Hjellbrekke AG, Mareckova K, Wankmüller R, Iversen T, Kirkevåg A, Seland Ø, Vieno M: Transboundary acidification, eutrophication and ground level ozone in Europe in 2009. EMEP Status Report 1/2011. 2011, Oslo: Norwegian Meteorological Institute
 23.
Peng RD, Bell ML: Spatial misalignment in time series studies of air pollution and health data. Biostatistics. 2010, 11: 720740. 10.1093/biostatistics/kxq017.
 24.
Kim SY, Sheppard L, Kim H: Health effects of longterm air pollution: influence of exposure prediction methods. Epidemiology. 2009, 20: 442450. 10.1097/EDE.0b013e31819e4331.
 25.
Sheppard L, Burnett RT, Szpiro AA, Kim SY, Jerrett M, Pope CA, Brunekreef B: Confounding and exposure measurement error in air pollution epidemiology. Air Qual Atmos Health. 2012, 5: 203216. 10.1007/s1186901101409.
 26.
Strickland MJ, Darrow LA, Mulholland JA, Klein M, Flanders WD, Winquist A, Tolbert PE: Implications of different approaches for characterizing ambient air pollutant concentrations within the urban airshed for timeseries studies and health benefit analyses. Environ Health. 2011, 10: 3644. 10.1186/1476069X1036.
 27.
Carrothers TJ, Evans JS: Assessing the impact of differential measurement error on estimates of fine particle mortality. J Air Waste Manage Assoc. 2000, 50: 6574. 10.1080/10473289.2000.10463988.
 28.
Carroll RJ, Gallo PP, Glesser LJ: Comparison of least squares and errorsinvariables regression, with special reference to randomised analysis of covariance. J Am Stat Assoc. 1985, 80: 929932. 10.1080/01621459.1985.10478206.
Prepublication history
The prepublication history for this paper can be accessed here: http://www.biomedcentral.com/14712288/13/136/prepub
Acknowledgements
The authors would like to thank Zaid Chalabi (London School of Hygiene and Tropical Medicine) for reading the article in draft, picking up errors and making helpful suggestions which contributed to the intellectual content. The article was produced as part of the AWESOME project, which is funded by a grant from the Natural Environment Research Council (NERC, NE/I007938/1). The NERC grant includes full funding for BKB's current post at St George’s, University of London. We would also like to acknowledge use of monitor data from the UK Department for Environment, Food and Rural Affairs Automatic Urban and Rural Network (AURN), which is public sector information licenced under the Open Government Licence v1.0 [http://www.nationalarchives.gov.uk/doc/opengovernmentlicence/].
Additional information
Competing interests
MRH, RMD and MV have an academic interest in the EMEPWRF CTM and its development. There are no other conflicts of interest.
Authors’ contributions
BKB contributed to the design of the study, analysed the data, carried out the simulations and took the lead in drafting the paper. BA provided theoretical statistical expertise and contributed to the design and concept of the study. RWA and PW contributed to the design and concept of the study. MRH and RMD assembled the model data and the modelmonitor comparison data sets. MV is the main developer of the EMEPWRF regional chemistrytransport model and produced the model output. All authors contributed to the drafting of the paper, the interpretation of results and read and approved the final manuscript.
Cite this article
Butland, B.K., Armstrong, B., Atkinson, R.W. et al. Measurement error in timeseries analysis: a simulation study comparing modelled and monitored data. BMC Med Res Methodol 13, 136 (2013). https://doi.org/10.1186/1471228813136
Keywords
 Measurement error
 Epidemiology
 Timeseries
 Mortality
 Nitrogen dioxide
 Ozone