Skip to main content
  • Technical advance
  • Open access
  • Published:

Simple estimators of the intensity of seasonal occurrence

Abstract

Background

Edwards's method is a widely used approach for fitting a sine curve to a time-series of monthly frequencies. From this fitted curve, estimates of the seasonal intensity of occurrence (i.e., peak-to-low ratio of the fitted curve) can be generated.

Methods

We discuss various approaches to the estimation of seasonal intensity assuming Edwards's periodic model, including maximum likelihood estimation (MLE), least squares, weighted least squares, and a new closed-form estimator based on a second-order moment statistic and non-transformed data. Through an extensive Monte Carlo simulation study, we compare the finite sample performance characteristics of the estimators discussed in this paper. Finally, all estimators and confidence interval procedures discussed are compared in a re-analysis of data on the seasonality of monocytic leukemia.

Results

We find that Edwards's estimator is substantially biased, particularly for small numbers of events and very large or small amounts of seasonality. For the common setting of rare events and moderate seasonality, the new estimator proposed in this paper yields less finite sample bias and better mean squared error than either the MLE or weighted least squares. For large studies and strong seasonality, MLE or weighted least squares appears to be the optimal analytic method among those considered.

Conclusion

Edwards's estimator of the seasonal relative risk can exhibit substantial finite sample bias. The alternative estimators considered in this paper should be preferred.

Peer Review reports

Background

In a classic paper, Edwards [1] describes a geometrically motivated, moment-based method to fit a sine curve to a time series of square-root transformed monthly frequencies. From this basic framework, he derived both a test of the null hypothesis of no seasonality and an estimator of the intensity of seasonal occurrence (i.e., the peak-to-low ratio of the fitted sine curve). Owing to its intuitive appeal and computational simplicity, Edwards's and related methods have been widely used in epidemiology in studies of seasonality, e.g., [2–7].

Although there has been considerable discussion of the hypothesis testing procedure described by Edwards and a variety of alternative tests have been proposed [8–12], there has been relatively little discussion of the properties of Edwards's estimator of the intensity of seasonal occurrence. St. Leger discusses some computational difficulties involved with maximum likelihood estimation of the parameters in Edwards's model [13]. Nam compared the performance of the MLE with a moment-based "locally reasonable" estimator, similar to Edwards's estimator, and concluded that the MLE was preferable when the seasonal trend was strong [14].

In this paper, we review various approaches to the estimation of the intensity of seasonal occurrence, including Edwards's methods, least squares, weighted least squares, and the MLE. We then propose a new closed-form moment estimator of the peak-to-low ratio based on non-transformed data and a second-order moment statistic. Through an extensive Monte-Carlo simulation study, we compare the finite sample performance of the estimators discussed in this paper across a variety of data generating distributions, including some that involve overdispersion and autocorrelation of the outcome and thus depart from the assumed model. All estimators and confidence interval procedures discussed in this paper are applied in a reanalysis of data on the seasonal incidence of monocytic leukemia.

Methods

Data and Probability Model

Edwards's approach is used to study the seasonality of rare events that arise from an underlying non-homogeneous Poisson process with a rate given by the periodic function

λ(t) = μ{1 + αcos(2πt + θ)},

where μ is the total number of expected events in the year, t is the time in years, θ is the phase angle, and α is the hemi-amplitude of the periodic process.

We consider the situation in which the year is divided into k equally-sized intervals and aggregate data are available on the frequency of events occurring in each interval across T years. We denote the observed frequencies with N i , i = 1, ..., k.

Edwards's probability model for these data is a discrete approximation to the non-homogeneous Poisson process and models the observed counts as independent Poisson random variables with mean given by the periodic function

m i = n k { 1 + α cos [ 2 π k ( i − ϕ − 0.5 ) ] } , MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyBa02aaSbaaSqaaiabdMgaPbqabaGccqGH9aqpjuaGdaWcaaqaaiabd6gaUbqaaiabdUgaRbaakmaacmaabaGaeGymaeJaey4kaSIaeqySdeMagi4yamMaei4Ba8Maei4CamNaei4waSvcfa4aaSaaaeaacqaIYaGmcqaHapaCaeaacqWGRbWAaaGccqGGOaakcqWGPbqAcqGHsislcqaHvpGzcqGHsislcqaIWaamcqGGUaGlcqaI1aqncqGGPaqkcqGGDbqxaiaawUhacaGL9baacqGGSaalaaa@4EDE@
(1)

where i is the interval (e.g., quarter, month, week), and φ + 0.5 is the time of peak incidence. The parameter n is the total expected number of events across all years, i.e., n = μT.

In this paper, we focus on the estimation of the peak-to-low ratio of the process, also termed the intensity of seasonal occurrence or seasonal relative risk, and is given by

R = 1 + α 1 − α . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemOuaiLaeyypa0tcfa4aaSaaaeaacqaIXaqmcqGHRaWkcqaHXoqyaeaacqaIXaqmcqGHsislcqaHXoqyaaGccqGGUaGlaaa@36CF@

Edwards's Method

Edwards derives an estimator for α by first computing the distance from the origin to a re-scaled center of gravity of k point masses of weight N i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWaaOaaaeaacqWGobGtdaWgaaWcbaGaemyAaKgabeaaaeqaaaaa@2E91@ placed on the rim of a unit circle at angles θ i = 2πi/k, i = 1, ..., k. Using a first-order Taylor series expansion he derives an expected value for this quantity that depends on the true α. Setting the distance from the origin to the center of gravity equal to its expected distance and solving for α, Edwards derives a moment-based estimator for a given by

α ^ E = 4 ( ∑ i = 1 k N i sin ( θ i ) ) 2 + ( ∑ i = 1 k N i cos ( θ i ) ) 2 ∑ i = 1 k N i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaadaWgaaWcbaGaemyraueabeaakiabg2da9KqbaoaalaaabaGaeGinaqZaaOaaaeaadaqadaqaamaaqadabaWaaOaaaeaacqWGobGtdaWgaaqaaiabdMgaPbqabaaabeaaaeaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGRbWAaiabggHiLdGagi4CamNaeiyAaKMaeiOBa4MaeiikaGIaeqiUde3aaSbaaeaacqWGPbqAaeqaaiabcMcaPaGaayjkaiaawMcaamaaCaaabeqaaiabikdaYaaacqGHRaWkdaqadaqaamaaqadabaWaaOaaaeaacqWGobGtdaWgaaqaaiabdMgaPbqabaaabeaaaeaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGRbWAaiabggHiLdGagi4yamMaei4Ba8Maei4CamNaeiikaGIaeqiUde3aaSbaaeaacqWGPbqAaeqaaiabcMcaPaGaayjkaiaawMcaamaaCaaabeqaaiabikdaYaaaaeqaaaqaamaaqadabaWaaOaaaeaacqWGobGtdaWgaaqaaiabdMgaPbqabaaabeaaaeaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGRbWAaiabggHiLdaaaaaa@65DC@

Using the fact that the variance of N i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWaaOaaaeaacqWGobGtdaWgaaWcbaGaemyAaKgabeaaaeqaaaaa@2E91@ is approximately 1 4 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqcfa4aaSaaaeaacqaIXaqmaeaacqaI0aanaaaaaa@2E59@ , he shows that the approximate variance for α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ E is 2/N where N = ΣN i . Edwards estimates R by replacing α with α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ E , i.e.,

R ^ E = 1 + α ^ E 1 − α ^ E . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaadaWgaaWcbaGaemyraueabeaakiabg2da9KqbaoaalaaabaGaeGymaeJaey4kaSIafqySdeMbaKaadaWgaaqaaiabdweafbqabaaabaGaeGymaeJaeyOeI0IafqySdeMbaKaadaWgaaqaaiabdweafbqabaaaaOGaeiOla4caaa@3AB0@

In the subsequent sections, we borrow this geometric framework todevelop alternative estimators of α and R.

Moment-based Estimation of a Using Non-transformed Data

We consider two new estimators of α. Instead of basing these on square-root transformed data, we use the data in their original scale. The first estimator of α that we consider depends on the distance from the origin to the center of gravity of k masses of weight N i each placed on the rim of a unit circle in direction θ i = 2πi/k, i.e.,

D = ( 1 k ∑ i = 1 k N i sin ( θ i ) ) 2 + ( 1 k ∑ i = 1 k N i cos ( θ i ) ) 2 . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiraqKaeyypa0ZaaOaaaeaadaqadaqaaKqbaoaalaaabaGaeGymaedabaGaem4AaSgaaOWaaabCaeaacqWGobGtdaWgaaWcbaGaemyAaKgabeaakiGbcohaZjabcMgaPjabc6gaUjabcIcaOiabeI7aXnaaBaaaleaacqWGPbqAaeqaaOGaeiykaKcaleaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqWGRbWAa0GaeyyeIuoaaOGaayjkaiaawMcaamaaCaaaleqabaGaeGOmaidaaOGaey4kaSYaaeWaaeaajuaGdaWcaaqaaiabigdaXaqaaiabdUgaRbaakmaaqahabaGaemOta40aaSbaaSqaaiabdMgaPbqabaGccyGGJbWycqGGVbWBcqGGZbWCcqGGOaakcqaH4oqCdaWgaaWcbaGaemyAaKgabeaakiabcMcaPaWcbaGaemyAaKMaeyypa0JaeGymaedabaGaem4AaSganiabggHiLdaakiaawIcacaGLPaaadaahaaWcbeqaaiabikdaYaaaaeqaaOGaeiOla4caaa@610A@

Let D y = 1 k ∑ i = 1 k N i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqcfa4aaSaaaeaacqaIXaqmaeaacqWGRbWAaaGcdaaeWaqaaiabd6eaonaaBaaaleaacqWGPbqAaeqaaaqaaiabdMgaPjabg2da9iabigdaXaqaaiabdUgaRbqdcqGHris5aaaa@381F@ sin(θ i ) be the vertical component and D x = 1 k ∑ i = 1 k N i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqcfa4aaSaaaeaacqaIXaqmaeaacqWGRbWAaaGcdaaeWaqaaiabd6eaonaaBaaaleaacqWGPbqAaeqaaaqaaiabdMgaPjabg2da9iabigdaXaqaaiabdUgaRbqdcqGHris5aaaa@381F@ cos(θ i ) be the horizontal component of the distance from origin to the center of gravity of the k masses. Let N = ΣN i . From the exact expressions for E[D x |N] and E[D y |N] (see Additional file 1), a first-order approximation for E[D|N] is given by:

E [ D | N ] ≈ E [ D x | N ] 2 + E [ D y | N ] 2 = N α 2 k . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyrauKaei4waSLaemiraqKaeiiFaWNaemOta4Kaeiyxa0LaeyisIS7aaOaaaeaacqWGfbqrcqGGBbWwcqWGebardaWgaaWcbaGaemiEaGhabeaakiabcYha8jabd6eaojabc2faDnaaCaaaleqabaGaeGOmaidaaOGaey4kaSIaemyrauKaei4waSLaemiraq0aaSbaaSqaaiabdMha5bqabaGccqGG8baFcqWGobGtcqGGDbqxdaahaaWcbeqaaiabikdaYaaaaeqaaOGaeyypa0tcfa4aaSaaaeaacqWGobGtcqaHXoqyaeaacqaIYaGmcqWGRbWAaaGaeiOla4caaa@51F0@

Setting D equal to E[D|N] and solving for α yields the following moment-based estimator for α:

α ^ D = 2 k D N . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaadaWgaaWcbaGaemiraqeabeaakiabg2da9KqbaoaalaaabaGaeGOmaiJaem4AaSMaemiraqeabaGaemOta4eaaOGaeiOla4caaa@3632@
(2)

This estimator is the same as Nam's locally reasonable estimator [14]. It can also be derived from least-squares estimation of the parameters of the periodic model:

N i = β 0 + β 1 sin(θ i ) + β 2 cos(θ i ) + ε i ,

from which R is estimated as:

R ^ L S = β ^ 0 + β ^ 1 2 + β ^ 2 2 β ^ 0 − β ^ 1 2 + β ^ 2 2 . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaadaWgaaWcbaGaemitaWKaem4uamfabeaakiabg2da9KqbaoaalaaabaGafqOSdiMbaKaadaWgaaqaaiabicdaWaqabaGaey4kaSYaaOaaaeaacuaHYoGygaqcamaaDaaabaGaeGymaedabaGaeGOmaidaaiabgUcaRiqbek7aIzaajaWaa0baaeaacqaIYaGmaeaacqaIYaGmaaaabeaaaeaacuaHYoGygaqcamaaBaaabaGaeGimaadabeaacqGHsisldaGcaaqaaiqbek7aIzaajaWaa0baaeaacqaIXaqmaeaacqaIYaGmaaGaey4kaSIafqOSdiMbaKaadaqhaaqaaiabikdaYaqaaiabikdaYaaaaeqaaaaacqGGUaGlaaa@4A79@

This relation also suggests a two-step weighted least-squares estimator of R. In the first step, least squares is used to estimate the parameters in (3) and then predicted values of each N ^ i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOta4KbaKaadaWgaaWcbaGaemyAaKgabeaaaaa@2E91@ are generated. In the second step, the parameters of (3) are estimated using weighted least squares with weights given by w i = 1/ N ^ i MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOta4KbaKaadaWgaaWcbaGaemyAaKgabeaaaaa@2E91@ . The optimality of these weights assumes that the variance of N i is equal to the expected value of N i . This procedure could be iterated until the estimates and weights converge.

The second estimator of α that we consider is based on the second-order moment statistic D 2. This statistic is appealing because we can express the expected value of E[D 2|N] exactly, whereas E[D|N] is only available to a first-order approximation. Using the exact expressions for E [ D y 2 | N ] MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyrauKaei4waSLaemiraq0aa0baaSqaaiabdMha5bqaaiabikdaYaaakiabcYha8jabd6eaojabc2faDbaa@35C2@ and E [ D x 2 | N ] MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyrauKaei4waSLaemiraq0aa0baaSqaaiabdIha4bqaaiabikdaYaaakiabcYha8jabd6eaojabc2faDbaa@35C0@ (see Additional file 1), we see that

E [ D 2 | N ] = E [ D x 2 | N ] + E [ D y 2 | N ] = N k 2 { 1 − α 2 4 + α 2 N 4 } . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyrauKaei4waSLaemiraq0aaWbaaSqabeaacqaIYaGmaaGccqGG8baFcqWGobGtcqGGDbqxcqGH9aqpcqWGfbqrcqGGBbWwcqWGebardaqhaaWcbaGaemiEaGhabaGaeGOmaidaaOGaeiiFaWNaemOta4Kaeiyxa0Laey4kaSIaemyrauKaei4waSLaemiraq0aa0baaSqaaiabdMha5bqaaiabikdaYaaakiabcYha8jabd6eaojabc2faDjabg2da9KqbaoaalaaabaGaemOta4eabaGaem4AaS2aaWbaaeqabaGaeGOmaidaaaaakmaacmaabaGaeGymaeJaeyOeI0scfa4aaSaaaeaacqaHXoqydaahaaqabeaacqaIYaGmaaaabaGaeGinaqdaaOGaey4kaSscfa4aaSaaaeaacqaHXoqydaahaaqabeaacqaIYaGmaaGaemOta4eabaGaeGinaqdaaaGccaGL7bGaayzFaaGaeiOla4caaa@5F36@

Solving this expression for α yields the estimator

α ^ ∗ = 2 D 2 k 2 − N N ( N − 1 ) . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaadaahaaWcbeqaaiabgEHiQaaakiabg2da9iabikdaYmaakaaajuaGbaWaaSaaaeaacqWGebardaahaaqabeaacqaIYaGmaaGaem4AaS2aaWbaaeqabaGaeGOmaidaaiabgkHiTiabd6eaobqaaiabd6eaojabcIcaOiabd6eaojabgkHiTiabigdaXiabcMcaPaaaaSqabaGccqGGUaGlaaa@3F1A@

When D 2 is less than N/k 2 (the expected value of E[D 2|N] at α = 0), this estimator results in invalid (imaginary) estimates of α. To remedy this, we propose the following modified estimator

α ^ D 2 = 2 D 2 k 2 − N f N ( N − 1 ) , MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaadaWgaaWcbaGaemiraqKaeGOmaidabeaakiabg2da9iabikdaYmaakaaajuaGbaWaaSaaaeaacqWGebardaahaaqabeaacqaIYaGmaaGaem4AaS2aaWbaaeqabaGaeGOmaidaaiabgkHiTiabd6eaojabdAgaMbqaaiabd6eaojabcIcaOiabd6eaojabgkHiTiabigdaXiabcMcaPaaaaSqabaGccqGGSaalaaa@417E@

where

f = D 2 k 2 / N 1 + D 2 k 2 / N . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemOzayMaeyypa0tcfa4aaSaaaeaacqWGebardaahaaqabeaacqaIYaGmaaGaem4AaS2aaWbaaeqabaGaeGOmaidaaiabc+caViabd6eaobqaaiabigdaXiabgUcaRiabdseaenaaCaaabeqaaiabikdaYaaacqWGRbWAdaahaaqabeaacqaIYaGmaaGaei4la8IaemOta4eaaOGaeiOla4caaa@3F22@

This modification insures that the quantity inside the square root is always greater than or equal to zero. For small values of D 2, α ^ D 2 ≈ α ^ D MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaadaWgaaWcbaGaemiraqKaeGOmaidabeaakiabgIKi7kqbeg7aHzaajaWaaSbaaSqaaiabdseaebqabaaaaa@345A@ . As D 2 increases, f converges to 1 and α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ D2 corresponds to the estimator using the exact expression for E[D 2|N].

Given an estimate of α, R can be estimated by substituting α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ into the formula that relates R to α:

R ^ = 1 + α ^ 1 − α ^ . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaacqGH9aqpjuaGdaWcaaqaaiabigdaXiabgUcaRiqbeg7aHzaajaaabaGaeGymaeJaeyOeI0IafqySdeMbaKaaaaGccqGGUaGlaaa@36FF@

Ratio estimators such as R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ are known to be biased upwards, particularly with sparse data. Later we discuss a bias-correction term for this estimate of R.

Confidence Intervals for R

Constructing confidence intervals for R is problematic because the null value lies on the boundary of the points of support for R. Frangakis and Varadhan recently proposed an approach for computing exact confidence limits for the seasonal relative risk derived from simulation and maximum likelihood estimation of parameters in a circular normal probability model.[19] Their approach can be adapted to estimate confidence intervals for any of the moment estimators proposed in this paper.

The approach involves finding the roots of the function h(R) = | R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ - R| - q(R; α), where q(R; α) is the 1 - α quantile of | R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ - R|. Note that q(R; α) depends on a particular estimator, although we do not make this explicit in the notation. The lower confidence limit is either zero or the value of the smaller root, whichever is larger. The upper confidence limit is the value of the larger root. Since q cannot be expressed in closed form, it is estimated via simulation. For a given value of R, data are simulated from the probability model and | R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ - R| is computed for each simulated data set. In the simulation, the parameter φ can be held fixed at its estimated value. The value of q(R; α) is then estimated by taking the empirical 1 - α quantile of the simulated values of q. The roots of h can be found by using an iterative algorithm.

For the estimators considered in this paper, it is possible that the function h will have only one root. This situation occurs when the number of events is small and/or the seasonality is strong enough so that no upper bound can be placed on the strength of seasonality (the fitted trough of the sine curve is close to zero). When only a single root is found, we set the upper confidence limit to infinity.

While this approach yields confidence intervals that are correct under the assumed probability model, it is computationally intensive and requires specialized software. We also consider a simple ad hoc approach for the estimation of approximate confidence limits for R. This approach is based on a normal approximation to the sampling distribution of log( R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ ). We enforce the boundary constraint by truncating the lower confidence limit at one. This procedure yields a lower limit given by:

R ^ L = max ( exp [ ln ( R ^ ) − Z 1 − α / 2 SE ^ ( ln ( R ^ ) ) ] , 1 ) . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaadaWgaaWcbaGaemitaWeabeaakiabg2da9iGbc2gaTjabcggaHjabcIha4naabmaabaGagiyzauMaeiiEaGNaeiiCaa3aamWaaeaacyGGSbaBcqGGUbGBcqGGOaakcuWGsbGugaqcaiabcMcaPiabgkHiTiabdQfaAnaaBaaaleaacqaIXaqmcqGHsislcqaHXoqycqGGVaWlcqaIYaGmaeqaaOWaaecaaeaacqqGtbWucqqGfbqraiaawkWaaiabcIcaOiGbcYgaSjabc6gaUjabcIcaOiqbdkfaszaajaGaeiykaKIaeiykaKcacaGLBbGaayzxaaGaeiilaWIaeGymaedacaGLOaGaayzkaaGaeiOla4caaa@5616@

The upper limit is unbounded and given by

R ^ U = exp [ ln ( R ^ ) − Z 1 − α / 2 SE ^ ( ln ( R ^ ) ) ] . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaadaWgaaWcbaGaemyvaufabeaakiabg2da9iGbcwgaLjabcIha4jabcchaWnaadmaabaGagiiBaWMaeiOBa4MaeiikaGIafmOuaiLbaKaacqGGPaqkcqGHsislcqWGAbGwdaWgaaWcbaGaeGymaeJaeyOeI0IaeqySdeMaei4la8IaeGOmaidabeaakmaaHaaabaGaee4uamLaeeyraueacaGLcmaacqGGOaakcyGGSbaBcqGGUbGBcqGGOaakcuWGsbGugaqcaiabcMcaPiabcMcaPaGaay5waiaaw2faaiabc6caUaaa@4EA9@

A first-order Taylor series approximation for the standard error for the sampling distribution of log( R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ ) is given by

SE ^ ( ln ( R ^ ) ) ≈ 2 V A ^ R ( α ^ ) ( 1 + α ^ ) ( 1 − α ^ ) . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWaaecaaeaacqqGtbWucqqGfbqraiaawkWaaiabcIcaOiGbcYgaSjabc6gaUjabcIcaOiqbdkfaszaajaGaeiykaKIaeiykaKIaeyisISBcfa4aaSaaaeaacqaIYaGmdaGcaaqaaiabbAfawjqbbgeabzaajaGaeeOuaifabeaacqGGOaakcuaHXoqygaqcaiabcMcaPaqaaiabcIcaOiabigdaXiabgUcaRiqbeg7aHzaajaGaeiykaKIaeiikaGIaeGymaeJaeyOeI0IafqySdeMbaKaacqGGPaqkaaGccqGGUaGlaaa@4C12@

For all estimators, V A ^ R ( α ^ ) ≈ 2 / N MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeeOvayLafeyqaeKbaKaacqqGsbGucqGGOaakcuaHXoqygaqcaiabcMcaPiabgIKi7kabikdaYiabc+caViabd6eaobaa@375B@ .

Simulation Study

We compared the various estimators discussed in this paper in a comprehensive Monte Carlo simulation study. Initially, we set k = 12 (corresponding to monthly observations) with n = 150, n = 500, and n = 2500. For each setting of k and n, we simulated data for values of R ranging from 1.05 to 3.05 in increments of 0.25.

For each simulated data set, we evaluate the following five estimators of R:

1. R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ E : an estimate of R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ using Edwards's estimator of α,

2. R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ LS : an estimate of R using least squares,

3. R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2: an estimate of R using α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ D2,

4. R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS : an estimate of R using weighted least squares,

5. R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ MLE : an estimate of R using the maximum likelihood estimate of α.

We consider various perturbations of these baseline parameters in sensitivity analyses. First, we set k = 52 (corresponding to weekly observations) with n = 1000, n = 5000 and n = 10000. We also simulated data under two different probability models that departed from the assumed model: 1) a negative binomial model with the mean given by Edwards's model (1), but in which the counts were overdispersed with variance given by VAR[N i ] = 1.5E[N i ]; and 2) a model that generated data with a marginal mean given by Edwards's model, but in which the counts were strongly autocorrelated and overdispersed. We created autocorrelation and overdispersion among the observations by simulating N 1 using Edwards's model, and then generating each N i , i = 2, ..., k by simulating Q i from Edwards model and then letting N i = Q i + 0.1{E[N i-1] - N i-1}.

Additionally, we use the simulation results to evaluate the adequacy of the ad hoc confidence interval procedure suggested in section 2.4. For each simulated data set, we compute a 95% confidence interval for R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS , and R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ MLE and record the relative frequency of estimated confidence intervals that contain the true parameter.

Computation

All simulations were performed in SAS V9.1 running on a Windows XP platform using software created by the authors. The maximum likelihood estimates were found using PROC NLMIXED in which the likelihood function (conditional on N) is maximized using a Newton-Raphson algorithm with a line search and boundary constraint (see Additional file 2 for example program). For the Monte Carlo simulation study, the true parameter value was used as the starting point for the maximization routine. The weighted least-squares estimates were obtained in a two-step procedure using PROC GENMOD.

Results

In table 1, we report the bias and MSE from the baseline simulation. For all values of n and R, the new estimator R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 had the smallest bias of all those considered. For n = 150, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 also had the smallest MSE for all values of R. For n = 500 and n = 2500, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 had minimal or close to minimal MSE for smaller values of R (R < 1.85); however, for large values of R, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS and R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ MLE were better from the MSE perspective. The MSE of the estimator R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ LS was similar, but sometimes slightly larger, than that of R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS . Edwards's estimator was the most biased and had the largest MSE for all values of n and R. All estimators evaluated were biased upwards for values of R close to unity, a consequence of the behavior of the estimators near the boundary.

Table 1 Estimated bias and MSE for each estimator from the baseline simuala-tion for n = 150, 500, and 2500 based on 1,000 simulated datasets

In the sensitivity analyses in which we generated overdispersed and auto-correlated data, the same essential patterns prevailed. The bias of R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 was minimal for all values of R in each scenario. Edwards's estimator was the most biased and had the largest MSE for all values of n and R. In these simulations that depart from the assumed model, the MSE of R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS was better than R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ MLE for certain values of R and n. This result is likely due to the fact that the MLE is not based on the probability model used to generate the data. In table 2, we report the MSE of R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 relative to R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS for the overdispersed and auto correlated data-generating distributions, respectively. In these figures, relative MSEs below 1 indicate that R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 is preferable from the MSE perspective. Both figures reveal that the relative MSE increases with R. For small values of R and n, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 is preferable. For larger values of R, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS was preferable. These results were the most pronounced in the setting of autocorrelated data. The MSE of R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 was never more than 13% greater than R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS ; however, it was nearly half as much for small values of R. For the simulations in which k = 52, the estimator R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 continued to be the least biased, but there was little difference in MSE between R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS , and R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ MLE in terms of MSE across all values of R.

Table 2 Relative mean squared error of R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 to R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS

In table 3, we report the estimated coverage probabilities for the ad hoc confidence intervals computed for the estimators R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2, R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ LS , and R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ WLS . The actual coverage probabilities are close to correct, usually within one to two percentage points of the nominal 95%. The coverage probabilities for these confidence intervals in the setting of autocorrelation and overdispersion was substantially lower, with actual coverage probabilities ranging from 87% to 96%.

Table 3 Percentage of estimated ad hoc 95% confidence intervals that cover the true parameter

As a side note, the algorithm that we used to find the MLE experienced convergence problems close to R = 1. For R = 1.05, the MLE failed to converge in roughly 20% of the simulated data sets. This problem diminished as R increased. For R = 1.5 the MLE was located for 95% of the simulated data sets. This is likely to be a result of near non-identifiability of φ when the seasonality is weak. More computationally-intensive approaches, such as a grid search, might alleviate this problem; however, in the context of a simulation study, we required an approach that could converge rapidly. For all results discussed below, we excluded simulated data sets for which the MLE was not found. We found that the simulation results for the non-missing estimators were largely unaffected by the inclusion/exclusion of the simulations for which the MLE was not located.

Example: Seasonality of Monocytic Leukemia

We compared the estimators proposed in this paper with the MLE and the estimator of Edwards through a re-analysis of data on the seasonal incidence of monocytic leukemia in England and Wales from 1974–1998 (N = 2311, k = 12) with monthly counts given as (203, 203, 197, 206, 204, 216, 165, 161, 177, 179, 200, 200). We used data from the Office of National Statistics as reported by Eatough [7]. In Table 4, we report the point estimate and approximate 95% confidence limits corresponding to each of the five estimators considered in the simulation study. We also present the confidence limits computed using the method of Frangakis and Varadhan [19]. These confidence intervals could not be computed for the MLE because the convergence problems experienced by the maximization algorithm made the computation of q infeasible.

Table 4 Estimated peak-to-low ratio and 95% CI for the seasonal incidence of monocytic leukemia in England and Wales (1974–98) using four different estimators and two confidence interval procedures

The different estimators do not lead to substantively different interpretations of the data. Nevertheless, consistent with the results of the simulation, the estimators R ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaaaaa@2D12@ D2 are smaller than R LS and Edwards estimator. Given the large number of events and the fact that the data exhibit only moderate seasonality, the simulation study suggests that Edwards estimator should be only moderately biased for these data. The confidence intervals computed by the ad hoc confidence interval procedure were nearly identical to those of Frangakis and Varadhan.

Discussion

In this paper we have proposed a new estimator of the peak-to-low ratio of a periodic process and compared it to several alternative estimators, including Edwards's estimator, the MLE, and weighted least squares. Studies employing Edwards's method often involve very rare events and moderate seasonality. For these studies, the estimator proposed in this paper appears to be optimal. It has less bias and a smaller MSE than any of the estimators considered, including the MLE and weighted least squares. Weighted least squares was preferable from a MSE perspective in the setting of frequent outcomes or strong seasonality. We speculate that the simple estimator proposed in this paper improves upon the estimator of Edwards and the other moment-based estimator because it is based on an exact rather than an approximate expression for the distance from the origin to the center of gravity. We further speculate that the bias and inefficiency in the MLE is due to the small event rates considered in this paper.

The ad hoc confidence interval procedure that we evaluated performed reasonably well for data generated from Edwards's probability model. If more precise confidence intervals are needed, the computationally-intensive approach proposed by Frangakis and Varadhan can be employed [19]. Users should be aware that both of the confidence intervals considered in this paper are model based. If the underlying model is wrong, for example, in the setting of strongly autocorrelated or overdispersed data, the true coverage probabilities may differ from the nominal 95%.

Because ratio estimators are known to be biased upwards, particularly with sparse data, we also considered a bias-corrected estimator based on the expected value of a second-order Taylor series expansion of (1 + α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ )/(1 - α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ ) around α given by

E [ 1 + α ^ 1 − α ^ ] ≈ 1 + α 1 − α + 2 VAR [ α ^ ] ( 1 − α ) 3 = 1 + α 1 − α ( 1 + 2 VAR [ α ^ ] ( 1 + α ) ( 1 − α ) 2 ) . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqbaeqabiqaaaqaaiabdweafnaadmaajuaGbaWaaSaaaeaacqaIXaqmcqGHRaWkcuaHXoqygaqcaaqaaiabigdaXiabgkHiTiqbeg7aHzaajaaaaaGccaGLBbGaayzxaaGaeyisISBcfa4aaSaaaeaacqaIXaqmcqGHRaWkcqaHXoqyaeaacqaIXaqmcqGHsislcqaHXoqyaaGccqGHRaWkjuaGdaWcaaqaaiabikdaYiabbAfawjabbgeabjabbkfasjabcUfaBjqbeg7aHzaajaGaeiyxa0fabaGaeiikaGIaeGymaeJaeyOeI0IaeqySdeMaeiykaKYaaWbaaeqabaGaeG4mamdaaaaaaOqaaiabg2da9KqbaoaalaaabaGaeGymaeJaey4kaSIaeqySdegabaGaeGymaeJaeyOeI0IaeqySdegaaOWaaeWaaeaacqaIXaqmcqGHRaWkjuaGdaWcaaqaaiabikdaYiabbAfawjabbgeabjabbkfasjabcUfaBjqbeg7aHzaajaGaeiyxa0fabaGaeiikaGIaeGymaeJaey4kaSIaeqySdeMaeiykaKIaeiikaGIaeGymaeJaeyOeI0IaeqySdeMaeiykaKYaaWbaaeqabaGaeGOmaidaaaaaaOGaayjkaiaawMcaaiabc6caUaaaaaa@71F5@

This approximation led to the following bias-corrected estimator of R:

R ^ = 1 + α ^ 1 − α ^ ( 1 + 2 V A ^ R [ α ^ ] ( 1 + α ^ ) ( 1 − α ^ ) 2 ) − 1 . MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafmOuaiLbaKaacqGH9aqpjuaGdaWcaaqaaiabigdaXiabgUcaRiqbeg7aHzaajaaabaGaeGymaeJaeyOeI0IafqySdeMbaKaaaaGcdaqadaqaaiabigdaXiabgUcaRKqbaoaalaaabaGaeGOmaiJaeeOvayLafeyqaeKbaKaacqqGsbGucqGGBbWwcuaHXoqygaqcaiabc2faDbqaaiabcIcaOiabigdaXiabgUcaRiqbeg7aHzaajaGaeiykaKIaeiikaGIaeGymaeJaeyOeI0IafqySdeMbaKaacqGGPaqkdaahaaqabeaacqaIYaGmaaaaaaGccaGLOaGaayzkaaWaaWbaaSqabeaacqGHsislcqaIXaqmaaGccqGGUaGlaaa@5133@
(4)

We found that estimators based on this correction factor tended to be somewhat over-corrected, possibly because they are based on an approximation of the variance of α ^ MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGafqySdeMbaKaaaaa@2D84@ .

One important limitation of the estimators proposed in this paper is that they are based on the assumption of a single cyclical effect (harmonic) that can be well approximated by a sine curve. For more complex data, with multiple periodic components or a linear trend, alternative statistical methods should be used. For such data there exist more complex harmonic models [20, 12], spectral methods [21], and various periodic regression models. Also, we outline an approach to estimating seasonal intensity using a periodic generalized linear model that assumes a log link and a Poisson distributed outcome (see Additional file 3). This approach is based on a different model for the mean, i.e., that the log of the expected value of the counts is a sinusoidal function. However, it allows for the inclusion of covariates and extends naturally to variably-sized intervals through use of a Poisson offset.

Edwards's method has been widely used in epidemiology in studies of seasonality. In this paper we have shown that Edwards's estimator of the seasonal relative risk can be substantially biased. The estimator proposed in this paper represents a straightforward modification of Edwards's estimator. Like that of Edwards, it is a simple estimator that is available in closed form. For modest seasonality and small numbers of events, this estimator appears to have the best finite sample performance characteristics of those estimators considered.

For more frequent events or stronger seasonality, the weighted least-squares approach discussed in this paper is preferable and is easily implemented using standard statistical software.

References

  1. Edwards JH: The recognition and estimation of cyclical trends. Ann Hum Genet. 1961, 25: 83-86. 10.1111/j.1469-1809.1961.tb01501.x.

    Article  CAS  PubMed  Google Scholar 

  2. Yamaguchi S, Dunga A, Broadhead RL, Brabin B: Epidemiology of measles in Blantyre, Malawi: analyses of passive surveillance data from 1996 to 1998. Epidemiology and Infection. 2002, 129 (2): 361-369. 10.1017/S0950268802007458.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  3. Mamoulakis C, Antypas S: Cryptorchidism: seasonal variations in Greece do not support the theory of light. Andrologia. 2002, 34 (3): 194-203. 10.1046/j.1439-0272.2002.00492.x.

    Article  PubMed  Google Scholar 

  4. Ajdacic-Gross V, Wang J, Bopp M, Eich D, Rossler W, Gutzwiller F: Are seasonalities in suicides dependant on suicide method?. A reappraisal. Social Science and Medicine. 2003, 57 (7): 1173-1181. 10.1016/S0277-9536(02)00493-8.

    Article  PubMed  Google Scholar 

  5. Anderka M, Declercq E, Wendy S: A time to be born. Am J Public Health. 2000, 90 (1): 124-126. 10.2105/AJPH.90.1.124.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Seretakis D, Lagiou P, Lipworth L, Signorello LB, Rothman KJ, Trichopoulos D: Changing seasonality of mortality from coronary heart disease. JAMA. 1997, 278 (12): 1012-1014. 10.1001/jama.278.12.1012.

    Article  CAS  PubMed  Google Scholar 

  7. Eatough JP: Evidence of seasonality in the diagnosis of monocytic leukaemia. Brit J Cancer. 2002, 87 (5): 509-510. 10.1038/sj.bjc.6600497.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Hewitt D, Milner J, Cisma A, Pakula A: On Edwards's criterion of seasonality and a non-parametric alternative. Brit J Prev Soc Med. 1971, 25: 174-176.

    CAS  Google Scholar 

  9. Roger JH: A significance test for cyclic trends in incidence data. Biometrika. 1977, 64: 152-155. 10.1093/biomet/64.1.152.

    Article  Google Scholar 

  10. Rogerson P: A generalization of Hewitt's test for seasonality. Int J Epidemiol. 1996, 25: 644-648. 10.1093/ije/25.3.644.

    Article  CAS  PubMed  Google Scholar 

  11. Walter S, Elwood J: A test for seasonality of events with a variable population at risk. Br J Prev Soc Med. 1975, 29: 18-21.

    CAS  PubMed  PubMed Central  Google Scholar 

  12. Jones RH, Ford PM, Hamman RF: Seasonality comparisons among groups using incidence data. Biometrics. 1988, 44: 1131-1144. 10.2307/2531741.

    Article  CAS  PubMed  Google Scholar 

  13. St Leger AS: Comparison of two tests for seasonality in epidemiological data. Appl Statist. 1976, 25 (3): 280-286. 10.2307/2347236.

    Article  Google Scholar 

  14. Nam J: Efficient method for identification of cyclic trends in incidence data. Communications in Statistics-Theory and Methods. 1983, 12 (9): 1053-1068. 10.1080/03610928308828515.

    Article  Google Scholar 

  15. Rothman KJ: Episheet: Spreadsheets for the analysis of epidemiologic data. [http://www.drugepi.info/links/downloads/episheet.xls]

  16. Ihaka R, Gentleman RR: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics. 1996, 5: 299-314. 10.2307/1390807.

    Google Scholar 

  17. R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2003, ISBN 3-900051-00-3,, [http://www.R-project.org]

    Google Scholar 

  18. Savitzky A, Golay MJE: Smoothing and differentiation of data by simplified least squares procedures. Anal Chem. 1964, 36: 1627-1639. 10.1021/ac60214a047.

    Article  CAS  Google Scholar 

  19. Frangakis CE, Varadhan R: Confidence intervals for seasonal relative risk with null boundary values. Epidemiology. 2002, 13: 734-737. 10.1097/00001648-200211000-00022.

    Article  PubMed  Google Scholar 

  20. Pocock SJ: Harmonic analysis applied to seasonal variations in sickness absence. App Stat. 1974, 103-120. 10.2307/2346992.

    Google Scholar 

  21. Chatfield C: The analysis of time series: an introduction. 1996, St Edmundsbury Press Ltd, Suffolk, Fifth

    Google Scholar 

Pre-publication history

Download references

Acknowledgements

The authors are grateful for the helpful comments of Tim Lash and Claus Dethlefsen. M. Alan Brookhart is supported by a career development grant from the National Institute on Aging (AG-027400).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to M Alan Brookhart.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

KR conceived the project. Both authors contributed equally to evaluation and development of statistical methodology. MB carried out programming, simulation, and data analysis. MB drafted the manuscript. Both authors read and approved the final manuscript.

Electronic supplementary material

12874_2007_303_MOESM1_ESM.pdf

Additional file 1: Derivations. The file provides mathematical derivations of several expressions referenced in the paper. (PDF 26 KB)

12874_2007_303_MOESM2_ESM.pdf

Additional file 2: SAS Program. This file provides the SAS program used to located themaximum likelihood estimate of Edwards's model. (PDF 13 KB)

12874_2007_303_MOESM3_ESM.pdf

Additional file 3: Periodic generalized linear model approach to estimating seasonal intensity. The file outlines an approach to estimating the peak-to-low ratio using a periodic generalized linear model. (PDF 22 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Brookhart, M.A., Rothman, K.J. Simple estimators of the intensity of seasonal occurrence. BMC Med Res Methodol 8, 67 (2008). https://doi.org/10.1186/1471-2288-8-67

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/1471-2288-8-67

Keywords