### Statistical methods

The objective of the statistical analysis is to model changing seasonal variation. Using ordinary Poisson regression, consecutive time periods may be analyzed simultaneously, e.g. by dividing the study period into decades. However, several problems arise with this approach. First, the change between time periods is abrupt, which may be a crude assumption. Second, only data within a given time period are considered when estimating the seasonal variation. As the time periods are made shorter, the changes between periods become less abrupt, but the amount of data within each period shrinks, increasing the uncertainty of the estimates. Finally, the model assumption of independent data is violated, potentially yielding invalid results.

Considering the DGLM from an intuitive perspective, when analyzing weekly data, the estimation of the seasonal variation is performed for every week by considering the nearest 52 weeks and estimating the seasonal variation based on these data. The constraints imposed on the model ensure that the estimated seasonal variation only changes slightly when moving the 52-week window one week forward.

A DGLM consists of three parts: first, the multidimensional state model, which specifies how the underlying unobserved process is generated; second, the observation model, which specifies the relation between the latent process and the data, the latter being considered indirect measurements of the state model; and finally, the initial state model, which specifies the initialization of the latent process. The dynamic nature of DGLMs originates in the first-order Markovian evolution of the regression coefficients, which means that the coefficients are allowed to change over time. To model gradually changing seasonal variation in incidence rates of hospitalizations with stroke in AF patients, we assumed that both the state model and the initial state model were Gaussian and that the observation model was Poisson.

Denoting the time series of frequencies of hospitalizations with stroke in AF patients by {*y*_{t} | *t* = 1,…,*n*} and the corresponding number of subjects at risk by offset_{t}, the DGLM is formally denoted

*y*_{t} | *θ*_{t} ~ Poisson(*λ*_{t}),  log *λ*_{t} = log(offset_{t}) + *F*_{t}^⊤*θ*_{t}  (observation model)

*θ*_{t} = *G*_{t}*θ*_{t−1} + *ω*_{t}  (state model)

*θ*_{0} ~ N(*m*_{0}, *C*_{0})  (initial state model)

where the *ω*_{t}'s are serially uncorrelated and zero-mean Gaussian distributed with unknown and constant covariance matrix *W*. The latent process is denoted by *θ* = (*θ*_{1},…,*θ*_{n}) and is initialized by the initial state model. The matrices *F*_{t} and *G*_{t} are assumed known, along with *m*_{0} and *C*_{0}. However, all matrices may depend on a set of hyperparameters, denoted by the vector *φ*.

When both the state model and the observation model are Gaussian, inference on the latent process is obtained by use of the Kalman filter, which yields the conditional densities of the state vector given the data observed up to the current time point. The filter is a recursive updating scheme that updates the DGLM whenever a new data point becomes available. The Kalman smoother is a backward recursive updating scheme that updates the state model conditional on all data; hence, Kalman smoothing is a post-estimation procedure.
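The predict/update recursion of the Kalman filter can be sketched for a single time step as follows. This is a generic textbook implementation in Python, not the paper's code; `V` denotes the observation variance of the Gaussian observation model.

```python
import numpy as np

def kalman_step(m, C, y, F, G, W, V):
    """Advance the posterior N(m, C) of the state one time step, given scalar y.

    F : (p,) observation vector, G : (p, p) evolution matrix,
    W : (p, p) evolution covariance, V : observation variance.
    """
    # Predict: theta_t | y_{1:t-1} ~ N(a, R)
    a = G @ m
    R = G @ C @ G.T + W
    # One-step forecast of y_t: N(f, Q)
    f = F @ a
    Q = F @ R @ F + V
    # Update with the Kalman gain K
    K = R @ F / Q
    m_new = a + K * (y - f)
    C_new = R - np.outer(K, F @ R)
    return m_new, C_new
```

With a diffuse prior on a local level model, a single update essentially reproduces the observation, which is a quick sanity check on the recursion.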

When dealing with non-Gaussian observation models, the Kalman filter and smoother are not immediately applicable; for the model above, the derivation of the Kalman filter breaks down due to the non-normality of the observation model. As proposed by Durbin and Koopman, 2001
[22], the DGLM is linearized by calculating the first two derivatives of the observation model at a given trial value of the state vector and identifying an approximating Gaussian observation model. This is performed iteratively, applying the Kalman filter and smoother to the approximating Gaussian model to obtain a new trial value of the state vector until convergence is reached. Upon convergence, the likelihood function of the approximating Gaussian model has the same mode, and the same curvature at the mode, as that of the original DGLM. This procedure is commonly referred to as iterated extended Kalman smoothing. All of these inference procedures assume that all matrices, including the covariance matrix *W*, are known.
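For a Poisson observation model with log link, the linearization step has a simple closed form: matching the first two derivatives of the log likelihood at a trial value of the linear predictor yields a Gaussian pseudo-observation and pseudo-variance. A minimal sketch (the function name is ours):

```python
import numpy as np

def linearize_poisson(y, eta_tilde):
    """Gaussian approximation at trial value eta_tilde for y ~ Poisson(exp(eta)).

    The Poisson log likelihood l(eta) = y*eta - exp(eta) has
    l'(eta) = y - exp(eta) and l''(eta) = -exp(eta); matching these gives
    a working observation y_tilde ~ N(eta, v_tilde).
    """
    mu = np.exp(eta_tilde)
    v_tilde = 1.0 / mu                   # -1 / l''(eta_tilde)
    y_tilde = eta_tilde + (y - mu) / mu  # Newton-type working observation
    return y_tilde, v_tilde
```

Iterated extended Kalman smoothing alternates this linearization with a Gaussian filter/smoother pass until the trial values stabilize.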

The matrices *F*_{t} and *G*_{t} operate as design matrices for the observation model and the state model, respectively, and are commonly specified by the modeler for all *t*. In contrast, the covariance matrix *W*, which specifies the correlation structure both over time and between the coordinates in the evolution of *θ*_{t}, may be unknown and has to be estimated. Given that this covariance matrix is parameterized by a set of hyperparameters, the objective is to estimate these. For DGLMs, it has been recommended to estimate the hyperparameters using the EM algorithm
[23] initially, followed by an iterative numerical optimization algorithm
[24–26].

The EM algorithm is a two-step iterative numerical estimation procedure based on maximization of the likelihood function, with the property that the likelihood does not decrease at any step and that, for DGLMs, convergence to a stationary point is obtained
[24]. We applied the EM algorithm initially to bring the hyperparameter estimates into the region of the maximum of the likelihood function
[26]. A convergence criterion, in terms of either a maximum number of iterations or a given threshold, has to be provided by the modeler and may depend on the context.
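To make the M-step concrete, one standard update for the evolution covariance in a Gaussian state space model averages the smoothed second moments of the state innovations; it is shown here in a generic form (the exact update used in a given implementation may differ):

```latex
\hat{W}^{(k+1)} \;=\; \frac{1}{n}\sum_{t=1}^{n}
\mathrm{E}\!\left[\,(\theta_t - G_t\,\theta_{t-1})(\theta_t - G_t\,\theta_{t-1})^{\top}
\,\middle|\, y_1,\ldots,y_n\,\right],
```

where the expectation is taken with respect to the smoothing distribution computed at the current estimate, so each iteration requires one pass of the Kalman filter and smoother.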

Since convergence of the EM algorithm may be rather slow
[26], upon its convergence the estimation procedure is switched to another iterative numerical optimization method in order to pinpoint the actual maximum of the log likelihood function. However, the log likelihood function of a DGLM is not obtainable in closed form; hence, estimation based on sampling may be utilized. As proposed by Durbin and Koopman, 2001
[22], we calculated the log likelihood value by use of importance sampling. This sampling technique is a variance reduction simulation scheme which ensures efficient computation time. For a further description we refer to
[22].
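The importance-sampling idea can be sketched as follows: the log likelihood of the DGLM equals the Gaussian log likelihood of the approximating model plus a correction term estimated from state draws. The hypothetical helper below shows only how the pieces combine; the sampler and the two observation densities are assumed to be supplied elsewhere.

```python
import numpy as np

def is_loglik(loglik_gauss, log_w):
    """Combine the Gaussian log likelihood with importance weights.

    loglik_gauss : log L_g from the Kalman filter of the approximating model
    log_w        : array of log p(y | theta_i) - log g(y | theta_i) for
                   draws theta_i from the smoothing density of the
                   approximating Gaussian model g
    """
    m = log_w.max()  # log-sum-exp trick for numerical stability
    return loglik_gauss + m + np.log(np.mean(np.exp(log_w - m)))
```

When the approximating model is exact, all weights are equal and the correction term vanishes, recovering the Gaussian log likelihood.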

For the purpose of this study, we specified the linear predictor, *η*_{t}, as a sum of two terms: a secular trend, *T*_{t}, and a seasonal variation, *S*_{t}. The secular trend was modeled as a local linear trend which, in the context of a DGLM, is not restricted to a constant increase or decrease; due to the dynamic nature of the model, the trend is allowed to change in both directions. The seasonal variation was modeled as a sum of two sinusoids with periods of 12 and 6 months, respectively, which allows for some asymmetry of the seasonal pattern. The choice of the number of sinusoids is a trade-off between having a flexible model and reducing the complexity of the model, i.e. keeping the number of parameters low. We aggregated the data into seven-day intervals in order to eliminate a possible weekday effect and analyzed data from 1980 to 2011. The person-time at risk was used as offset, and incidence rates of stroke in AF patients per 100 person-years were analyzed. The peak-to-trough (PTT) ratio, an incidence rate ratio often reported as a measure of the intensity of seasonal variation in incidence rates, was calculated. Adjustment for possible confounding factors and modeling of effect modifiers may be performed with these models but is beyond the scope of this paper, as we outline the basic usage of DGLMs in modeling seasonal variation.

By imposing a specific structure on the evolution covariance matrix, *W*, we may ensure separation and interpretability of the coefficients in the state vector
[27]. This structure requires contextual considerations and may not be trivial to determine. For this study, we assumed that the two terms, the trend and the seasonal variation, were independent; hence, *W* became a block diagonal matrix. By imposing only strictly positive variances, all parameters of the model were allowed to change over time. No structure was assumed for the first block, corresponding to the covariance matrix of the trend, so the variance of the level was determined without regard to the variance of the slope; in addition, the level and slope were allowed to be dependent, i.e. to have non-zero covariance. We denote the variance of the level by *φ*_{1}, the variance of the slope by *φ*_{2}, and the covariance between level and slope by *φ*_{3}.

In contrast, we assumed a specific structure for the second block, corresponding to the covariance matrix of the seasonal variation. Acknowledging that the cosine function is merely a phase-shifted sine function, the two coefficients of each sinusoid were given equal variances, which were, however, allowed to differ between the two sinusoids. The variance of the first sinusoid is denoted by *φ*_{4} and the variance of the second by *φ*_{5}. Furthermore, the two coefficients of each sinusoid were allowed to be dependent, with non-zero covariances denoted by *φ*_{6} for the first sinusoid and *φ*_{7} for the second. Consequently, the second block consists of two pairwise equal variances in the diagonal and two within-sinusoid covariances. Finally, the four covariances between the coefficients of the two different sinusoids were assumed equal and are denoted by *φ*_{8}.

These constraints yielded a covariance matrix, *W*, parameterized by an eight-dimensional hyperparameter vector consisting of four variances and four covariances. This specification of the covariance matrix ensures that possible overdispersion in the count data is accounted for, since an additional variance term, beyond the Poisson variance, is present. Serial correlation in the data is likewise modeled by allowing non-zero variances in *W* [28].
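Under the assumptions above, *W* can be assembled from the eight hyperparameters. The sketch below uses an assumed coefficient ordering (level, slope, then the cosine and sine coefficients of each sinusoid):

```python
import numpy as np

def build_W(phi):
    """Assemble the 6x6 block diagonal evolution covariance from
    phi = (phi_1, ..., phi_8): four variances and four covariances."""
    p1, p2, p3, p4, p5, p6, p7, p8 = phi
    # Trend block: level and slope variances with free covariance
    trend = np.array([[p1, p3],
                      [p3, p2]])
    # Seasonal block: pairwise equal variances per sinusoid, within-sinusoid
    # covariances p6/p7, and a common between-sinusoid covariance p8
    seas = np.array([[p4, p6, p8, p8],
                     [p6, p4, p8, p8],
                     [p8, p8, p5, p7],
                     [p8, p8, p7, p5]])
    W = np.zeros((6, 6))
    W[:2, :2] = trend   # trend and seasonal variation are independent,
    W[2:, 2:] = seas    # so the off-diagonal blocks stay zero
    return W
```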

In order to assess whether the seasonal variation changed over time, we also fitted a DGLM differing only in that the covariance matrix of the seasonal variation contained only zero entries, corresponding to a static seasonal variation. Akaike's Information Criterion (AIC) was calculated for both models, and the model with the lowest AIC was considered the most parsimonious description of the data
[29].
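The comparison uses the standard definition AIC = 2*k* − 2 log *L*, where *k* counts the estimated hyperparameters and log *L* is the log likelihood obtained as described above; the placeholder values below are illustrative, not results.

```python
def aic(loglik, n_params):
    """Akaike's Information Criterion: 2k - 2*log-likelihood (lower is better)."""
    return 2 * n_params - 2 * loglik

# Hypothetical comparison: dynamic model (8 hyperparameters) versus the
# static-seasonal model with the seasonal block of W fixed to zero.
aic_dynamic = aic(loglik=-100.0, n_params=8)
aic_static = aic(loglik=-110.0, n_params=3)
```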

To identify possible misspecifications of this model, we performed a residual analysis based on residuals proposed by Jørgensen et al., 1999
[28]. The authors define several types of residuals for both the observation model and the latent model, based on either the Kalman filter or smoother, each of which may reveal misspecification of the corresponding model. We considered the residuals of the approximating Gaussian model provided by iterated extended Kalman smoothing.

Statistical software to perform the above calculations is freely available in R [30]. Specification of the DGLM is easily performed using the package sspir, which also includes an implementation of the EM algorithm specific to this kind of model, along with Kalman filtering and smoothing and iterated extended Kalman smoothing
[31, 32]. Using the package KFAS, calculation of the exact likelihood function is available
[33], and, combined with a standard non-linear optimization algorithm implemented in R, the hyperparameters may be estimated. This procedure is implemented in the function rrd in the package Peak2Trough; for further information we refer to
[34]. All analyses were performed using R version 2.13.2 on a 64-bit Intel® Core 2 E7300 2.66 GHz CPU with 4 GB RAM running Ubuntu 11.10.