To elucidate our methodology, we differentiate between arrival, departure, and census counts for the NICU. We define the arrival count on a particular day as the number of patients admitted to the NICU during a 24 hour period. Similarly, the departure count for a particular day is defined as the number of patients who depart the NICU as a result of a healthy discharge during a 24 hour period. By healthy discharge we refer to cases where a patient was discharged from the NICU as a result of adequate physiological health, as determined by clinical criteria. Lastly, we define the daily census count as the number of patients residing in the NICU at the end of the day (11:59pm).
Our approach to modeling the NICU census follows [9] where the NICU census C(t+k) at time (t+k) can be concisely expressed as a function of several different components. More specifically, C(t+k) where k≥1, is a function of:

1.
C(t): NICU census at time (t).

2.
{A(t+1),A(t+2),…A(t+k)}: number of arrivals on each successive day from time (t+1) up to time (t+k).

3.
{D(t+1),D(t+2),…D(t+k)}: number of departures from the NICU census on each successive day from time (t+1) up to time (t+k).
The census at time (t+k) takes the form:
C(t+k)=\underset{\mathit{\text{observed}}}{\underset{\u23df}{C\left(t\right)}}+\underset{\mathit{\text{not observed}}}{\underset{\u23df}{\sum _{i=1}^{k}A\left(t+i\right)\sum _{i=1}^{k}D\left(t+i\right)}}
(1)
so that, the census at time (t+k) is equal to the census at time (t) plus the number of arrivals on each successive day from time (t+1) up to time (t+k), minus the number of departures on each successive day from time (t+1) up to time (t+k). We note that, at the current time (t), the only observable component in model (1) is C(t). As such, the predicted census at time (t+k), \hat{C}(t+k), must contain predictions for the number of arrivals and departures on each successive day up to time (t+k) from time (t+1), \hat{A}(t+i) and \hat{D}(t+i),i=1,2,\dots ,k. Thus, our census forecasting model can be succinctly expressed as:
\hat{C}(t+k)=C\left(t\right)+\sum _{i=1}^{k}\hat{A}(t+i)\sum _{i=1}^{k}\hat{D}(t+i)
(2)
Remark 1
One key assumption we make is that the number of arrivals A(t) is independent of the number of departures D(t) at time (t). In other words, the number of patients arriving in the NICU census at some time (t) does not provide information about the number of patients departing the NICU census at that same time. As will be described in further detail, the assumption of independence between arrivals and departures at some time (t) has important implications when computing the errors associated with our census predictions. In general, this is a sensible assumption when NICUs are operating below their maximum capacity, which was most often the case for the data presented here.
Remark 2
We note that the number of departures D(t+k) at time (t+k) is not independent of the number of arrivals A(t+k−i) at time (t+k−i), where i=1,2,…. That is, there is an upperbound on the number of departures at time (t+k) based on the cumulative number of arrivals from the preceding days.
Remark 3
In (2), the estimate for the census at time (t+k) is given as a point estimate, which is subject to uncertainty given the error in the estimation of the number of arrivals and departures. Stochastic or “ensemblebased” forecasting is used to account for this uncertainty through the use of multiple forecasts created with an individual forecast model. For example, we generate an ensemble of predictions of C(t+k), denoted \left\{{\hat{C}}^{\left(r\right)}\right(t+k),r=1,2,\dots ,M\}, where r represents a single realization and M is an upperbound on the number of realizations, and summarize over the ensemble to obtain more accurate census forecasts. We describe this procedure in further detail in Section “Ensemblebased forecasting and prediction intervals for census forecasts”.
Since obtaining census forecasts at some time (t+k) is contingent upon predictions for the number of arrivals as well as the number of departures from (t+1) up to and including time (t+k), we describe our proposed methodology for predicting the number of arrivals and departures in the paragraphs that follow.
Predicting the number of arrivals
We model daily arrival using the Poisson Autoregressive (PAR) model [10]. This choice is inspired by several studies that have demonstrated that daily arrival patterns in various hospital departments can be modeled as a Poisson process [11–13]. Moreover, this model incorporates the correlation between daytoday arrival counts. The model specification is as follows: let {A(t),t=1,2,…T}, denote a time series of arrival counts. We define {\mathcal{F}}_{t} as any covariate information, including previous arrival counts, available to the observer up to time t. Under the PAR model, denote the conditional expected arrival to be \mathbb{E}\left[A\right(t\left)\right{\mathcal{F}}_{t}]={\mu}_{t}, t=1,2,…,T.
Thus, the conditional mean of the p
^{th} order PAR model using a loglinear link function is given by:
log\left[{\mu}_{t}\right(\mathit{\beta}\left)\right]={\beta}_{0}+\sum _{i=1}^{p}{\beta}_{i}A(ti)
(3)
where β=[β
_{0},β
_{1},…,β
_{
p
}]^{T} is a vector of autoregressive parameters and A(t−i) represents the number of arrivals at time (t−i), i=1,2,…,p. Since arrival counts in the NICU tend to contain a seasonal pattern, we generalize equation (3) to include seasonality effects:
\begin{array}{l}log\left[{\mu}_{t}\right(\mathit{\beta},\varphi \left)\right]={\beta}_{0}+\sum _{k=1}^{K}{\varphi}_{k}cos\left(\frac{2\mathrm{\pi t}{\omega}_{k}}{T}\right)\\ \phantom{\rule{8.7em}{0ex}}+{\alpha}_{k}sin\left(\frac{2\mathrm{\pi t}{\omega}_{k}}{T}\right)+\sum _{i=1}^{p}{\beta}_{i}A(ti)\end{array}
(4)
Here, ω
_{
k
} represents the frequency, equivalently T/ω
_{
k
} is the period or the number of time points (days) needed to complete one cycle and ϕ
_{
k
} and α
_{
k
} capture the amplitude of the k
^{th} seasonality effect. To estimate the seasonality components, we examined the cyclical properties of the autocorrelation function (ACF). This approach is consistent with the fact that the spectrum  which gives a frequency decomposition of the variance in the time series  is the Fourier transform of the autocovariance (or autocorrelation) sequence. In our data, a visual inspection of the sample ACF showed that there is a cosinusuidal pattern with halfyear periodicity (one cycle every 26 weeks). To select the order (p) of the above model, we use the Bayesian Information Criterion (BIC) [14]. After determining the optimal order for the above model, which we call \stackrel{\u0307}{p}, we estimate the expected number of arrivals {\hat{\mu}}_{t}, via:
{\hat{\mu}}_{t}=exp\left[{\hat{\beta}}_{0}+\hat{\varphi}cos\left(\frac{2\mathrm{\pi t\omega}}{T}\right)+\sum _{i=1}^{\stackrel{\u0307}{p}}{\hat{\beta}}_{i}A\left(ti\right)\right]
(5)
where \hat{\mathit{\beta}} and \hat{\varphi} denote the conditional maximum likelihood estimates for β and ϕ respectively, where we treat the first \stackrel{\u0307}{p} arrivals, \left\{A\right(1),A(2),\dots ,A(\stackrel{\u0307}{p}\left)\right\} as fixed.
Predicting the number of departures
To predict the number of departures from a group of patients residing in the NICU, it is ideal to incorporate both patientspecific baseline covariate information (i.e., information collected upon admission to the NICU) and any covariate information collected throughout their stay in the NICU. For the data considered here, only birth weight and gestational age were obtained for each child upon NICU admission. Although both are known to be useful for predicting length of stay, physiologic information collected throughout their time in the NICU may dictate when patients are released and thus, largely impact overall length of stay [15]. In addition, we also seek a framework for predicting the number of departures to reflect the fact that the probability of a patient leaving k days from some time point (t) should be conditional on how many days that patient has already spent in the NICU.
Some notation
Suppose S
_{
i
} represents the number of days the i
^{th} patient has spent in the NICU and X
_{
i
} represents a vector of baseline covariates collected on the i
^{th} patient. Moreover, define {\mathbf{X}}_{i}^{\left(t\right)} as a vector of additional covariates obtained up to and including (t) days after admission to the NICU for the i
^{th} patient. Thus {\mathbf{Z}}_{i}^{\left(t\right)}={[\mathbf{X},{\mathbf{X}}^{\left(t\right)}]}_{i}, is a vector of baseline covariates plus any additional covariate information that is obtained for the i
^{th} patient up to and including (t) days after admission to the NICU. Further, let H
_{
s
}(t) represent the number of patients in the NICU at time (t) who have stayed (s) days in the NICU and suppose that {Y}_{i}^{(k,s)} is an indicator of whether or not patient i leaves the NICU in k days, given that they have already spent (s) days in the NICU. More specifically:
\phantom{\rule{16.0pt}{0ex}}{Y}_{i}^{(k,s)}=\left\{\begin{array}{cc}1& \text{if}\phantom{\rule{1em}{0ex}}\mathit{\text{LO}}{S}_{i}\le k+s,\text{given}\phantom{\rule{1em}{0ex}}\left(s\right)\text{days spent in the NICU}\\ 0& \text{if}\phantom{\rule{1em}{0ex}}\mathit{\text{LO}}{S}_{i}>k+s,\text{given}\phantom{\rule{1em}{0ex}}\left(s\right)\text{days spent in the NICU}\end{array}\right.
where L O S
_{
i
} represents the length of stay in the NICU for patient i. For example, when k=1 and s=0, we have the following,
\phantom{\rule{15.0pt}{0ex}}{Y}_{i}^{(1,0)}=\left\{\begin{array}{cc}1& \text{if}\phantom{\rule{1em}{0ex}}\mathit{\text{LO}}{S}_{i}\le 1\text{given (0) days spent in the NICU}\\ 0& \text{if}\phantom{\rule{1em}{0ex}}\mathit{\text{LO}}{S}_{i}>1\text{given (0) days spent in the NICU}\end{array}\right.
which indicates whether or not patient i leaves the NICU prior to or on 1 day after their admission to the NICU. When k=1 and s=1, {Y}_{i}^{(1,1)} indicates whether or not patient i leaves the NICU in 1 day after having spent 1 day in the NICU. Figure 1 further illustrates the Y
^{(k,s)} coding scheme.
Modeling the probability of departure
Let {\pi}^{\left(k\right)}({\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s)=P({Y}_{i}^{(k,s)}=1{\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s), which represents, at time (t), the probability that patient i, who has spent (s) days in the NICU, leaves the NICU within kdays. This is conditioned on their baseline covariate information as well as covariate information that is obtained for these patients up to and including (s) days after their admission to the NICU. We model the probability of departure from the NICU within kdays from time t among patients who have spent (s) days in the NICU at time t as a function of all available covariate information using the model,
g\left[{\pi}^{\left(k\right)}({\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s)\right]={\mathbf{Z}}_{i}^{\left(s\right)}{\mathit{\beta}}^{(k,s)}
(6)
where β
^{(k,s)} is a vector of model parameters specific to the model above; the superscript k reflects the fact that we are modeling the probability of departure from the NICU prior to or on k days from some time point and the superscript, s denotes that this model is conditional on those who have spent s prior days in the NICU. In the above expression, g(.) is an appropriate link function (i.e. logit, probit, complementary loglog, etc. since Y
^{(k,s)} is binary). Thus, {\hat{\pi}}^{\left(k\right)}({\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s) represents the estimated probability of departing the NICU prior to or on k days from time (t) for the i
^{th} patient who has spent s days in the NICU.
Predicting the number of departures
An estimate of the expected number of departures at time (t+k) from time (t) among patients who have spent (s) prior days in the NICU at time (t) is obtained by summing the predicted responses for each patient,
\hat{D}\left\{\right(t+k);S=s\}=\sum _{i=1}^{{H}_{s}\left(t\right)}{\hat{Y}}_{i}^{(k,s)}
(7)
where H
_{
s
}(t) denotes the number of patients at time (t) who have spent (s) days in the NICU and {\hat{Y}}_{i}^{(k,s)} is the predicted response for patient i, which can be taken as:

1.
{\hat{Y}}_{i}^{(k,s)} = {\hat{\pi}}^{\left(k\right)}({\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s) or,

2.
{\hat{Y}}_{i}^{(k,s)} is a sample from a \mathit{\text{Bernoulli}}\left({\hat{\pi}}^{\left(k\right)}\right({\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s\left)\right).
We opted for the later as it better aligns with the ensemblebased framework described in Section “Ensemblebased forecasting and prediction intervals for census forecasts”. Since the model in equation (7) only considers patients that have spent (s) days in the census at time (t), s
_{
i
}=s for all i. Based on the proposed framework, the total number of expected departures k days from time (t) among N patients residing in the NICU at time (t), is the sum of the expected number of departures for each S=0,1,…,R where R= max(S
_{
i
}), for i=1,2,…,N at time (t),
\hat{D}\left\{\right(t+k\left)\right\}=\sum _{s=0}^{R}\hat{D}\left\{\right(t+k);S=s\}
(8)
Due to sparseness of data for large R and also model feasibility, we propose setting R to a fixed value that remains constant across forecasts at different times. In summary, there are two main stages in our approach to predicting the number of departures among N patients residing in the census at time t. In the first stage, we predict the number of departures among patients who have spent S days in the census, where S={0,1,…,R} using model (7) and in the second stage we sum over all predictions obtained in the first stage to obtain an estimate of the expected number of departures from the census.
It is important to note that formula (8) provides an estimate of the expected number of departures kdays from time t only among the N patients residing in the census at time t. Given that the number of departures at time kdays from time t also depends on subjects that arrive between times t and t+k, reliable estimates of the number of departures must also consider these subjects. Since these subjects are not directly observed, we hereafter refer to them as pseudosubjects.
We estimate the probability of departure kdays from time t for each pseudosubject by taking a random sample  with replacement  of baseline covariates from our available data, {\mathbf{x}}_{i}^{\star}. Using {\mathbf{x}}_{i}^{\star}, we compute the probability of departure, {\hat{\pi}}^{\left(k\right)}({\mathbf{x}}_{i}^{\ast},{s}_{i}=s), prior to or on day (t+k), for the i
^{th}
pseudosubject. Putting this into a broader context, if we predict the number of arrivals on day t+1 to be {\hat{\mu}}_{t+1}, that is \hat{A}(t+1)={\hat{\mu}}_{t+1}, then we sample {\hat{\mu}}_{t+1} observations, with replacement, from our available data and use the baseline covariates from those {\hat{\mu}}_{t+1} observations to predict the number of departures prior to or on day t+k from that group. This process is repeated for day t+2 up to day t+k−1, with \hat{A}(t+2),\dots ,\hat{A}(t+k1) forming the basis of how many psuedosubjects are considered at each time point.
As noticed, forecasting census counts becomes increasingly more complex with increasing levels of uncertainty as one considers longer forecasts. While the above examples present the census forecasts as point predictions, it is often the case that better prediction performance can be achieved by summarizing over an ensemble of such forecasts (i.e. mean or median over the ensemble) [16]. In the section that follows, we present an ensemblebased method that can be simultaneously used to (i) obtain more reliable census forecasts and (ii) obtain prediction intervals for our census forecasts.
Ensemblebased forecasting and prediction intervals for census forecasts
To generate a representative sample of the possible future states of the census, we propose an ensemblebased procedure for obtaining census forecasts and note that this procedure can also be used to construct prediction intervals. As noted in Remark 1, this approach assumes independence between the number of arrivals A(t) and the number of departures D(t) for all t=1,2,…,T. To illustrate the procedure, recall that A(t)∼P o i s s o n(μ
_{
t
}), where
{\mu}_{t}=exp\left[{\beta}_{0}+\varphi cos\left(\frac{2\mathrm{\pi t\omega}}{T}\right)+\sum _{i=1}^{\stackrel{\u0307}{p}}{\beta}_{i}A\left(ti\right)\right]
Let, \mathit{\lambda}=\phantom{\rule{2.77626pt}{0ex}}{[{\beta}_{0},{\beta}_{1},\dots ,{\beta}_{\stackrel{\u0307}{p}},\varphi ]}^{T}, it follows that \hat{\mathit{\lambda}}\stackrel{.}{\sim}\mathit{\text{MVN}}(\mathit{\lambda},{\mathbf{V}}_{\lambda}), where \hat{\mathit{\lambda}} represents the maximum likelihood estimate of λ and V
_{
λ
} is the inverse of the Fisher Information matrix for λ. Additionally, recall that D(t+k)=\sum _{s=0}^{R}D\left\{\right(t+k);S=s\}, where D\left\{\right(t+k);S=s\}=\sum _{i=1}^{{H}_{s}\left(t\right)}{Y}_{i}^{(k,s)}. We have that {Y}_{i}^{(k,s)}\sim \mathit{\text{Bernoulli}}\left({\pi}^{\left(k\right)}\right({\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s\left)\right), where:
{\pi}^{\left(k\right)}({\mathbf{Z}}_{i}^{\left(s\right)},{S}_{i}=s)=\frac{exp\left\{\underset{i}{\overset{\left(s\right)}{\mathbf{Z}}}{\mathit{\beta}}^{(k,s)}\right\}}{1+exp\left\{\underset{i}{\overset{\left(s\right)}{\mathbf{Z}}}{\mathit{\beta}}^{(k,s)}\right\}}
(9)
without loss of generality, assuming a logistic regression model for the probability of departure. It follows then that {\hat{\mathit{\beta}}}^{(k,s)}\stackrel{.}{\sim}\mathit{\text{MVN}}({\mathit{\beta}}^{(k,s)},{\mathbf{V}}_{{\beta}^{(k,s)}}), where {\hat{\mathit{\beta}}}^{(k,s)} represents the maximum likelihood estimate of β
^{(k,s)} and {\mathbf{V}}_{{\beta}^{(k,s)}} represents the inverse Fisher Information matrix for β
^{(k,s)}.
Recall that a point estimate of census at time (t+k) from time (t) is obtained using equation (2). Thus, we generate an ensemble of predictions of C(t+k), denoted {\hat{C}}^{\left(r\right)}(t+k), where:
{\hat{C}}^{\left(r\right)}(t+k)=C\left(t\right)+\sum _{j=1}^{k}{\hat{A}}^{\left(r\right)}(t+j)\sum _{j=1}^{k}{\hat{D}}^{\left(r\right)}(t+j)
(10)
and r denotes a single realization. More precisely, we obtain {\hat{C}}^{\left(r\right)}(t+k) using the following procedure:
For r=1,2,…,M

1.
Denote λ ^{(r)} as a sample from a Multivariate normal distribution with mean \hat{\mathit{\lambda}} and variancecovariance {\hat{\mathbf{V}}}_{\lambda} and β ^{(r)(k,s)} as a sample from a Multivariate normal distribution with mean {\hat{\mathit{\beta}}}^{(k,s)} and variancecovariance {\hat{\mathbf{V}}}_{{\beta}^{(k,s)}}, for s=1,2,…,R.

2.
Conditional on λ ^{(r)} and β ^{(r)(k,s)}, compute {\hat{\mu}}_{t+j}^{\left(r\right)}=\mu \left({\mathit{\lambda}}^{\left(r\right)}\right) and {\hat{\pi}}_{i}^{\left(r\right)\left(k\right)}={\pi}^{\left(k\right)}({\mathbf{z}}_{i}^{\left(s\right)},{s}_{i}=s)\left{\mathit{\beta}}^{\left(r\right)(k,s)}\right) for j=1,2,…,k, i=1,2,…H _{
s
}(t), and s=1,2,…,R.

3.
Based on {\hat{\mu}}_{t+j}^{\left(r\right)} and {\hat{\pi}}_{i}^{\left(r\right)\left(k\right)}, obtain samples {\hat{A}}^{\left(r\right)}(t+j) and {\hat{Y}}_{i}^{\left(r\right)(k,s)} from a \mathit{\text{Poisson}}\left({\hat{\mu}}_{t+j}^{\left(r\right)}\right) and \mathit{\text{Bernoulli}}\left({\hat{\pi}}_{i}^{\left(r\right)\left(k\right)}\right) respectively, for j=1,2,…,k, i=1,2,…H _{
s
}(t), and s=1,2,…,R.

4.
Based on {\hat{A}}^{\left(r\right)}(t+j), j=1,2,…,k−1 sample {\hat{Y}}_{i}^{\star \left(r\right)(kj,0)} for each of the pseudosubjects.

5.
Compute {\hat{D}}^{\left(r\right)}(t+j), for j=1,2,…,k.

6.
Conditional on {\hat{A}}^{\left(r\right)}(t+j) and {\hat{D}}^{\left(r\right)}(t+j), for j=1,2,…,k, Compute {\hat{C}}^{\left(r\right)}(t+k).
End
where M is an upper bound prespecified by the use (i.e. M=1000). Our census forecasts are then obtained by summarizing over the ensemble of predictions, {\hat{C}}^{\left(r\right)}(t+k),r=1,2,\dots M, using for instance, the mean or median. Furthermore, this approach can also be used for the construction of 95% prediction intervals by computing the associated percentiles of the ensemble of predictions.
Up to now, we have considered the components of the census forecasting model, namely the arrivals and the departures, as separate entities. To provide further intuition, we provide an example of our proposed census forecasting model in Additional file 1.