# Bayes rules for optimally using Bayesian hierarchical regression models in provider profiling to identify high-mortality hospitals

- Peter C Austin^{1, 2, 3}

**8**:30

https://doi.org/10.1186/1471-2288-8-30

© Austin; licensee BioMed Central Ltd. 2008

**Received: **07 December 2007

**Accepted: **12 May 2008

**Published: **12 May 2008

## Abstract

### Background

There is a growing trend towards the production of "hospital report-cards" in which hospitals with higher than acceptable mortality rates are identified. Several commentators have advocated for the use of Bayesian hierarchical models in provider profiling. Several researchers have shown that some degree of misclassification will result when hospital report cards are produced. The impact of misclassifying hospital performance can be quantified using different loss functions.

### Methods

We propose several families of loss functions for hospital report cards and then develop Bayes rules for these families of loss functions. The resultant Bayes rules minimize the expected loss arising from misclassifying hospital performance. We develop Bayes rules for generalized 1-0 loss functions, generalized absolute error loss functions, and for generalized squared error loss functions. We then illustrate the application of these decision rules on a sample of 19,757 patients hospitalized with an acute myocardial infarction at 163 hospitals.

### Results

We found that the number of hospitals classified as having higher than acceptable mortality is affected by the relative penalty assigned to false negatives compared to false positives. However, the choice of loss function family had a lesser impact upon which hospitals were identified as having higher than acceptable mortality.

### Conclusion

The design of hospital report cards can be placed in a decision-theoretic framework. This allows researchers to minimize costs arising from the misclassification of hospitals. The choice of loss function can affect the classification of a small number of hospitals.

## Background

Public reporting of comparative health care performance by hospitals is part of current public policy in a range of jurisdictions. Several jurisdictions have released public report cards comparing hospital or physician-specific outcomes. California [1], Pennsylvania [2], Scotland [3], and Ontario, Canada [4] have released hospital-specific reports for mortality following admission for acute myocardial infarction (AMI). New Jersey [5], New York [6], Pennsylvania [7], and Massachusetts [8] have published hospital and surgeon-specific mortality rates following coronary artery bypass graft (CABG) surgery, while Ontario has published hospital-specific mortality rates [9].

It has been suggested that there are two main goals behind public reporting [10]. The first is providing information that can guide purchasing decisions by individual consumers or group purchasers such as employers or health maintenance organizations (HMOs). The alternative rationale for public performance reporting is to identify hospitals that require investment in quality improvement initiatives. Implicit in this theoretical framework for hospital report cards is the ability to identify hospitals from which one does not want to seek care, or that require investment in quality improvement initiatives. Furthermore, it implies that we are interested not just in a point estimate of a hospital's mortality rate or relative ranking, but in making a decision about whether to seek care from that hospital or about whether to invest in quality improvement initiatives at that hospital. Normand and Shahian discuss statistical and clinical issues related to provider profiling [11].

Several authors have demonstrated that even if perfect risk-adjustment were possible, random error would result in some hospitals being misclassified [12–14]. Different participants in the health care arena will place different values or costs on different types of misclassification and on the degree of misclassification. There are two types of misclassification: false positives (hospitals that truly had acceptable mortality but were classified as having unacceptably high mortality) and false negatives (hospitals that truly had unacceptably high mortality but were classified as having acceptable mortality). A health care consumer might place a higher value on information that minimizes false negatives, since they want to avoid purchasing or receiving care from a hospital with unacceptably high mortality. On the other hand, hospitals might put a higher value on information that minimizes false positives, since they want to avoid losing business if they are not low performers. The same argument could be made regarding targeting hospitals for quality improvement: false negatives would lead to lost opportunities to invest in quality improvement, and false positives would lead to unneeded investment in quality improvement.

Several investigators have suggested the use of Bayesian methods in provider profiling [15–17]. Spiegelhalter et al. used Bayesian methods to examine hospital-specific mortality rates following paediatric cardiac surgery [18]. Thomas et al. used an empirical Bayes model to estimate hospital-level mortality rates for Medicare patients in the United States [19]. Investigators have argued for these methods since they allow profiling to be guided by medical rather than statistical standards, allow hospitals with small caseloads to be included, eliminate regression to the mean bias, and allow the probability of acceptable provider performance to be calculated. Recently Austin examined the reliability and validity of four different Bayesian measures of hospital performance using Monte Carlo simulation methods [20]. Furthermore, the agreement between four different Bayesian methods was examined empirically [21].

Normand et al. proposed a method for provider profiling based upon posterior tail probabilities [16]. Recently, Austin and Brunner used Monte Carlo simulations to assess the accuracy of posterior tail probabilities derived from Bayesian hierarchical regression models for identifying hospitals with higher than acceptable mortality [13]. In doing so, they demonstrated that the use of posterior tail probabilities was the Bayes Rule associated with generalized 1-0 loss functions. However, beyond this initial result, Bayes rules for more complex loss functions have not been derived for hospital report cards. Furthermore, the impact of assuming different loss functions on the identification of hospitals with higher than acceptable mortality when Bayesian methods are employed has not been explored in the literature.

Accordingly, the objective of the current manuscript is two-fold. The first objective is to develop Bayes rules for several families of loss functions for hospital report cards when Bayesian hierarchical models are used. The second is to demonstrate the impact of assuming different loss functions on the number of hospitals identified as having unacceptably high mortality.

### Statistical model for classifying hospitals as mortality outliers

In this study, we used methods for provider profiling based on a Bayesian hierarchical regression model. Let p_{ij} denote the probability of mortality for the i^{th} patient treated at the j^{th} hospital. Furthermore, let X_{ij} denote the illness severity of this patient, which was assumed to be centered around the cohort average. Then the following model describes the relationship between mortality and illness severity:

logit(p_{ij}) = β_{0j} + β_{1j}X_{ij},

In this model, it is assumed that $\left(\begin{array}{l}{\beta}_{0j}\\ {\beta}_{1j}\end{array}\right)$, the hospital-specific vector of regression coefficients, follows a multivariate normal distribution: $\left(\begin{array}{l}{\beta}_{0j}\\ {\beta}_{1j}\end{array}\right)\sim MVN\left(\left(\begin{array}{l}{\beta}_{0}\\ {\beta}_{1}\end{array}\right),\left(\begin{array}{cc}{\sigma}_{1}^{2}& {\sigma}_{12}\\ {\sigma}_{12}& {\sigma}_{2}^{2}\end{array}\right)\right)$. Here, β_{0j} denotes the hospital-specific log-odds of death for an average patient, β_{1j} denotes the hospital-specific regression slope relating illness severity to the log-odds of death, and $\left(\begin{array}{cc}{\sigma}_{1}^{2}& {\sigma}_{12}\\ {\sigma}_{12}& {\sigma}_{2}^{2}\end{array}\right)$ denotes the variance-covariance matrix for the hospital-specific random effects. β_{0} denotes the mean hospital-specific log-odds of mortality for a patient with average disease severity in the population of hospitals. Finally, β_{1} denotes the mean hospital-specific slope relating illness severity to the log-odds of mortality in the population of hospitals.
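To make the model concrete, the following Python sketch simulates data from it using only the standard library. All numeric values (population means, standard deviations, correlation, hospital size) are illustrative assumptions, not estimates from this study, and the function names are hypothetical.

```python
import math
import random

def draw_hospital_effects(beta0, beta1, s1, s2, rho, rng):
    """Draw (beta_0j, beta_1j) from a bivariate normal with means (beta0, beta1),
    standard deviations (s1, s2) and correlation rho (so sigma_12 = rho * s1 * s2)."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    b0j = beta0 + s1 * z1
    b1j = beta1 + s2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return b0j, b1j

def simulate_hospital(b0j, b1j, n_patients, rng):
    """Simulate (severity, death) pairs for the patients of one hospital."""
    patients = []
    for _ in range(n_patients):
        x = rng.gauss(0.0, 1.0)                        # standardized illness severity X_ij
        p = 1.0 / (1.0 + math.exp(-(b0j + b1j * x)))   # inverse logit gives p_ij
        patients.append((x, 1 if rng.random() < p else 0))
    return patients

rng = random.Random(2008)
# Hypothetical population-level values, chosen only for illustration.
b0j, b1j = draw_hospital_effects(beta0=-2.0, beta1=0.9, s1=0.3, s2=0.15, rho=0.5, rng=rng)
patients = simulate_hospital(b0j, b1j, n_patients=200, rng=rng)
crude_rate = sum(death for _, death in patients) / len(patients)
```

Because severity is centered at zero, the intercept `b0j` is the log-odds of death for an average patient at that hospital, which is the quantity used for profiling.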

### Bayes Rules for provider profiling

In this section, we develop Bayes rules for specific cost functions. The resultant Bayes rules minimize the posterior expected cost due to incorrectly classifying hospital performance.

#### Notation

Let $\pi(\theta, X)$ denote the prior distribution of the parameter $\theta$, which denotes an individual hospital's random intercept. Let $\beta_{thresh}$ denote the threshold for defining acceptable quality of care. Then $\theta_{0} = \{\theta \mid \theta < \beta_{thresh}\}$ denotes the space of random effects that define hospitals with acceptable mortality, and $\theta_{0}^{c} = \{\theta \mid \theta > \beta_{thresh}\}$ denotes the space of random effects that define hospitals with unacceptably high mortality. Our definition of profiling is based on $\theta = \beta_{0j}$, the hospital-specific random intercept, which denotes the hospital-specific log-odds of death for an average patient in the population of patients. We exclude the random slopes, $\beta_{1j}$, from the definition of mortality outliers due to a lack of consensus on how to identify performance outliers using both random intercepts and random slopes. Furthermore, the intercept has a readily understandable interpretation that is related to mortality. As discussed by Gajewski et al., the choice of $\beta_{thresh}$ is primarily a clinical, rather than a statistical, decision [22]. For instance, the choice of $\beta_{thresh}$ can be informed by expert opinion as to what constitutes acceptable mortality for an average AMI patient.

Let the decision rule *d* be 1 if one decides that the hospital has poor performance and 0 if one decides that the hospital has acceptable performance. Then, we define a general loss function

$L(\theta, d) = f_{1}(\theta)\, d\, I(\theta < \beta_{thresh}) + f_{2}(\theta)(1-d)\, I(\theta > \beta_{thresh})$

Here $d\, I(\theta < \beta_{thresh})$ denotes a false positive: classifying a hospital as having higher than acceptable mortality when in reality it has acceptable mortality. $f_{1}(\theta)$ is the cost incurred for a false positive. Similarly, $(1-d)\, I(\theta > \beta_{thresh})$ denotes a false negative: classifying a hospital as having acceptable performance when in reality it has poor performance. $f_{2}(\theta)$ is the cost incurred for a false negative. In the above, $I$ denotes the indicator function that takes the value 1 if the condition is true and 0 otherwise. Let H_{0} denote the null hypothesis that the hospital has an acceptable mortality rate. In the following sections, we shall derive Bayes rules for different families of cost functions.

#### Identifying Bayes Rules

Let *R*(*x*, *d* = 0) denote the cost associated with deciding that the hospital has acceptable mortality, while *R*(*x*, *d* = 1) denotes the cost associated with deciding that the hospital has unacceptably high mortality. Then

$R(x, d = 0) = \int f_{2}(\theta)\, I(\theta > \beta_{thresh})\, \pi(\theta \mid x)\, d\theta$

and

$R(x, d = 1) = \int f_{1}(\theta)\, I(\theta < \beta_{thresh})\, \pi(\theta \mid x)\, d\theta$

In each case, the integral is over the space of the random effects. Then, to minimize cost, we reject H_{0} only if *R*(*x, d* = 1) <*R*(*x*, *d* = 0). A decision is said to be optimal if it minimizes the posterior expected loss under a specified loss function [23, 24].
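When draws from the posterior distribution of $\theta$ are available, the two integrals can be approximated by sample averages. The following Python sketch illustrates the comparison under that assumption; the function names, the draws, and the cost functions are hypothetical and serve only as an illustration.

```python
def posterior_expected_costs(theta_draws, beta_thresh, f1, f2):
    """Monte Carlo estimates of R(x, d=1) and R(x, d=0) from posterior draws of theta."""
    n = len(theta_draws)
    # d = 1 is wrong only when theta < beta_thresh (false positive), costing f1(theta).
    r_d1 = sum(f1(t) for t in theta_draws if t < beta_thresh) / n
    # d = 0 is wrong only when theta > beta_thresh (false negative), costing f2(theta).
    r_d0 = sum(f2(t) for t in theta_draws if t > beta_thresh) / n
    return r_d1, r_d0

def bayes_decision(theta_draws, beta_thresh, f1, f2):
    """Return d = 1 (reject H_0) only if R(x, d=1) < R(x, d=0)."""
    r_d1, r_d0 = posterior_expected_costs(theta_draws, beta_thresh, f1, f2)
    return 1 if r_d1 < r_d0 else 0

# Hypothetical posterior draws of one hospital's random intercept.
draws = [-1.9, -1.5, -1.4, -1.3]
unit_cost = lambda theta: 1.0
r_d1, r_d0 = posterior_expected_costs(draws, -1.6, unit_cost, unit_cost)
decision_equal_costs = bayes_decision(draws, -1.6, unit_cost, unit_cost)
# Making false positives four times as costly reverses the decision here.
decision_costly_fp = bayes_decision(draws, -1.6, lambda theta: 4.0, unit_cost)
```

With equal unit costs, three of the four draws lie above the threshold, so flagging the hospital has the smaller expected cost; raising the false-positive cost to 4 makes the expected cost of flagging exceed that of not flagging.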

#### Cost functions

We consider three different families of loss functions. The first is the family of generalized 1-0 loss functions. The second is the family based on absolute error loss functions, while the third is based on squared error loss. Within each family, we consider symmetric and asymmetric loss functions. Among the set of asymmetric loss functions, we will consider loss functions that penalize false positives more heavily than false negatives, and loss functions that penalize false negatives more heavily than false positives. Symmetric loss functions penalize false positives and false negatives equally.

#### Generalized 1-0 loss functions

In this section, we assume a generalized 1-0 loss function. Let H_{0} denote the null hypothesis that the hospital delivers acceptable quality care, c_{I} be the penalty associated with a type I error, and c_{II} be the penalty associated with a type II error. Then the loss function for generalized 1-0 loss is

$c_{I}\, d\, I(\theta < \beta_{thresh}) + c_{II}(1-d)\, I(\theta > \beta_{thresh})$

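Under this loss function, $R(x, d = 1) = c_{I}P(\theta < \beta_{thresh} \mid x)$ and $R(x, d = 0) = c_{II}P(\theta > \beta_{thresh} \mid x)$, so that we reject H_{0} only if

$P(\theta > \beta_{thresh} \mid x) > \frac{c_{I}}{c_{I} + c_{II}} = P_{0}$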

Here P_{0} is the optimum posterior tail probability. Thus, the use of a generalized 1-0 loss function results in a Bayes Rule based on posterior tail probabilities. For instance, if $c_{I} = c_{II} = 1$, then $P_{0} = 0.5$. Thus, the use of a 1-0 loss function results in a Bayes Rule under which a hospital is classified as having unacceptably high mortality if the posterior probability that the hospital-specific random intercept exceeds the threshold of acceptable care is greater than 0.5. The reader is referred to the original article for greater detail [13]. One should observe that this result is independent of the distribution of the random effects and of the prior distributions.

We re-express the generalized 1-0 loss function as follows for consistency with the subsequent sections:

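$d\, I(\theta < \beta_{thresh}) + k(1-d)\, I(\theta > \beta_{thresh})$

Here false positives incur a penalty of 1 and false negatives incur a penalty of $k$ (that is, $c_{I} = 1$ and $c_{II} = k$). The optimum posterior tail probability is then $\frac{1}{k+1}$, and we reject H_{0} only if

$E\left[I(\theta > \beta_{thresh}) - \frac{1}{k+1}\right] > 0$

where the expectation is taken over the posterior distribution of $\theta$. We have now expressed the Bayes Rule for a generalized 1-0 loss function as an expectation of a function of the model parameters and the cost penalty for false negatives.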

These Bayes rules can be evaluated using Markov Chain Monte Carlo (MCMC) methods. To do so, for each hospital, one evaluates the expression $I(\theta >{\beta}_{thresh})-\frac{1}{k+1}$ at each iteration of the MCMC simulation. One then determines the mean of this quantity over all the iterations of the MCMC simulation. If the mean of this quantity is larger than zero, the hospital is classified as a high-mortality outlier. Otherwise, the hospital is classified as having acceptable mortality, since the decision $d = 0$ corresponds to acceptable performance rather than to unusually low mortality.
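The evaluation just described can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study; the function name and the posterior draws are hypothetical.

```python
def one_zero_rule(theta_draws, beta_thresh, k):
    """Average I(theta > beta_thresh) - 1/(k+1) over the draws and classify the hospital."""
    per_draw = [(1.0 if t > beta_thresh else 0.0) - 1.0 / (k + 1.0) for t in theta_draws]
    mean_value = sum(per_draw) / len(per_draw)
    return mean_value, (1 if mean_value > 0 else 0)

# Hypothetical draws: 4 of the 10 exceed the threshold, so the tail probability is 0.4.
draws = [-1.9, -1.8, -1.8, -1.7, -1.7, -1.65, -1.5, -1.4, -1.3, -1.2]
beta_thresh = -1.6
decisions = {k: one_zero_rule(draws, beta_thresh, k)[1] for k in (0.5, 1, 2)}
```

In this example the posterior tail probability of 0.4 falls below the cut-off for k = 0.5 (2/3) and k = 1 (1/2) but above the cut-off for k = 2 (1/3), so the hospital is flagged only when false negatives are penalized more heavily.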

#### Generalized absolute error loss functions

Let H_{0} denote the null hypothesis that the hospital delivers acceptable quality care. Let $\beta_{thresh}$ denote the threshold that denotes acceptable quality of care, and let $\theta$ denote a given hospital's log-odds of death for an average patient. We define d = 1 to be the decision that a hospital has unacceptably high mortality, while d = 0 is the decision that a hospital has acceptable mortality. We then define the following loss function:

$|\theta - \beta_{thresh}|\, d\, I(\theta < \beta_{thresh}) + k|\theta - \beta_{thresh}|(1-d)\, I(\theta > \beta_{thresh})$

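False negatives are penalized with a penalty of $k|\theta - \beta_{thresh}|$, whereas false positives are penalized with a penalty of $|\theta - \beta_{thresh}|$. Thus, this loss function allows for the possibility of asymmetry. We describe it as a generalized absolute error loss function. Then we have that H_{0} is rejected only if

$R(x, d = 1) < R(x, d = 0)$ ⇔

$E[|\theta - \beta_{thresh}|\, I(\theta < \beta_{thresh})] < kE[|\theta - \beta_{thresh}|\, I(\theta > \beta_{thresh})]$ ⇔

$E[(\theta - \beta_{thresh})\, I(\theta < \beta_{thresh})] + kE[(\theta - \beta_{thresh})\, I(\theta > \beta_{thresh})] > 0$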

The final inequality above is the Bayes Rule for a generalized absolute error loss function. As with the generalized 1-0 loss function, the Bayes Rule has been expressed as an expectation of the model parameters, the threshold for acceptable mortality, and the loss parameter k.

In the special case in which false positives and false negatives are penalized equally, which corresponds to allowing k to be 1 in the above formulas, the Bayes Rule reduces to:

$E[(\theta - \beta_{thresh})\, I(\theta < \beta_{thresh})] + E[(\theta - \beta_{thresh})\, I(\theta > \beta_{thresh})] > 0$ ⇔

$E[(\theta - \beta_{thresh})(I(\theta < \beta_{thresh}) + I(\theta > \beta_{thresh}))] > 0$ ⇔

$E[\theta - \beta_{thresh}] > 0$

Therefore, in the case of symmetric linear loss, the resultant Bayes rule is that a hospital is classified as a high-mortality outlier if and only if the posterior mean of the hospital random intercept parameter is greater than the chosen threshold.

As with generalized 1-0 loss, these Bayes rules can be evaluated using Markov Chain Monte Carlo (MCMC) methods. To do so, for each hospital, one evaluates the expression $(\theta - \beta_{thresh})\, I(\theta < \beta_{thresh}) + k(\theta - \beta_{thresh})\, I(\theta > \beta_{thresh})$ at each iteration of the MCMC simulation. One then determines the mean of this quantity over all the iterations of the MCMC simulation. If the mean of this quantity is larger than zero, the hospital is classified as a high-mortality outlier. Otherwise, the hospital is classified as having acceptable mortality.
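A minimal Python sketch of this evaluation follows. The function name and the draws are hypothetical; the per-draw quantity weights deviations above the threshold by k and leaves deviations below the threshold unweighted.

```python
def absolute_error_rule(theta_draws, beta_thresh, k):
    """Average the signed, k-weighted deviation from the threshold and classify the hospital."""
    per_draw = []
    for t in theta_draws:
        diff = t - beta_thresh
        # Draws above the threshold are weighted by k; draws below have weight 1.
        per_draw.append(k * diff if diff > 0 else diff)
    mean_value = sum(per_draw) / len(per_draw)
    return mean_value, (1 if mean_value > 0 else 0)

# Hypothetical posterior draws of one hospital's random intercept.
draws = [-1.9, -1.8, -1.8, -1.7, -1.7, -1.65, -1.5, -1.4, -1.3, -1.2]
beta_thresh = -1.6
decisions = {k: absolute_error_rule(draws, beta_thresh, k)[1] for k in (0.5, 1, 2)}
symmetric_value = absolute_error_rule(draws, beta_thresh, 1)[0]
posterior_mean_minus_thresh = sum(draws) / len(draws) - beta_thresh
```

With k = 1 the averaged quantity equals the posterior mean minus the threshold, which is the symmetric special case derived above. In this example the hospital is flagged for k = 1 and k = 2 but not for k = 0.5.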

#### Generalized squared error loss functions

Let $\beta_{thresh}$ and $\theta$ be as above. We define the following loss function:

$(\theta - \beta_{thresh})^{2}\, d\, I(\theta < \beta_{thresh}) + k(\theta - \beta_{thresh})^{2}(1-d)\, I(\theta > \beta_{thresh})$

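False negatives are penalized with a penalty of $k(\theta - \beta_{thresh})^{2}$, whereas false positives are penalized with a penalty of $(\theta - \beta_{thresh})^{2}$. Thus, this loss function allows for the possibility of asymmetry. Then H_{0} is rejected only if

$R(x, d = 1) < R(x, d = 0)$ ⇔

$E[(\theta - \beta_{thresh})^{2}\, I(\theta < \beta_{thresh})] < kE[(\theta - \beta_{thresh})^{2}\, I(\theta > \beta_{thresh})]$ ⇔

$kE[(\theta - \beta_{thresh})^{2}\, I(\theta > \beta_{thresh})] - E[(\theta - \beta_{thresh})^{2}\, I(\theta < \beta_{thresh})] > 0$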

The final inequality above is the Bayes Rule for a generalized squared error loss function. In the special case in which false positives and false negatives are penalized equally, which corresponds to allowing k to be 1 in the above formulas, the Bayes Rule reduces to:

$E[(\theta - \beta_{thresh})^{2}(I(\theta > \beta_{thresh}) - I(\theta < \beta_{thresh}))] > 0$

As with generalized 1-0 loss and generalized absolute error loss, this decision rule can be computed using Markov Chain Monte Carlo (MCMC) methods.
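As an illustration, the squared error rule can be evaluated from posterior draws as in the Python sketch below. The function name and draws are hypothetical, and the per-draw quantity is inferred from the loss function defined above: squared deviations above the threshold enter with weight k, and squared deviations below it enter with a negative sign.

```python
def squared_error_rule(theta_draws, beta_thresh, k):
    """Average k*(theta - thresh)^2 above the threshold minus (theta - thresh)^2 below it."""
    per_draw = []
    for t in theta_draws:
        diff = t - beta_thresh
        per_draw.append(k * diff ** 2 if diff > 0 else -(diff ** 2))
    mean_value = sum(per_draw) / len(per_draw)
    return mean_value, (1 if mean_value > 0 else 0)

# Hypothetical posterior draws of one hospital's random intercept.
draws = [-1.9, -1.8, -1.8, -1.7, -1.7, -1.65, -1.5, -1.4, -1.3, -1.2]
beta_thresh = -1.6
decisions = {k: squared_error_rule(draws, beta_thresh, k)[1] for k in (0.5, 1, 2)}
```

Squaring gives more weight to draws far from the threshold than the absolute error rule does, so a long right tail in the posterior distribution can tip the decision towards flagging the hospital.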

### Case study

In this section we apply the Bayes rules derived above to a specific dataset to examine the impact of assuming different loss functions on the identification of hospitals with unacceptably high mortality.

#### Data sources

We used data on all 19,757 patients discharged from hospital with a most responsible diagnosis of acute myocardial infarction (AMI) between April 1, 2000 and March 31, 2001 from the 163 acute care hospitals in Ontario, Canada that treated at least 1 AMI patient during the 12 month period. Creation of this dataset is described in detail elsewhere [4, 25].

Adjustments for differences in case-mix were done using the Ontario AMI mortality prediction rule for 30-day mortality, whose derivation and validation are described elsewhere [26]. The variables comprising the prediction rule consisted of age, gender, cardiac severity (e.g., congestive heart failure, cardiogenic shock, arrhythmia, and pulmonary edema), and comorbid status (e.g., diabetes mellitus with complications, stroke, acute and chronic renal disease, and malignancy), as derived from the ICD-9 codes present in the 15 secondary diagnostic fields of the hospitalization database.

An illness severity score was derived as the predicted probability of 30-day mortality using age, gender and the 9 risk factors and comorbidities comprising the Ontario AMI mortality prediction rule. This method of constructing an illness severity score has been used elsewhere [27, 28]. The illness severity scores were then standardized so as to have mean 0 and variance 1. Thus, a patient with average disease severity had a disease severity score of 0.

#### Cost functions

In this case study we consider nine different loss functions: three generalized 1-0 loss functions, three generalized absolute error loss functions, and three generalized squared error loss functions. We used values of k of 1, 2, and 0.5. Thus, with k = 1, false positives and false negatives are equally penalized; with k = 2, false negatives are penalized twice as heavily as false positives; and with k = 0.5, false positives are penalized twice as heavily as false negatives.

#### Acceptable mortality

In the case study, the threshold $\beta_{thresh}$ was chosen so that a hospital is defined to have higher than acceptable mortality if the odds of death are 50% higher at this hospital than at an average hospital, which is consistent with analyses described by Normand et al. [16]. Therefore, $\beta_{thresh} = \beta_{0} + \log(1.5)$.
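Since log(1.5) ≈ 0.405, the threshold lies about 0.405 above the mean intercept on the log-odds scale. The Python sketch below assumes that the threshold is recomputed at each posterior draw of β_{0}; that reading, the function names, and the draws are assumptions made for illustration.

```python
import math

def threshold_per_draw(beta0_draws, odds_ratio=1.5):
    """beta_thresh = beta_0 + log(odds_ratio), evaluated at each posterior draw of beta_0."""
    shift = math.log(odds_ratio)
    return [b0 + shift for b0 in beta0_draws]

def inv_logit(value):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-value))

beta0_draws = [-2.10, -2.06, -2.02]   # hypothetical posterior draws of beta_0
thresholds = threshold_per_draw(beta0_draws)

# Check on the probability scale: the odds at the threshold are 1.5 times the average odds.
p_avg = inv_logit(beta0_draws[1])
p_thresh = inv_logit(thresholds[1])
implied_odds_ratio = (p_thresh / (1.0 - p_thresh)) / (p_avg / (1.0 - p_avg))
```

Working on the log-odds scale means the same additive shift applies to every draw, while the corresponding difference in mortality probability depends on the value of β_{0}.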

#### Model estimation

The following prior specifications were used:

- the mean vector $\left(\begin{array}{l}{\mu}_{0}\\ {\mu}_{1}\end{array}\right)=\left(\begin{array}{r}-2.06\\ 0.91\end{array}\right)$,
- the variance-covariance matrix $\left(\begin{array}{cc}{\tau}_{11}& {\tau}_{12}\\ {\tau}_{12}& {\tau}_{22}\end{array}\right)=\left(\begin{array}{cc}10000& 0\\ 0& 10000\end{array}\right)$,
- ${\left(\begin{array}{cc}{\sigma}_{1}^{2}& {\sigma}_{12}\\ {\sigma}_{12}& {\sigma}_{2}^{2}\end{array}\right)}^{-1}\sim Wishart\left(\left(\begin{array}{cc}2\times 0.08139& 2\times 0.02342\\ 2\times 0.02342& 2\times 0.02243\end{array}\right),2\right)$.

The mean vector $\left(\begin{array}{l}{\mu}_{0}\\ {\mu}_{1}\end{array}\right)=\left(\begin{array}{r}-2.06\\ 0.91\end{array}\right)$ was chosen based on fitting the random effects model to OMID data from 1999 (the year prior to the data used in this case study).

Twelve parallel MCMC chains were run starting from different initial values drawn from an over-dispersed distribution. Each of the 12 MCMC chains was run for an initial 5,000 burn-in iterations. Each Gibbs sampler was then monitored for an additional 10,000 iterations using a thinning interval of 10, resulting in 1,000 iterations for analysis from each sampler. Convergence of the Gibbs sampler was assessed using the Gelman-Rubin criterion for parallel runs from a Gibbs sampler [31]. The Gelman and Rubin shrink factors for the median and 97.5^{th} percentiles were all no larger than 1.01, indicating that the parallel chains had converged. All analyses were then conducted using the last 500 iterations from each of the 12 parallel chains. The 6000 sampled values (500 iterations/chain × 12 chains) from the posterior distribution were used for the subsequent analyses.
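The bookkeeping in this paragraph (12 chains, 10,000 monitored iterations thinned by 10, the last 500 retained draws per chain pooled into 6000 values) can be sketched in Python as follows. The convergence check shown is the basic Gelman-Rubin potential scale reduction factor for a scalar parameter, offered as an assumption about the general idea rather than the exact shrink-factor computation used in the study; the simulated chains are placeholders.

```python
import random
import statistics

def pool_last_draws(chains, n_keep):
    """Keep the last n_keep retained draws of each chain and pool them into one list."""
    pooled = []
    for chain in chains:
        pooled.extend(chain[-n_keep:])
    return pooled

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of a scalar parameter."""
    n = len(chains[0])
    chain_means = [statistics.mean(chain) for chain in chains]
    within = statistics.mean(statistics.variance(chain) for chain in chains)
    between = n * statistics.variance(chain_means)
    pooled_variance = (n - 1) / n * within + between / n
    return (pooled_variance / within) ** 0.5

n_chains, n_monitored, thin = 12, 10_000, 10
n_retained = n_monitored // thin   # 1,000 retained draws per chain

# Placeholder chains: independent standard normal draws stand in for MCMC output.
rng = random.Random(1)
chains = [[rng.gauss(0.0, 1.0) for _ in range(n_retained)] for _ in range(n_chains)]
r_hat = potential_scale_reduction(chains)
pooled = pool_last_draws(chains, n_keep=500)

# Chains stuck at different locations give a value well above 1.
shifted = [[x + (5.0 if c % 2 else 0.0) for x in chain] for c, chain in enumerate(chains)]
r_hat_shifted = potential_scale_reduction(shifted)
```

Values of the statistic close to 1 indicate that the between-chain variation is small relative to the within-chain variation, which is the sense in which the parallel chains are said to have converged.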

The MCMC estimate of each Bayes Rule for each of the 163 hospitals was determined using the 6000 samples from the posterior distribution of the model parameters. We then determined which hospitals were classified as having unacceptably high mortality using each of the different Bayes rules. This was done by evaluating the appropriate Bayes Rule at each iteration of the Gibbs sampler using the current sample from the posterior distribution. The posterior expectation was computed using the mean of these sampled values from the posterior distribution. If the posterior mean was greater than zero, then the hospital was classified as having unacceptably high mortality.

## Results

The number of hospitals classified as having unacceptably high mortality ranged from a low of 1 to a high of 8, depending on which loss function was used.

When false positives and false negatives were equally penalized (k = 1 in the associated loss function), then three hospitals (hospitals A, B, and C) were classified as having unacceptably high mortality, regardless of whether 1-0 loss, absolute error loss, or squared error loss was employed.

When false positives incurred twice the penalty that false negatives incurred (k = 0.5 in the loss functions), then only one hospital (hospital A) was classified as having unacceptably high mortality when 1-0 loss and absolute error loss were employed. However, when squared error loss was employed, two hospitals were classified as having unacceptably high mortality (hospitals A and B).

When false negatives incurred twice the penalty that false positives incurred (k = 2 in the loss functions), then the number of hospitals classified as having unacceptably high mortality was either 4, 5, or 8, depending on the loss function employed. When generalized 1-0 loss was used, then eight hospitals (hospitals A, B, C, D, E, F, G, and H) were classified as having higher than acceptable mortality. When absolute error loss was used, then 5 hospitals (hospitals A, B, C, D, and E) were classified as having higher than acceptable mortality. Finally, when squared error loss was used, then four hospitals (hospitals A, B, C, and E) were classified as having unacceptably high mortality.

One hospital (hospital A) was classified as having unacceptably high mortality in all 9 analyses. In examining these results, one could conclude that hospital A would be classified as having unacceptably high mortality by all participants in the health care system.

For a given family of loss functions (generalized 1-0 loss, generalized absolute error loss, or generalized squared error loss), more hospitals were classified as having unacceptably high mortality when false negatives were penalized more heavily than false positives (k = 2) than when either false positives were penalized more heavily than false negatives (k = 0.5) or false negatives and false positives were equally penalized (k = 1). Similarly, penalizing false negatives and false positives equally (k = 1) resulted in the identification of at least one additional high-mortality hospital compared to when false positives were penalized more heavily than false negatives (k = 0.5), regardless of the family of loss functions.

[…] ($\beta_{0} + \log(1.5)$), along with dashed vertical lines depicting the end points of the 95% credible interval for this threshold. In Figure 2, we display the posterior distribution of the random intercepts for the eight hospitals that were classified as high-mortality outliers using at least one loss function. One notes that of the eight hospitals that were classified as high-mortality outliers at least once, those hospitals that were identified as such more frequently tend to have posterior distributions that are shifted further to the right compared to hospitals that were identified less frequently.

For comparative purposes, model-based indirect standardization was used to identify hospitals that had higher than expected mortality [32]. This method has been used in several cardiovascular hospital report cards [2, 4, 7, 9]. The logistic regression model adjusted for the 11 demographic and clinical variables contained in the Ontario AMI mortality prediction model that was described above. For each hospital, the ratio of observed to expected mortality (O/E ratio) was determined. Those hospitals whose O/E ratio was significantly higher than 1 were classified as having higher than expected mortality. Using this procedure, 15 out of the 163 hospitals were classified as having higher than expected mortality (hospitals A to H, in addition to seven other hospitals). Model-based indirect standardization identified substantially more hospitals as being performance outliers than did the Bayesian hierarchical method using any of the different cost functions.
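The O/E ratio for a single hospital can be sketched in Python as below, assuming that expected mortality is the sum of the patients' model-predicted probabilities of death. The function name and the data are hypothetical, and the significance test is omitted because the text does not specify how it was carried out.

```python
def observed_expected_ratio(deaths, predicted_probs):
    """Ratio of observed deaths to the sum of model-predicted death probabilities."""
    observed = sum(deaths)
    expected = sum(predicted_probs)
    return observed / expected

# Hypothetical hospital with five patients: death indicators and predicted probabilities.
deaths = [1, 0, 0, 1, 0]
predicted_probs = [0.2, 0.1, 0.1, 0.3, 0.1]
oe_ratio = observed_expected_ratio(deaths, predicted_probs)
```

A ratio above 1 means that more deaths were observed than the case-mix model predicted; whether the excess is statistically significant depends on the hospital's caseload.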

## Discussion

There is a growing interest in the publication of health care report cards in which the outcomes of medical or surgical care are compared across hospitals or physicians. Several commentators have advocated for the use of Bayesian hierarchical models in provider profiling. In the current study we developed Bayes rules for identifying hospitals with higher than acceptable mortality when Bayesian hierarchical regression models are employed. We adapted three of the most commonly used loss functions (generalized 1-0 loss functions; absolute error loss; squared error loss) to the setting of provider profiling. Each of the Bayes rules that we derived was based on the posterior expectation of a function of the model parameters, the threshold for acceptable mortality, and the cost function being larger than zero. The posterior expectation of each of these functions can be calculated using MCMC methods. We illustrated our findings on a sample of 19,757 patients hospitalized with an AMI at 163 hospitals in Ontario. We found that the number of hospitals identified as having unacceptably high mortality was affected by the relative penalty assigned to false negatives compared to false positives. The choice of family of loss function tended to have less of an impact on which hospitals were classified as having higher than acceptable mortality compared to the choice of the relative penalty for false negatives compared to false positives.

The Bayes Rules for estimating parameters for generalized 1-0 loss functions, for squared error loss, and for absolute error loss are well known [24]. The latter two Bayes Rules are the mean and median of the sampled values, while the Bayes Rule for the first is the ratio of one cost of misclassification to the sum of the two costs of misclassification. While Bayes Rules are well known for estimating simple parameters, their use has not been explored in the context of using hierarchical models to identify hospitals with higher than acceptable mortality rates. In this setting, one is not estimating a parameter, but whether a parameter (the hospital-specific random effect) exceeds a specified threshold. Furthermore, the Bayes Rules that we derived for linear and quadratic loss functions are different than those for estimating simple parameters. This illustrates that prior results on Bayes Rules for simple parameters are not applicable in the setting of provider profiling.

In hospital profiling it is important to be able to correctly identify those hospitals that have unacceptably high mortality rates. For reasons of patient safety, it is important that hospitals with higher than acceptable mortality rates be correctly identified, so that the reasons for their poor performance can be investigated. However, it is also important that those hospitals that truly have acceptable mortality rates not be incorrectly identified as having higher than acceptable mortality rates. Hospitals that are incorrectly identified as having unacceptably high mortality rates face undue public criticism and damage to their reputation. Furthermore, resources are wasted in seeking to determine the reasons for the hospital's poor performance. Additionally, closing a hospital that is incorrectly identified as providing poor quality care can result in patients being denied local treatment at a hospital that truly provides acceptable quality of care. In hospital profiling there is a delicate balance that must be maintained between correctly identifying those hospitals that truly are performance outliers and not falsely labeling as outliers those hospitals that are not truly performance outliers. Different participants in health care are likely to have different perspectives on this trade-off. We have demonstrated that the choice of loss function for quantifying the cost incurred due to misclassifying hospitals can have an impact upon which hospitals are identified as having higher than acceptable mortality. There is a need for all participants in the health care system (physicians, patients, administrators, and funders) to debate the trade-offs that have been explicitly incorporated into the analyses described in this study. Once this has been done, report cards can be explicitly produced using methodology that minimizes the cost arising from misclassification.

There is a paucity of research into developing decision-theoretic frameworks for hospital report cards. Working from a frequentist perspective and a simple model for hospital performance, Austin and Anderson derived optimal p-values required for classifying hospitals with higher than expected mortality so as to minimize the expected loss due to incorrect classification [33]. Austin and Brunner used Monte Carlo simulations to determine the accuracy of posterior tail probabilities, as measured using sensitivity and specificity, for identifying hospitals with higher than acceptable mortality [13]. In doing so, they demonstrated that the Bayes rules for generalized 1-0 loss functions were based on posterior tail probabilities. However, beyond this initial result, Bayes rules had not been developed for other, more realistic loss functions. In our case study, we found that the choice between generalized 1-0 loss, absolute error loss, and squared error loss had minimal impact upon which hospitals were identified as having unacceptably high mortality.

In conclusion, we have explicitly developed Bayes rules for several families of loss functions when Bayesian hierarchical regression models are used to identify hospitals with unacceptably high mortality. We also examined the impact of assuming different loss functions on the number of hospitals that are identified as having unacceptably high mortality.

## Conclusion

The design of hospital report cards can be placed in a decision-theoretic framework. This allows researchers to minimize costs arising from the misclassification of hospitals. The choice of loss function can affect the classification of a small number of hospitals. We found that the number of hospitals classified as having higher than acceptable mortality is affected by the relative penalty assigned to false negatives compared to false positives. However, the choice of loss function family had a lesser impact upon which hospitals were identified as having higher than acceptable mortality.

## Declarations

### Acknowledgements

The Institute for Clinical Evaluative Sciences (ICES) is supported in part by a grant from the Ontario Ministry of Health and Long-Term Care. The opinions, results and conclusions are those of the author, and no endorsement by the Ministry of Health and Long-Term Care or by the Institute for Clinical Evaluative Sciences is intended or should be inferred. Dr. Austin is supported in part by a New Investigator award from the Canadian Institutes of Health Research (Institute for Health Services and Policy Research). This research was supported in part by an operating grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.

## References

1. Luft HS, Romano PS, Remy LL, Rainwater J: Annual report of the California Hospital Outcomes Project. 1993, Sacramento, CA: California Office of Statewide Health Planning and Development.
2. Pennsylvania Health Care Cost Containment Council: Focus on heart attack in Pennsylvania. Research methods and results. 1996, Harrisburg, PA: Pennsylvania Health Care Cost Containment Council.
3. Scottish Office: Clinical outcome indicators, 1994. 1995, Edinburgh: Clinical Resource and Audit Group.
4. Tu JV, Austin P, Naylor CD, Iron K, Zhang H: Acute myocardial infarction outcomes in Ontario. Cardiovascular health services in Ontario: an ICES atlas. Edited by: Naylor CD, Slaughter PM. 1999, Toronto, Ontario: Institute for Clinical Evaluative Sciences, 83-110.
5. Jacobs FM: Cardiac Surgery in New Jersey in 2002: A Consumer Report. 2005, Trenton, NJ: New Jersey Department of Health and Senior Services.
6. New York State Department of Health: Coronary artery bypass graft surgery in New York State 1989–1991. 1992, Albany, NY: New York State Department of Health.
7. Pennsylvania Health Care Cost Containment Council: A Consumer Guide to Coronary Artery Bypass Graft Surgery. 1995, Harrisburg, PA: Pennsylvania Health Care Cost Containment Council, 4.
8. Massachusetts Data Analysis Center: Adult Coronary Artery Bypass Graft Surgery in the Commonwealth of Massachusetts, Fiscal Year 2006 Report (October 1, 2005 – September 30, 2006). Hospital and Surgeon Standardized 30-day Mortality rates. 2008, Boston, MA: Department of Health Care Policy, Harvard Medical School.
9. Naylor CD, Rothwell DM, Tu JV, Austin PC, the Cardiac Care Network Steering Committee: Outcomes of Coronary Artery Bypass Graft Surgery in Ontario. Cardiovascular health services in Ontario: an ICES atlas. Edited by: Naylor CD, Slaughter PM. 1999, Toronto, Ontario: Institute for Clinical Evaluative Sciences, 189-197.
10. McGlynn EA: Introduction and overview of the conceptual framework for a national quality measurement and reporting system. Medical Care. 2003, 41: 1-I. 10.1097/00005650-200301000-00001.
11. Normand SLT, Shahian DM: Statistical and clinical aspects of hospital outcomes profiling. Statistical Science. 2007, 22: 206-226. 10.1214/088342307000000096.
12. Austin PC, Alter DA, Tu JV: The accuracy of fixed and random effects models in calculating risk-adjusted mortality rates: A Monte Carlo assessment. Medical Decision Making. 2003, 23: 526-539. 10.1177/0272989X03258443.
13. Austin PC, Brunner LJ: Optimal Bayesian probability levels for hospital report cards. Health Services and Outcomes Research Methodology. 10.1007/s10742-007-0025-4.
14. Thomas JW, Hofer TP: Accuracy of risk-adjusted mortality rates as a measure of hospital quality of care. Medical Care. 1999, 37: 83-92. 10.1097/00005650-199901000-00012.
15. Burgess JF, Christiansen CL, Michalak SE, Morris CN: Medical profiling: improving standards and risk adjustments using hierarchical models. Journal of Health Economics. 2000, 19: 291-309. 10.1016/S0167-6296(99)00034-X.
16. Normand SL, Glickman ME, Gatsonis CA: Statistical methods for profiling providers of medical care: issues and applications. Journal of the American Statistical Association. 1997, 92: 803-814. 10.2307/2965545.
17. Christiansen CL, Morris CN: Improving the statistical approach to health care provider profiling. Annals of Internal Medicine. 1997, 127: 764-768.
18. Spiegelhalter DJ, Aylin P, Best NG, Evans SJW, Murray GD: Commissioned analysis of surgical performance using routine data: lessons from the Bristol inquiry. J R Statist Soc A. 2000, 165: 191-231. 10.1111/1467-985X.02021.
19. Thomas N, Longford NT, Rolph JE: Empirical Bayes methods for estimating hospital-specific mortality rates. Statistics in Medicine. 1994, 13: 889-903. 10.1002/sim.4780130902.
20. Austin PC: The reliability and validity of Bayesian methods for hospital profiling: A Monte Carlo assessment. Journal of Statistical Planning and Inference. 2005, 128: 109-122. 10.1016/j.jspi.2003.10.006.
21. Austin PC: A comparison of Bayesian methods for profiling hospital performance. Medical Decision Making. 2002, 22: 163-172. 10.1177/02729890222063044.
22. Gajewski BJ, Petroski G, Thompson S, Dunton N, Wrona M, Becker A, Coffland V: Letter to the editor: the effect of provider-level ascertainment bias on profiling nursing homes by Roy J, Mor V. Statistics in Medicine. 2006, 25: 1976-1977. 10.1002/sim.2480.
23. DeGroot MH: Optimal Statistical Decisions. 1970, New York, NY: McGraw-Hill.
24. Berger JO: Statistical Decision Theory and Bayesian Analysis. 1980, New York, NY: Springer-Verlag.
25. Tu JV, Naylor CD, Austin P: Temporal changes in the outcomes of acute myocardial infarction in Ontario, 1992–96. Canadian Medical Association Journal. 1999, 161: 1257-1261.
26. Tu JV, Austin PC, Walld R, Roos L, Agras J, McDonald KM: Development and validation of the Ontario acute myocardial infarction mortality prediction rules. Journal of the American College of Cardiology. 2001, 37: 992-7. 10.1016/S0735-1097(01)01109-3.
27. Tu JV, Austin PC, Chan BTB: Relationship between annual volume of patients treated by admitting physician and mortality after acute myocardial infarction. Journal of the American Medical Association. 2001, 285: 3116-3122. 10.1001/jama.285.24.3116.
28. Alter DA, Naylor CD, Austin PC, Tu JV: Long-term MI outcomes at hospitals with or without on-site revascularization. Journal of the American Medical Association. 2001, 285: 2101-8. 10.1001/jama.285.16.2101.
29. Gilks WR, Richardson S, Spiegelhalter DJ: Introducing Markov chain Monte Carlo. Markov chain Monte Carlo in practice. Edited by: Gilks WR, Richardson S, Spiegelhalter DJ. 1996, London, UK: Chapman & Hall, 1-19.
30. Gilks WR, Thomas A, Spiegelhalter DJ: A language and program for complex Bayesian modelling. The Statistician. 1994, 43: 169-78. 10.2307/2348941.
31. Gelman A, Rubin DB: Inference from iterative simulation using multiple sequences. Statistical Science. 1992, 7: 457-472. 10.1214/ss/1177011136.
32. Iezzoni LI, Ed: Risk adjustment for measuring healthcare outcomes. 1997, Chicago, IL: Health Administration Press, 2nd edition.
33. Austin PC, Anderson GM: Optimal statistical decisions for hospital report cards. Medical Decision Making. 2005, 25: 11-19. 10.1177/0272989X04273142.

### Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/8/30/prepub

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.