Data acquisition
Patient data
We have previously described the development and assessment of the ARDS cohort in this study, which included 362 patients who met the Berlin Definition of ARDS at four hospitals in the Chicago region in 2013 [20]. Patient data were obtained from the electronic health records serving the participating hospitals. For this study, we use height and sex to calculate predicted body weight (a sex-neutral surrogate for height) as well as all tidal volumes (mL/kg) and PaO2/FIO2 ratios (mmHg) available during the patient’s disease course. These data were collected as part of an observational study focused on understanding ARDS recognition and management, and implementation of LTVV. The methods and insights gleaned from this work will be incorporated into a larger multicenter study of ARDS recognition and management, and barriers to implementation of LTVV (NIH R01 HL140362).
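As an illustration of the derived patient variables, the following minimal sketch computes predicted body weight, standardized tidal volume, and the PaO2/FIO2 ratio. It assumes the standard ARDSNet predicted body weight formulas; the text above does not state the exact formula used, and the function names are hypothetical.

```python
def predicted_body_weight_kg(height_cm: float, sex: str) -> float:
    """Predicted body weight (kg) from height and sex.

    Assumption: standard ARDSNet formula, 50 kg (male) or 45.5 kg (female)
    plus 0.91 kg per cm of height above 152.4 cm.
    """
    base = {"male": 50.0, "female": 45.5}[sex]
    return base + 0.91 * (height_cm - 152.4)


def standardized_tidal_volume(tidal_volume_ml: float, height_cm: float, sex: str) -> float:
    """Tidal volume standardized to predicted body weight (mL/kg PBW)."""
    return tidal_volume_ml / predicted_body_weight_kg(height_cm, sex)


def pf_ratio(pao2_mmhg: float, fio2_fraction: float) -> float:
    """PaO2/FIO2 ratio (mmHg), with FIO2 expressed as a fraction (0.21 to 1.0)."""
    return pao2_mmhg / fio2_fraction
```

For example, `standardized_tidal_volume(500, 152.4, "male")` divides 500 mL by a predicted body weight of 50 kg.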
Physician data
We have previously described the survey used in this study, which included the critical care physicians who were identified as caring for the patients in the ARDS cohort described above [26]. The survey included questions on physician attitudes towards LTVV and innovation in general, perceived barriers and facilitators to LTVV use, and professional and social connections with other ICU physicians. Physicians who met cohort inclusion criteria but were missing data points were excluded only from the analyses that require those missing data points. Data availability is reported in Additional File, Supp Table 1.
Calculation of ARDS recognition
Our ARDS recognition metric compares an individual physician’s observed ARDS recognition to that physician’s expected ARDS recognition given their specific patient census. The calculation of each physician’s recognition metric includes only the data generated during that physician’s specific pairing with their patients. Due to different data collection procedures at different clinical sites, we were only able to calculate the ARDS recognition metric for the largest site in our previous study [20].
Observed recognition
For each patient in a physician’s census, we assign a label of “recognized” or “not recognized.” This label is inferred from the standardized tidal volume selected by the clinician to be delivered to the patient. This inference is based on our previously developed model of physician recognition of ARDS. In that work, we quantified the impact of patient characteristics on physician recognition of ARDS and subsequent LTVV delivery by comparing physician behavior with ARDS patients to physician behavior with a novel hypoxemic ‘control’ cohort [27]. We found that the largest confounding characteristic in both the ARDS and control cohorts was patient height (reported as the sex-neutral ‘predicted body weight’). We developed a model that accounts for this by dividing the predicted body weight (PBW) vs standardized tidal volume (mL/kg PBW) space into “recognized” and “not recognized” regions (Fig. 2A). Patients in the “recognized” region experience physician behavior more similar to that exhibited with ARDS patients than to that seen with the control patients. In this work, we map each physician’s patient census to this space and infer their observed individual recognition, Nobs, as the number of their patients falling within the “recognized” region.
Expected recognition
From a patient outcomes perspective, a physician’s expected recognition would be 100% of the ARDS patients they see. However, the goal of our recognition metric is to measure a physician’s progress relative to other physicians, to their own performance over time, and/or to an institution’s past average; it is not intended to replace overall missed diagnosis rates. Thus, we use the current group average of all physicians within the same institution as a physician’s expected recognition performance. The idea is that by identifying both high and low performers, the institution will be better able to learn which physicians should be targeted in order to improve overall institutional performance.
In order to accomplish this, we must account for the severity of ARDS (measured by the hypoxemia categories set forth in the Berlin definition) because we and others have previously demonstrated that the severity of ARDS has an impact on a physician’s ability to recognize ARDS, with sicker patients being easier to recognize [7, 22]. To establish a baseline expected recognition rate for each physician that accounts for this influence, we used the following equation:
$${N}_{exp}\left(\left\{{h}_1,{h}_2,\dots ,{h}_{N_j}\right\}\right)=\mathrm{floor}\left[{\sum}_{i=1}^{N_j}R\left({h}_i\right)\right]$$
(1)
where:
Nexp: expected number of patients to be recognized.
hi: hypoxemia severity category (mild, moderate, or severe) [5] of patient i.
R(hi): institutional level recognition rates of mild, moderate, or severe patients [27] (Fig. 2B).
Nj: number of patients cared for by physician j.
The recognition rates, R(hi), in Eq. 1 are for the whole ARDS cohort by hypoxemia severity, which we estimated in prior work via mixture model as 22% for mild hypoxemia (PaO2/FIO2 in range 200-300 mmHg), 34% for moderate hypoxemia (PaO2/FIO2 in range 100-200 mmHg), and 67% for severe (PaO2/FIO2 < 100 mmHg) [27]. Expected recognition is rounded down to the nearest whole patient to account for the binary nature of ARDS diagnosis.
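Eq. 1 can be sketched in a few lines. The recognition rates are those quoted above; the function names, the handling of PaO2/FIO2 values falling exactly on a category boundary, and the example census are assumptions made for illustration only.

```python
import math

# Institutional recognition rates by hypoxemia category, as quoted in the text.
RECOGNITION_RATE = {"mild": 0.22, "moderate": 0.34, "severe": 0.67}


def hypoxemia_category(pf_ratio_mmhg: float) -> str:
    """Map a PaO2/FIO2 ratio (mmHg) to a hypoxemia severity category.

    Assumption: values exactly at 100 or 200 go to the milder category.
    """
    if pf_ratio_mmhg > 300:
        raise ValueError("PaO2/FIO2 > 300 mmHg is outside the ARDS hypoxemia range")
    if pf_ratio_mmhg < 100:
        return "severe"
    if pf_ratio_mmhg < 200:
        return "moderate"
    return "mild"


def expected_recognized(census: list[str]) -> int:
    """Eq. 1: floor of the summed recognition rates over a physician's census."""
    return math.floor(sum(RECOGNITION_RATE[h] for h in census))


# Hypothetical census: 3 mild, 4 moderate, and 3 severe patients.
example_census = ["mild"] * 3 + ["moderate"] * 4 + ["severe"] * 3
n_exp = expected_recognized(example_census)  # floor(0.66 + 1.36 + 2.01) = floor(4.03)
```

In this hypothetical census the summed rates are 4.03, so the expected number of recognized patients rounds down to 4.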
ARDS recognition metric
Our ARDS recognition metric R (Fig. 2C) compares the cumulative probabilities of the observed and expected recognition scenarios:
$$R=P\left(\le {N}_{obs}\right)-P\left(\le {N}_{exp}\right).$$
(2)
To calculate R, consider a physician with a patient census {h1, …, hNj}, where each patient i in the census has an institutional level recognition rate, R(hi), appropriate for the patient’s hypoxemia severity (Fig. 2B). The expected number of recognized patients for that physician’s census is
$$N_{\exp}=N_{j}\;\left(f_{\mathrm{mild}}\;\mathrm R\left(h_{\mathrm{mild}}\right)+f_{\mathrm{interm}}\;\mathrm R\left(h_{\mathrm{interm}}\right)+f_{\mathrm{severe}}\;\mathrm R\left(h_{\mathrm{severe}}\right)\right),$$
(3)
where the three subscripts refer to mild, intermediate (i.e., moderate), and severe hypoxemia, and f is the fraction of patients in the census with a given hypoxemia severity. Since the number of patients in a physician’s census may not be large and because Nexp is not necessarily an integer, calculating P(≤Nobs) or P(≤Nexp) is more easily done by simulation than by enumeration. Thus, we generated 1000 sequences of recognized/not-recognized outcomes for the physician’s patient census according to the recognition probability of each patient’s hypoxemia severity (see Additional File, Supplemental Methods, Probability Density Function Distribution Generation). This process enabled us to estimate the probability of each number between 0 and Nj of recognized patients for each physician. By using the cumulative probability (Eq. 2), we ensure that physicians recognizing more patients than expected are assigned positive performance values, while physicians recognizing fewer patients are assigned negative values. Physicians performing at the expected level for their peer group are rated at 0.
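The simulation described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: each patient is treated as an independent Bernoulli draw with the recognition rate for their hypoxemia category, and the function names, fixed random seed, and example census are assumptions. Because the simulated counts are integers, P(≤Nexp) for a non-integer Nexp equals P(≤floor(Nexp)), so the sketch floors Nexp as in Eq. 1.

```python
import math
import random

# Institutional recognition rates by hypoxemia category, as quoted in the text.
RECOGNITION_RATE = {"mild": 0.22, "moderate": 0.34, "severe": 0.67}


def simulate_count_distribution(census, n_sim=1000, seed=0):
    """Estimate P(k recognized) for k = 0..N_j by simulating the census n_sim times."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    rates = [RECOGNITION_RATE[h] for h in census]
    counts = [0] * (len(census) + 1)
    for _ in range(n_sim):
        # One simulated sequence of recognized / not-recognized outcomes.
        k = sum(rng.random() < r for r in rates)
        counts[k] += 1
    return [c / n_sim for c in counts]


def recognition_metric(census, n_obs, n_sim=1000, seed=0):
    """Eq. 2: R = P(<= N_obs) - P(<= N_exp), using the simulated distribution."""
    pmf = simulate_count_distribution(census, n_sim, seed)
    n_exp = math.floor(sum(RECOGNITION_RATE[h] for h in census))

    def cdf(n):
        return sum(pmf[: n + 1])

    return cdf(n_obs) - cdf(n_exp)


# Hypothetical census with floor(4.03) = 4 expected recognized patients.
census = ["mild"] * 3 + ["moderate"] * 4 + ["severe"] * 3
r_above = recognition_metric(census, n_obs=6)  # more recognized than expected
r_equal = recognition_metric(census, n_obs=4)  # exactly as expected
r_below = recognition_metric(census, n_obs=2)  # fewer recognized than expected
```

Under these assumptions the metric is positive when Nobs exceeds Nexp, zero when they coincide, and negative otherwise, and it is bounded between -1 and 1 because it is a difference of two cumulative probabilities.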
ARDS recognition metric covariates
Metric robustness evaluation
We used univariable ordinary least squares (OLS) regression to assess the robustness of our recognition metric. We evaluated whether our metric showed any correlation with key variables including predicted body weight, hypoxemia (lowest PaO2/FIO2), total number of patients treated, and mortality proportion within each physician’s census. For predicted body weight, we used summary statistics of the physician’s patient census (mean, median, proportions in the central, single standard deviation, and second standard deviation ranges) and for hypoxemia, we used the proportion of the patient population with severe hypoxemia (PaO2/FIO2 < 100 mmHg).
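To make the robustness check concrete, the sketch below fits a univariable OLS line of the recognition metric on a single census-level covariate using only the standard library. It returns the slope, intercept, and R²; the p-values used for significance testing would in practice come from a statistics package, which the text does not name for this step. The function name is hypothetical.

```python
def univariable_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    x: one covariate value per physician (e.g. mean PBW of their census).
    y: the recognition metric R for the same physicians.
    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared
```

A metric that is robust to a covariate would be expected to show a slope near zero and a low R² in this fit.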
Physician characteristics
We sought to evaluate the relative associations between physician recognition of ARDS and physician characteristics that have been previously shown to have an impact on clinical decision-making and use of evidence-based practices: physician demographics [28], social network position [29,30,31,32,33,34,35,36,37,38,39], and attitude survey responses. For demographic variables and social network attributes, we used a feedforward OLS regression approach with our recognition metric as the dependent variable and physician characteristics as independent variables. Demographic univariable analysis was performed first, as demographic characteristics have been previously shown to affect network connections [40,41,42]. Demographic variables included: training background (specialty), age, sex, and year of training completion (ordinal and before/after ARDSNet LTVV trial [11]).
Next, we constructed four different social networks (patient contact [43], advice seeking, friendship, and innovation) and calculated 8 positional metrics for each physician (betweenness, closeness, degree, Katz centrality, k-shell embeddedness, participation, role, and community membership). For detailed descriptions of network construction and each positional metric, see Additional File, Supplemental Methods, Network Construction, and Additional File, Supp Table 2. All centrality characteristics (betweenness, closeness, degree, and Katz) were calculated using the NetworkX Python package (v 1.11); embeddedness was calculated using custom code [44]. Participation, role, and community membership were calculated using netcarto (v1.15). All positional metrics were normalized for the number of physicians in the network, except community membership, which was treated as a categorical variable. Significant demographic variables were included as fixed effects in multivariable OLS regressions with a positional metric as an additional independent variable. Each positional metric was evaluated in a separate regression.
For the survey response analysis, we used both an individual question and a collective group approach. Survey questions (non-demographic, non-network) were first filtered for those that showed a maximum range of responses. To examine associations between individual physician survey responses and physician recognition, we used a Kruskal-Wallis H-test to evaluate differences in recognition between categories of survey answer for each question (Python SciPy package v0.18.1). To evaluate differences in survey response between groups of physicians, we split physicians by the significant demographic or positional metric identified in the prior feedforward regression analysis. We then used a Mann-Whitney U test to assess differences in the responses from these groups to the same filtered question pool (Python SciPy package v0.18.1).
Sensitivity analyses
All analyses were repeated using two alternative ARDS recognition measures that have been previously used in the literature: 1) the proportion of worked shifts during which the physician delivered LTVV and 2) the proportion of patients that a physician cared for who received LTVV at any point during their disease course. For these alternative measures, we used a strict interpretation of LTVV use (defined as ≤6.5 mL/kg PBW) as put forth by the original ARDSNet LTVV trial [11]. These measures do not adjust for the impact of patient height on standardized tidal volume (mL/kg PBW) or the influence of ARDS severity on clinician recognition of ARDS. All regression results using these alternative recognition metrics as the dependent variable were consistent with the regression results when our recognition metric was used.
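The two alternative measures reduce to simple proportions. The sketch below assumes a hypothetical data layout (lists of standardized tidal volumes per patient and per shift) and assumes that a shift counts as an LTVV shift if any tidal volume delivered during it meets the ≤6.5 mL/kg PBW threshold; the text does not specify either detail.

```python
LTVV_THRESHOLD_ML_PER_KG = 6.5  # strict LTVV definition quoted in the text


def received_ltvv(tidal_volumes_ml_per_kg):
    """True if any standardized tidal volume is at or below the LTVV threshold."""
    return any(v <= LTVV_THRESHOLD_ML_PER_KG for v in tidal_volumes_ml_per_kg)


def proportion_patients_with_ltvv(volumes_by_patient):
    """Measure 2: fraction of a physician's patients who ever received LTVV.

    volumes_by_patient: dict of patient id -> list of mL/kg PBW values (hypothetical layout).
    """
    flags = [received_ltvv(v) for v in volumes_by_patient.values()]
    return sum(flags) / len(flags)


def proportion_shifts_with_ltvv(volumes_by_shift):
    """Measure 1: fraction of worked shifts during which LTVV was delivered.

    volumes_by_shift: list with one list of mL/kg PBW values per shift (hypothetical layout).
    """
    flags = [received_ltvv(v) for v in volumes_by_shift]
    return sum(flags) / len(flags)
```

Neither proportion adjusts for patient height or hypoxemia severity, which is the contrast with the recognition metric R.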
Statistical significance
We used α = 0.01 instead of 0.05 to ensure the statistical strength of our findings [45] and applied the Bonferroni correction for multiple hypotheses. There were 140 comparisons where our recognition metric was the dependent variable; thus, we set 7 × 10⁻⁵ (0.01/140) as the threshold for statistical significance for these analyses. For the survey analysis, there were 20 questions evaluated, resulting in a threshold of 5 × 10⁻⁴ (0.01/20).
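The two thresholds follow directly from dividing α by the number of comparisons, as this short sketch shows (the function names are hypothetical):

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value: float, alpha: float, n_comparisons: int) -> bool:
    """True if a p-value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


metric_threshold = bonferroni_threshold(0.01, 140)  # about 7.1e-5, reported as 7 × 10⁻⁵
survey_threshold = bonferroni_threshold(0.01, 20)   # 5e-4
```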