Minimal important difference and patient acceptable symptom state for common outcome instruments in patients with a closed humeral shaft fracture - analysis of the FISH randomised clinical trial data
BMC Medical Research Methodology volume 22, Article number: 291 (2022)
Two common ways of assessing the clinical relevance of treatment outcomes are the minimal important difference (MID) and the patient acceptable symptom state (PASS). The former represents the smallest change in the given outcome that makes people feel better, while the latter is the symptom level at which patients feel well.
We recruited 124 patients with a humeral shaft fracture to a randomised controlled trial comparing surgery to nonsurgical care. Outcome instruments included the Disabilities of Arm, Shoulder, and Hand (DASH) score, the Constant-Murley score, and two numerical rating scales (NRS) for pain (at rest and on activities). A reduction in DASH and pain scores and an increase in the Constant-Murley score represent improvement. We used four methods (receiver operating characteristic [ROC] curve, the mean difference of change, the mean change, and predictive modelling methods) to determine the MID, and two methods (the ROC and 75th percentile) for the PASS. As an anchor for the analyses, we assessed patients’ satisfaction regarding the injured arm using a 7-item Likert scale.
The change in the anchor question was strongly correlated with the change in DASH, moderately correlated with the change of the Constant-Murley score and pain on activities, and poorly correlated with the change in pain at rest (Spearman’s rho 0.51, -0.40, 0.36, and 0.15, respectively).
Depending on the method, the MID estimates for DASH ranged from -6.7 to -11.2, pain on activities from -0.5 to -1.3, and the Constant-Murley score from 6.3 to 13.5.
The ROC method provided reliable estimates for DASH (-6.7 points, Area Under Curve [AUC] 0.77), the Constant-Murley Score (7.6 points, AUC 0.71), and pain on activities (-0.5 points, AUC 0.68).
The PASS estimates were 14 and 10 for DASH, 2.5 and 2 for pain on activities, and 68 and 74 for the Constant-Murley score with the ROC and 75th percentile methods, respectively.
Our study provides credible estimates for the MID and PASS values of DASH, pain on activities and the Constant-Murley score, but not for pain at rest. The suggested cut-offs can be used in future studies and for assessing treatment success in patients with humeral shaft fracture.
ClinicalTrials.gov NCT01719887, first registration 01/11/2012.
Medical interventions should be aimed at improving patients’ health and well-being. Accordingly, patients’ symptoms and function lie at the heart of evaluating the effects of treatments. Due to their subjective nature, symptoms and function need to be assessed using patient-reported outcome measures (PROMs). Two of the most common PROMs for evaluating treatment outcome in patients with humeral shaft fractures are the Disabilities of the Arm, Shoulder, and Hand (DASH) score and Constant-Murley score [1,2,3]. Patients are also usually queried about the pain they experience.
But what is the minimal benefit that justifies use of a medical intervention? Over the past decades, we have witnessed increasing calls to replace statistical significance with ‘clinical relevance’: our treatments should generate benefits that patients consider meaningful. To inform the magnitude of such effects on different outcome instruments, two important concepts have been developed: the minimal important difference (MID) and the patient acceptable symptom state (PASS).
The MID is “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient’s management”. PASS is the symptom level above which patients consider themselves well, providing a tool for determining treatment success. The main difference between MID and PASS is that the MID defines the smallest change in the given outcome that makes people feel better, and PASS defines the level at which the patient feels well.
To our knowledge, there are no previous studies reporting PASS, and only one study reporting MID estimates for two outcome measures (DASH and Constant-Murley score) in patients with humeral shaft fractures. Therefore, we report the MID and PASS analyses of four outcome instruments commonly used to assess treatment outcomes after humeral shaft fractures using data from the Finnish Shaft of the Humerus (FISH) trial.
Design, setting, and participants
The FISH trial was a randomised clinical trial comparing the effectiveness of surgical treatment with open reduction and plate fixation and non-surgical treatment with functional bracing for closed humeral shaft fractures. The execution of the FISH trial has been described in detail previously [3, 6, 7]. The trial was carried out at the Helsinki and Tampere University hospitals in Finland between 2012 and 2018, and conducted in accordance with the Declaration of Helsinki. Participants provided written informed consent upon recruitment.
We included adult patients (18 years and older) with a closed, unilateral, and displaced humeral shaft fracture. Patients were excluded if they had a previous injury or a condition affecting the function of the injured upper limb, pathological fracture, other concomitant injury affecting the same upper limb, other fracture, cognitive disabilities affecting patient compliance, or polytrauma. Characteristics of participants 6 weeks after the fracture are presented in Table 1.
For the MID and PASS analyses, we included data from all 82 randomised participants and 42 participants who declined to be randomised (opted to choose their preferred treatment) but gave consent for prospective follow-up using the same outcomes as for the FISH trial. Accordingly, the study sample for the analyses consisted of data from 124 participants.
The four outcomes analysed were the DASH score, the Constant-Murley score, and the numerical rating scale (NRS) for pain of the upper extremity, both at rest and on activities. DASH is a validated and responsive questionnaire of self-rated upper extremity disability and symptoms with a score ranging from 0 to 100 (higher is worse). The Constant-Murley score is a functional assessment score of the shoulder consisting of patients’ estimate of pain and function in daily activities, and measures of range of movement and upper extremity strength. The Constant-Murley score ranges from 0 to 100 (higher is better). The NRS for pain has been widely used to evaluate clinical pain intensity. Participants were asked to rate their average pain at rest and on activities of daily living during the last 7 days on an 11-point NRS ranging from 0 to 10 (higher is worse).
As the anchor for determining both the MID and the PASS, we used the following subjective global rating question: “How satisfied are you with the overall condition of your injured upper limb and its effect on your daily life?” (for methodological details, see below). The answer options for this anchor question were from 1 to 7 in this order: “Very satisfied”, “Satisfied”, “Somewhat satisfied”, “Not satisfied nor dissatisfied”, “Somewhat dissatisfied”, “Dissatisfied”, and “Very dissatisfied”. All outcomes were collected at 6 weeks, 3, 6, 12, and 24 months after the injury.
Data handling and analyses
Minimal important difference (MID)
MIDs for improvement of each of the four outcome measures were determined using four methods.
For the three anchor-based methods, we calculated the change in each outcome between consecutive follow-up points by subtracting the earlier score from the later score. Thus, a negative change in DASH and pain NRS represents improvement and, conversely, a negative change in the Constant-Murley score indicates worsening.
For the receiver operating characteristic (ROC) method, we dichotomised the anchor question between better than the previous follow-up point (e.g., from ‘somewhat dissatisfied’ to ‘not satisfied nor dissatisfied’) and not better than the previous follow-up point. The change in the outcome score was always calculated from one follow-up point to the next (i.e., change between each follow-up). The optimal discrimination values for the outcome scores (between better and not better in subjective global rating) were determined by ROC analysis using the closest point to top left corner method to maximise sensitivity and specificity. Nonparametric bootstrapping with 1000 replications was used to calculate the 95% confidence interval for ROC MID values. To measure the discrimination ability of the obtained cut-off, we calculated the area under the ROC curve (AUC) with 95% CIs by DeLong’s method by bootstrapping 2000 samples.
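The closest-to-top-left rule described above can be sketched in a few lines. This is an illustrative sketch only, not the trial's analysis code: the function name `roc_closest_topleft`, the `lower_is_better` flag, and the data are hypothetical, and the bootstrap confidence intervals and AUC are left out for brevity.

```python
import math

def roc_closest_topleft(changes, improved, lower_is_better=True):
    """Return (cutoff, sensitivity, specificity) for the candidate cut-off
    closest to the top-left corner of the ROC plot.

    changes  -- change scores (later score minus earlier score)
    improved -- booleans from the dichotomised anchor (True = better)
    """
    # Orient the scores so that larger values always mean improvement.
    sign = -1.0 if lower_is_better else 1.0
    scores = [sign * c for c in changes]
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both anchor groups must be non-empty")

    best = None
    for cut in sorted(set(scores)):
        # Classify as "improved" when the oriented score is >= cut.
        tp = sum(1 for s, y in zip(scores, improved) if y and s >= cut)
        tn = sum(1 for s, y in zip(scores, improved) if not y and s < cut)
        sens, spec = tp / n_pos, tn / n_neg
        # Distance to the ideal point (sensitivity = 1, specificity = 1).
        dist = math.hypot(1.0 - sens, 1.0 - spec)
        if best is None or dist < best[0]:
            best = (dist, sign * cut, sens, spec)
    return best[1:]

# Hypothetical DASH change scores (negative = improvement) and anchor flags.
dash_change = [-20, -15, -12, -8, -6, -5, -3, 0, 2, 4]
anchor_better = [True, True, True, False, True, True, False, False, False, False]
cutoff, sens, spec = roc_closest_topleft(dash_change, anchor_better)
print(cutoff, sens, spec)
```

For an instrument where higher is better, such as the Constant-Murley score, the same function would be called with `lower_is_better=False`. Ties between candidate cut-offs are resolved here by keeping the first one found, which is an arbitrary choice.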
For the mean difference of the change method, we calculated the difference in mean outcome change between participants who had improved by one point in the subjective global rating and those who had not improved since the previous follow-up.
For the mean change method, we calculated the mean change with 95% confidence intervals (CIs) for the participants whose response to the anchor question (subjective global rating) had improved by one point compared with the previous follow-up point.
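The two mean-based estimates can be illustrated with a small sketch. All names and numbers are hypothetical. Two assumptions are made here that the text does not spell out: anchor improvement is computed as the earlier anchor response minus the later one (because 1 is "Very satisfied" on the 7-item scale), and "not improved" covers both unchanged and worsened responses. Confidence intervals are omitted.

```python
from statistics import mean

def change_scores(earlier, later):
    """Later minus earlier, so negative DASH or pain changes mean improvement."""
    return [b - a for a, b in zip(earlier, later)]

def anchor_improvement(earlier, later):
    """Steps gained on the anchor; 1 = 'Very satisfied', so earlier - later."""
    return [a - b for a, b in zip(earlier, later)]

def mean_change_mid(changes, steps):
    """Mean outcome change among those who improved by exactly one anchor step."""
    return mean(c for c, s in zip(changes, steps) if s == 1)

def mean_difference_mid(changes, steps):
    """Mean change of the one-step improvers minus that of the non-improvers."""
    not_improved = [c for c, s in zip(changes, steps) if s <= 0]
    return mean_change_mid(changes, steps) - mean(not_improved)

# Hypothetical DASH scores and anchor responses at two consecutive visits.
dash_earlier = [40, 35, 50, 30, 45, 20]
dash_later = [30, 27, 48, 31, 33, 20]
anchor_earlier = [4, 3, 5, 2, 4, 2]
anchor_later = [3, 2, 5, 2, 3, 3]

changes = change_scores(dash_earlier, dash_later)
steps = anchor_improvement(anchor_earlier, anchor_later)
mid_mean_change = mean_change_mid(changes, steps)
mid_mean_difference = mean_difference_mid(changes, steps)
print(mid_mean_change, mid_mean_difference)
```

The mean difference is smaller in magnitude than the mean change whenever the non-improved group also changes in the direction of improvement on average, which may partly explain why the methods give different MID estimates.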
For the predictive modelling method, we used logistic regression analysis to calculate MIDs as described by Terluin et al. In this method, a logistic regression model is used to determine an MID value that optimally predicts the probability of belonging to the improved group. We dichotomised the anchor question as better and not better as described above with the ROC method.
To assess the correlation of anchor and target outcome measures, we calculated Spearman’s rho for the change of the anchor and 1) the change in each of the outcomes, 2) prescores, and 3) postscores. The 95% CIs were defined by bootstrapping 1000 samples.
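The correlation check above can be sketched as follows. This is a minimal illustration with hypothetical data and helper names; it assumes a simple percentile bootstrap, which may differ from the interval type actually used in the trial analysis.

```python
import random
from statistics import mean

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def bootstrap_ci(x, y, replications=1000, seed=0):
    """Approximate 95% percentile interval from resampled (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(replications):
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        # Rho is undefined when a resample has no variation; skip those.
        if len(set(xs)) > 1 and len(set(ys)) > 1:
            estimates.append(spearman_rho(xs, ys))
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper

# Hypothetical changes: anchor (later - earlier; negative = more satisfied)
# against DASH (negative = less disability).
anchor_change = [-1, -1, 0, 0, -2, 1, 0, -1]
dash_change = [-10, -8, -2, 1, -12, 3, 0, -6]
rho = spearman_rho(anchor_change, dash_change)
ci = bootstrap_ci(anchor_change, dash_change)
print(rho, ci)
```

Because both the anchor and DASH score worse with higher values, an informative anchor gives a positive rho for DASH and pain, and a negative rho for the Constant-Murley score.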
Patient acceptable symptom state (PASS)
For PASS estimates, we used the ROC method and the 75th percentile method. For the ROC method, we dichotomised the participants based on their responses to the subjective global rating anchor question: those responding “Very satisfied” and “Satisfied” on a 7-item Likert scale were deemed to have reached a patient acceptable symptom state (PASS), while those responding anything from “Somewhat satisfied” to “Very dissatisfied” were deemed not to have reached the PASS. Determination of the optimal cut-off and 95% CIs was carried out in the same way as for the MID.
For the 75th percentile method, we calculated the PASS as the 25th percentile score for the Constant-Murley score, and the 75th percentile score for the DASH score and for the pain-NRS (at rest and on activities) in participants who responded either “Very satisfied” or “Satisfied” on the subjective global rating question.
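The percentile rule above can be sketched as follows. The data and the function name `pass_percentile` are hypothetical, and the quartiles are computed by linear interpolation (`method="inclusive"` in Python's `statistics.quantiles`), which is an assumption; other percentile definitions can give slightly different values in small samples.

```python
from statistics import quantiles

# Anchor codes counted as satisfied: 1 = "Very satisfied", 2 = "Satisfied".
SATISFIED = {1, 2}

def pass_percentile(scores, anchor, higher_is_better=False):
    """PASS as the quartile that covers 75% of the satisfied participants.

    For instruments where lower is better (DASH, pain NRS) this is the 75th
    percentile; where higher is better (Constant-Murley) it is the 25th.
    """
    satisfied = [s for s, a in zip(scores, anchor) if a in SATISFIED]
    q1, _, q3 = quantiles(satisfied, n=4, method="inclusive")
    return q1 if higher_is_better else q3

# Hypothetical scores from seven participants at one follow-up visit.
anchor = [1, 1, 2, 2, 2, 4, 6]
dash = [2, 5, 8, 12, 14, 30, 45]
constant_murley = [90, 85, 78, 72, 70, 55, 40]

pass_dash = pass_percentile(dash, anchor)
pass_cm = pass_percentile(constant_murley, anchor, higher_is_better=True)
print(pass_dash, pass_cm)
```

Only participants who answered "Very satisfied" or "Satisfied" contribute to the estimate, so the two dissatisfied participants in the example have no influence on the result.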
Primary and secondary analyses
For the primary analysis, we performed the MID and PASS analyses by combining all the different time points into one analysis to obtain a sufficient number of anchor–outcome pairs. We also determined the MID values separately for every follow-up point as a secondary analysis (Tables S1 and S2 of the supplementary appendix).
In the FISH trial, 82 of 140 eligible patients were randomised to surgical (n = 38) or functional bracing (n = 44) groups. Of the 58 who declined randomisation, 42 consented to follow-up (declined cohort), providing us with data from 124 participants (Table 1). Of the 42 patients in the declined cohort, nine participants chose surgery and 33 chose functional bracing. Missing data varied from 6 to 14 items at the different follow-up time points.
A change in the anchor question had good correlation with a change in the DASH score (0.51; 95% CI, 0.44 to 0.59). The change in the Constant-Murley score (-0.40; 95% CI, -0.50 to -0.31) was moderately correlated to the anchor. The correlation to pain NRS on activities (0.36; 95% CI, 0.26 to 0.47) was moderate, and poor for pain NRS at rest (0.15; 95% CI, 0.06 to 0.25). Correlations between the postscore of the outcomes and the change of the anchor ranged between -0.01 and 0.06. Correlation between the prescore of the outcomes and the change in the anchor was negative for the DASH score, pain NRS at rest, and pain NRS on activities. The correlation was positive for the Constant-Murley score (Table 2). The correlations at each time point are given in the supplementary appendix Tables S3, S4 and S5.
Depending on the method used, the MID estimates ranged from -6.7 to -11.2 for DASH, from 6.3 to 13.5 for the Constant-Murley score, and from -0.5 to -1.3 for pain-NRS on activities (Tables 3–4). Estimating MID for the pain-NRS at rest would not have been appropriate because the correlation with the anchor was too low. The MID estimates from the ROC method for DASH and the Constant-Murley score showed acceptable discrimination, while the corresponding estimates for pain-NRS on activities discriminated poorly (Table 3). The total number of anchor–outcome data pairs is shown in Table 3, and the number at each follow-up time point in supplementary appendix Table S4. The distribution of responses to the anchor question at different time points is shown in Fig. S1 of the supplementary appendix. The ROC curves and the MID estimates at all follow-up time points are shown in Fig. S2 and Tables S4 and S5 of the supplementary appendix.
PASS values showed excellent discrimination in the DASH and Constant-Murley scores in the ROC analysis. PASS values discriminated well for pain NRS on activities. It was not appropriate to define a PASS value for the pain-NRS at rest due to poor correlation with the anchor. PASS values defined by the 75th percentile method were closer to the best possible score of the outcomes than the estimates obtained from the ROC method (Table 5).
We calculated the MID and PASS estimates for three outcomes in adult patients with closed humeral shaft fractures. We used four methods to calculate the MID and two methods to calculate PASS.
Our MID estimates varied depending on the method used. The change in DASH score had a good correlation, and the change of Constant-Murley score and pain on activities had moderate correlations with the change in anchor question. Pain at rest did not correlate with the anchor question and therefore we were not able to estimate MID or PASS for pain at rest. Taken together, these results indicated credible MID estimates. The ROC method for cut-off values of the MID of both DASH (-6.7 points) and Constant-Murley (7.6 points) scores had an acceptable discrimination. Pain on activities (-0.5 points) discriminated poorly with the ROC method.
The PASS values with the ROC method for DASH (14 points) and Constant-Murley score (68 points) had excellent discrimination. The discrimination was good with the pain on activities (2.5 points). The 75th percentile method yielded more stringent limits for PASS in all the outcomes (DASH, 10 points; the Constant-Murley score, 74 points; pain on activities, 2 points).
We suggest that differences smaller than the smallest point estimates of the MIDs from this study are unlikely to be clinically meaningful. Conversely, differences above the upper limits are very likely to be clinically important to patients. Depending on the potential benefits and inherent risks of treatment methods, researchers may choose either the lower or upper limit of the suggested MID when interpreting the clinical relevance of treatment effects. For PASS, the upper point estimate depicts the cut-off above which the patients are very likely to be satisfied with the treatment outcome and conversely, the lower point estimate reflects the level below which the patients are unlikely to be satisfied.
We identified one previous prospective comparative study on the MID of two different outcomes in patients with humeral shaft fractures, reporting an MID of 6.7 points for DASH and 6.1 points for the Constant-Murley score. We could not identify a previous study reporting PASS estimates for patients with humeral shaft fractures. Our estimate for the MID for pain on activities is smaller than in degenerative shoulder conditions [16, 17]. However, because pain on activities correlated only moderately with the anchor, our result should be interpreted with caution.
We decided to use a prospective anchor question for our analyses (i.e., patients reported their current symptom state using the subjective global rating as opposed to comparing it to baseline status), a method often used in MID analyses for degenerative conditions. In a trauma setting, it is not possible to obtain reliable baseline data prior to the injury. Our approach may be less susceptible to recall bias, as the participants did not have to remember their symptom state several months earlier, a task at which people tend to fail [18, 19].
A strength of our study is high internal validity, as we used prospective homogeneous data from a randomised clinical trial performed by experienced research personnel with little missing data. We also used the most common outcome instruments for assessing treatment outcome in patients with upper extremity injuries, and several methods for obtaining MID and PASS estimates. In addition, our plan to analyse the MID and PASS was published in the protocol article, prior to any access to trial data.
An obvious limitation of our study is that the results were obtained from a randomised clinical trial with stringent inclusion and exclusion criteria (i.e., adult patients with closed, unilateral humeral shaft fracture without severe comorbidities or compliance problems). Thus, our results may not be directly applicable to all patients with this injury. Second, the ROC analyses can be biased if the proportion of improved participants is markedly different from 50%. However, in our study there were about 420 follow-up intervals, and in approximately 250 of these the patients did not experience improvement (that is, roughly 40% of intervals showed improvement), making a marked bias in the estimates unlikely.
Both the MID and PASS are valuable tools in medical research and clinical practice. The MID provides a tool for future trial sample size calculations. However, when contemplating different treatment methods during shared decision-making in clinical settings, the concept of PASS may be more understandable for patients. The clinician might consider informing the patient about the probable proportion of patients reaching PASS (i.e., feeling well, with an experience of successful treatment) with different treatment options.
We provide credible estimates for the MID and PASS for adult patients with humeral shaft fractures including several of the most used methods and outcomes. Depending on the application, the upper or lower limit of the established MIDs and PASS values should be chosen. The MID might be more useful especially for scientific purposes (i.e., sample size calculation), whereas the PASS concept is—in addition to scientific applications—more understandable to patients, and accordingly, we advocate its use as a more appropriate measure for gauging treatment success in patients with a humeral shaft fracture.
Availability of data and materials
FISH trial data are not publicly available owing to data privacy issues, but access to the anonymised dataset can be obtained from the corresponding author on reasonable request.
Abbreviations
AUC: Area under curve
DASH: Disabilities of the Arm, Shoulder, and Hand score
FISH: Finnish Shaft of the Humerus trial
MID: Minimal important difference
NRS: Numerical rating scale
PASS: Patient acceptable symptom state
PROM: Patient-reported outcome measure
ROC: Receiver operating characteristic
1. Mahabier KC, Den Hartog D, Theyskens N, Verhofstad MHJ, Van Lieshout EMM, Investigators HT. Reliability, validity, responsiveness, and minimal important change of the disabilities of the arm, shoulder and hand and Constant-Murley scores in patients with a humeral shaft fracture. J Shoulder Elb Surg. 2017;26(1):e1–e12. https://doi.org/10.1016/j.jse.2016.07.072.
2. Matsunaga FT, Tamaoki MJ, Matsumoto MH, Netto NA, Faloppa F, Belloti JC. Minimally invasive Osteosynthesis with a bridge plate versus a functional brace for humeral shaft fractures: a randomized controlled trial. J Bone Joint Surg Am. 2017;99(7):583–92. https://doi.org/10.2106/jbjs.16.00628.
3. Rämö L, Sumrein BO, Lepola V, Lähdeoja T, Ranstam J, Paavola M, et al. Effect of surgery vs functional bracing on functional outcome among patients with closed displaced humeral shaft fractures: the FISH randomized clinical trial. JAMA. 2020;323(18):1792–801. https://doi.org/10.1001/jama.2020.3182.
4. Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10(4):407–15. https://doi.org/10.1016/0197-2456(89)90005-6.
5. Kvien TK, Heiberg T, Hagen KB. Minimal clinically important improvement/difference (MCII/MCID) and patient acceptable symptom state (PASS): what do these concepts mean? Ann Rheum Dis. 2007;66(Suppl 3):40–1. https://doi.org/10.1136/ard.2007.079798.
6. Rämö L, Paavola M, Sumrein BO, Lepola V, Lähdeoja T, Ranstam J, et al. Outcomes with surgery vs functional bracing for patients with closed, displaced humeral shaft fractures and the need for secondary surgery: a Prespecified secondary analysis of the FISH randomized clinical trial. JAMA Surg. 2021;156(6):526–34. https://doi.org/10.1001/jamasurg.2021.0906.
7. Rämö L, Taimela S, Lepola V, Malmivaara A, Lähdeoja T, Paavola M. Open reduction and internal fixation of humeral shaft fractures versus conservative treatment with a functional brace: a study protocol of a randomised controlled trial embedded in a cohort. BMJ Open. 2017;7(7):e014076. https://doi.org/10.1136/bmjopen-2016-014076.
8. Gummesson C, Atroshi I, Ekdahl C. The disabilities of the arm, shoulder and hand (DASH) outcome questionnaire: longitudinal construct validity and measuring self-rated health change after surgery. BMC Musculoskelet Disord. 2003;4:11. https://doi.org/10.1186/1471-2474-4-11.
9. Constant CR, Murley AH. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res. 1987;(214):160–4.
10. Jensen MP, Karoly P, Braver S. The measurement of clinical pain intensity: a comparison of six methods. Pain. 1986;27(1):117–26. https://doi.org/10.1016/0304-3959(86)90228-9.
11. Froud R, Abel G. Using ROC curves to choose minimally important change thresholds when sensitivity and specificity are valued equally: the forgotten lesson of pythagoras. Theoretical considerations and an example application of change in health status. PLoS One. 2014;9(12):e114468. https://doi.org/10.1371/journal.pone.0114468.
12. Terwee CB, Peipert JD, Chapman R, Lai JS, Terluin B, Cella D, et al. Minimal important change (MIC): a conceptual clarification and systematic review of MIC estimates of PROMIS measures. Qual Life Res. 2021;30(10):2729–54. https://doi.org/10.1007/s11136-021-02925-y.
13. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–45. https://doi.org/10.2307/2531595.
14. Terluin B, Eekhout I, Terwee CB, de Vet HC. Minimal important change (MIC) based on a predictive modeling approach was more precise than MIC based on ROC analysis. J Clin Epidemiol. 2015;68(12):1388–96. https://doi.org/10.1016/j.jclinepi.2015.03.015.
15. Devji T, Carrasco-Labra A, Guyatt G. Mind the methods of determining minimal important differences: three critical issues to consider. Evid Based Ment Health. 2021;24(2):77–81. https://doi.org/10.1136/ebmental-2020-300164.
16. Kanto K, Lahdeoja T, Paavola M, Aronen P, Jarvinen TLN, Jokihaara J, et al. Minimal important difference and patient acceptable symptom state for pain, Constant-Murley score and simple shoulder test in patients with subacromial pain syndrome. BMC Med Res Methodol. 2021;21(1):45. https://doi.org/10.1186/s12874-021-01241-w.
17. Hao Q, Devji T, Zeraatkar D, Wang Y, Qasim A, Siemieniuk RAC, et al. Minimal important differences for improvement in shoulder condition patient-reported outcomes: a systematic review to inform a BMJ rapid recommendation. BMJ Open. 2019;9(2):e028777. https://doi.org/10.1136/bmjopen-2018-028777.
18. Kamper SJ, Ostelo RW, Knol DL, Maher CG, de Vet HC, Hancock MJ. Global perceived effect scales provided reliable assessments of health transition in people with musculoskeletal disorders, but ratings are strongly influenced by current status. J Clin Epidemiol. 2010;63(7):760–6.e1. https://doi.org/10.1016/j.jclinepi.2009.09.009.
19. Schmitt J, Di Fabio RP. The validity of prospective and retrospective global change criterion measures. Arch Phys Med Rehabil. 2005;86(12):2270–6. https://doi.org/10.1016/j.apmr.2005.07.290.
20. Terluin B, Eekhout I, Terwee CB. The anchor-based minimal important change, based on receiver operating characteristic analysis or predictive modeling, may need to be adjusted for the proportion of improved patients. J Clin Epidemiol. 2017;83:90–100. https://doi.org/10.1016/j.jclinepi.2016.12.015.
21. Tubach F, Dougados M, Falissard B, Baron G, Logeart I, Ravaud P. Feeling good rather than feeling better matters more to patients. Arthritis Rheum. 2006;55(4):526–30. https://doi.org/10.1002/art.22110.
We thank all the patients who participated in the FISH trial as well as the physical therapists and all other collaborators at the hospital sites and in the community who were involved in conducting the trial.
This study was supported by the state funding for university-level health research in Finland. The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Ethics approval and consent to participate
The study protocol was approved by the Institutional Review Board of the Helsinki and Uusimaa Hospital District (118/13/03/02/2012; May 14, 2012) and informed consent was obtained from all participants prior to inclusion in the study. The trial was conducted in accordance with the Declaration of Helsinki.
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Ibounig, T., Juurakko, J., Lähdeoja, T. et al. Minimal important difference and patient acceptable symptom state for common outcome instruments in patients with a closed humeral shaft fracture - analysis of the FISH randomised clinical trial data. BMC Med Res Methodol 22, 291 (2022). https://doi.org/10.1186/s12874-022-01776-6
- Minimal important difference
- Patient accepted symptom state
- Outcome measures
- Constant-Murley score
- Humeral shaft fracture