Minimal important difference, patient acceptable symptom state and longitudinal validity of oxford elbow score and the quickDASH in patients with tennis elbow
BMC Medical Research Methodology volume 23, Article number: 158 (2023)
The Oxford Elbow Score (OES) and the short version of the Disabilities of the Arm, Shoulder and Hand (QuickDASH) are common patient-reported outcome measures for people with elbow problems. Our primary objective was to define thresholds for the Minimal Important Difference (MID) and Patient-Acceptable Symptom State (PASS) for the OES and QuickDASH. The secondary aim was to compare the longitudinal validity of these outcome measures.
We recruited 97 patients with clinically diagnosed tennis elbow for a prospective observational cohort study in a pragmatic clinical setting. Fifty-five participants received no specific intervention, 14 underwent surgery (10 as primary treatment and 4 during follow-up), and 28 received either botulinum toxin injection or platelet rich plasma injection. We collected OES (0 to 100, higher is better) and QuickDASH (0 to 100, higher is worse) scores, and a global rating of change (as an external transition anchor question), at six weeks, three months, six months and 12 months. We defined MID and PASS values using three approaches. To assess the longitudinal validity of the measures, we calculated the Spearman’s correlation coefficient between the change in the outcome scores and the external transition anchor question, and the Area Under the Curve (AUC) from a receiver operating characteristics (ROC) analysis. To assess signal-to-noise ratio, we calculated standardized response means.
Depending on the method, MID values ranged from 16 to 21 for OES Pain; 10 to 17 for OES Function; 14 to 28 for OES Social-psychological; 14 to 20 for OES Total score; and −7 to −9 for QuickDASH. PASS cut offs were 74 to 84 for OES Pain; 88 to 91 for OES Function; 75 to 78 for OES Social-psychological; 80 to 81 for OES Total score; and 19 to 23 for QuickDASH. OES had stronger correlations with the anchor items, and AUC values suggested superior discrimination (between improved and not improved) compared with QuickDASH. OES also had a superior signal-to-noise ratio compared with QuickDASH.
The study provides MID and PASS values for OES and QuickDASH. Due to better longitudinal validity, OES may be a better choice for clinical trials.
ClinicalTrials.gov NCT02425982 (first registered April 24, 2015).
The elbow-specific Oxford Elbow Score (OES) and the short version of the upper limb-specific Disabilities of the Arm, Shoulder and Hand (QuickDASH) both use a question-and-response format to convert a complex multidimensional health state into a single metric scale [1, 2]. While a single score facilitates comparisons of outcomes in a trial, it can also be a source of confusion: how can a metric score be integrated into clinical decision-making if patients are not familiar with the measure?
Two established concepts help clinicians and patients interpret patient-reported outcome measures (PROMs): the minimal (clinically) important difference (M(C)ID) and the Patient-Acceptable Symptom State (PASS) [3, 4]. The MID represents “the smallest difference in score in the outcome of interest that informed patients or informed proxies perceive as important, either beneficial or harmful, and that would lead the patient or clinician to consider a change in the management”. The PASS represents the value of the score beyond which patients consider themselves well. While the MID can be used to interpret whether the mean difference between groups is relevant or how large a proportion of participants experience meaningful change (responder analysis), the PASS can only be used for the latter purpose.
The OES was developed to measure outcomes of elbow surgery. Since the items specifically address limitations due to elbow problems, it is likely well suited to measure outcomes in people with tennis elbow. The OES consists of three subscales: pain, function and social-psychological. The QuickDASH is a common upper extremity-specific PROM. The published MIDs for QuickDASH and OES are not specific to people with tennis elbow [6,7,8,9]. It is also unclear whether the OES or the QuickDASH is better suited to measuring change in the health state (i.e. which measure has better longitudinal validity), and we are not aware of studies estimating PASS values for the Oxford Elbow Score.
The primary objective of this study was to estimate MID and PASS values for OES and QuickDASH in people with tennis elbow to facilitate interpretation of future trials and meta-analyses. The secondary aim was to assess the longitudinal validity of these measures to see which outcome was more responsive to change in the health state. A more responsive outcome is more efficient in trials. We used data from a longitudinal/prospective pragmatic observational cohort study involving patients who received different treatments for tennis elbow.
The Helsinki University Hospital institutional review board approved the study protocol. Recruitment began in May 2015 and finished in March 2018.
Design and setting
Participants were referred to one of six study centres (orthopaedic or hand surgery outpatient clinics) in Finland with tennis elbow that was not responsive to usual care.
All patients were screened for eligibility by a hand surgeon or an upper extremity-focussed specialist orthopaedic surgeon at each centre. The diagnosis was based on history and clinical examination. The screening and recruitment protocol did not require any imaging, but imaging findings were considered if they were available.
The inclusion criteria for the cohort study were:
Clinical diagnosis of tennis elbow (pain on the lateral side of the elbow, made worse by pressure applied on the lateral epicondyle of the humerus and during resisted extension of the wrist or when making a tight fist with a straight elbow joint),
Symptom duration of over 10 months,
Age between 35 and 60 years,
Ability to read and comprehend the questionnaires,
Provided informed consent.
The exclusion criteria were:
Inflammatory or neurological condition (including nerve entrapment) affecting upper limb function,
Signs of instability of the elbow joint (table top or lateral pivot shift test),
Radiographic elbow osteoarthritis,
Pain to the distal biceps tendon or the medial side of the elbow indicative of biceps tendinopathy or medial epicondylitis,
Previous surgery, fracture or dislocation of the elbow,
Congenital deformity in the elbow region,
Systemic muscle, tendon, nerve or joint disease,
Painful snapping or crepitus of the elbow joint,
A limitation of passive range of motion of the elbow > 10 degrees,
Abnormal elbow radiograph.
Before enrolment, participants were informed about the study and provided written informed consent. Participants were not taking part in any other study of tennis elbow.
Baseline data and outcome measures
We collected the following information at baseline during the initial clinic visit: age, sex, duration of symptoms, previous corticosteroid injections, symptomatic side, history of smoking, OES and QuickDASH.
The OES subscales (pain, function, and social-psychological) each comprise four questions that are weighted equally. The subscales were originally intended to be used separately, but they can be summed to a total score. The score (each subscale or total) can be converted to a metric scale (0 to 100 points, higher score indicating better outcome).
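The conversion to the 0 to 100 scale can be sketched as follows. This is only an illustration: it assumes each item is scored 0 to 4 with 4 as the best response, which is not stated in the text above, and the function names `oes_subscale_score` and `oes_total_score` are hypothetical.

```python
def oes_subscale_score(items):
    """Convert four item responses (assumed 0-4 each, 4 = best) to 0-100."""
    if len(items) != 4:
        raise ValueError("each OES subscale has four questions")
    if any(not 0 <= x <= 4 for x in items):
        raise ValueError("item responses are assumed to lie between 0 and 4")
    return 100 * sum(items) / 16  # 16 = maximum raw subscale sum (4 items x 4)


def oes_total_score(pain, function, social_psychological):
    """Sum the three subscales (12 items, raw 0-48) and rescale to 0-100."""
    raw = sum(pain) + sum(function) + sum(social_psychological)
    return 100 * raw / 48  # equal weighting of all 12 items


# Hypothetical respondent: mid-range pain, good function, poor social-psychological
print(oes_subscale_score([2, 2, 2, 2]))                # 50.0
print(oes_total_score([2] * 4, [4] * 4, [0] * 4))      # 50.0
```

Because every item carries the same weight, the total score is simply the mean of the three subscale scores.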
QuickDASH comprises 11 questions about pain and disability affecting the upper limbs in daily activities. It is a short form of the DASH questionnaire (30 questions), with clinimetric properties comparable to those of the DASH. QuickDASH scores range from 0 to 100, with a higher score indicating worse outcome.
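A minimal sketch of QuickDASH scoring follows, assuming the commonly published rule: the standard questionnaire has 11 items, each scored 1 to 5, at least 10 of which must be answered, and the score is (mean response − 1) × 25. None of these details are given in the text above, so treat them as assumptions. The function name `quickdash_score` is hypothetical.

```python
def quickdash_score(responses):
    """Score a QuickDASH questionnaire on the 0-100 scale (higher = worse).

    responses: list of 11 entries, each 1-5, or None for an unanswered item.
    Returns None when more than one item is missing (assumed scoring rule).
    """
    if len(responses) != 11:
        raise ValueError("QuickDASH has 11 questions")
    answered = [r for r in responses if r is not None]
    if len(answered) < 10:
        return None  # too many missing items to compute a score
    return (sum(answered) / len(answered) - 1) * 25


print(quickdash_score([3] * 10 + [None]))       # 50.0
print(quickdash_score([3] * 9 + [None, None]))  # None
```

The `None` return mirrors the situation reported in the results, where some baseline scores were unavailable because too many items were missing.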
We used the Global Rating of Change (GRC) as an external transition anchor question (to define the MID and assess longitudinal validity). Participants rated the overall change in health state compared with baseline on a six-step Likert scale (much worse, little worse, unchanged, little better, much better, complete recovery). For the state anchor (to define the PASS), we asked the participants whether they would be satisfied with the current state of their global elbow function (yes/no).
All participants were informed about the benign course of the condition. They were primarily offered a wait-and-see strategy, but if they wanted active treatment, they were offered botulinum toxin injection, platelet rich plasma injection or surgery, depending on the centre and patient. Shared decision making was used to decide the course of treatment; the study protocol excluded only further corticosteroid injections.
Data collection and analysis
We collected the outcomes at baseline, six weeks, three months, six months, and 12 months. We calculated the change in OES and QuickDASH scores for each patient by subtracting the baseline score from the score at the follow-up point. A positive change score therefore indicates improvement in the OES and worsening in the QuickDASH. Because few participants reported ‘little better’ or ‘unchanged’ at some time points, we combined all time points (i.e. all transition anchor and target measure pairs were combined in a single analysis).
To assess the longitudinal validity and the credibility of the MID values, we calculated Spearman’s correlations between the GRC and (1) change scores; (2) absolute scores; and (3) baseline scores. A moderate to high (> 0.5) correlation between the anchor and the change score suggests validity, while values < 0.4 suggest low validity of the anchor. We also assessed Area Under the Curve (AUC) values from the receiver operating characteristics (ROC) analysis to see how well the score could discriminate those who reported feeling better from those who remained unchanged.
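Both statistics can be sketched with the standard library alone. The code below is an illustration rather than the authors' analysis code: Spearman's coefficient is computed as the Pearson correlation of tie-averaged ranks, and the AUC as the proportion of improved/unchanged pairs in which the improved participant has the larger change score (ties count one half). All function names are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def auc(improved, unchanged):
    """Area under the ROC curve via pairwise comparison of change scores."""
    wins = sum((a > b) + 0.5 * (a == b) for a in improved for b in unchanged)
    return wins / (len(improved) * len(unchanged))
```

As written, `auc` assumes that a larger change score means improvement, as for the OES; for QuickDASH change scores the signs would need to be flipped first.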
We determined the MID using three different anchor-based methods: (1) the mean change method (MID = mean change of participants reporting ‘little better’); (2) the mean difference of change method (MID = [mean change of ‘little better’] − [mean change of ‘unchanged’]); and (3) the ROC curve method (closest point to the upper left corner), dichotomizing between ‘unchanged’ and ‘little better’ and excluding participants who responded ‘little worse’ or ‘much worse’.
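The three estimators can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' code: change scores are oriented so that higher means improvement (as for the OES), only the ‘little better’ and ‘unchanged’ groups are passed in, and the data are invented. The function names are hypothetical.

```python
from statistics import mean


def mid_mean_change(little_better):
    """Method 1: mean change among participants reporting 'little better'."""
    return mean(little_better)


def mid_mean_difference(little_better, unchanged):
    """Method 2: mean change of 'little better' minus that of 'unchanged'."""
    return mean(little_better) - mean(unchanged)


def mid_roc(little_better, unchanged):
    """Method 3: cut-off whose ROC point lies closest to the upper left corner."""
    best_cut, best_dist = None, float("inf")
    for cut in sorted(set(little_better) | set(unchanged)):
        sens = sum(x >= cut for x in little_better) / len(little_better)
        spec = sum(x < cut for x in unchanged) / len(unchanged)
        dist = (1 - sens) ** 2 + (1 - spec) ** 2  # squared distance to (0, 1)
        if dist < best_dist:
            best_cut, best_dist = cut, dist
    return best_cut


# Invented change scores, for illustration only
little_better = [8, 15, 20, 25, 30]
unchanged = [0, 5, 5, 12]
print(mid_mean_change(little_better))                 # about 19.6
print(mid_mean_difference(little_better, unchanged))  # about 14.1
print(mid_roc(little_better, unchanged))              # 15
```

Even on this toy data the three methods return different values, which is why a range of MID values is reported rather than a single number.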
For the PASS, we used the 75th percentile method and the ROC method at the 6-month and 12-month time points. In the 75th percentile method, the PASS was defined as the 25th percentile score for the OES (i.e. 75% of participants who considered themselves well had a score above this cut off) and the 75th percentile score for the QuickDASH (i.e. 75% of the participants who considered themselves well had a score below this cut off). The ROC method was used to find the optimal cut off discriminating between the participants who reported an acceptable state and those who did not.
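The percentile rule can be sketched as follows. Percentile definitions differ between software packages; this sketch uses the "inclusive" interpolation from Python's `statistics.quantiles`, which is an assumption and not necessarily the definition the authors used. The function name and data are hypothetical.

```python
from statistics import quantiles


def pass_percentile(scores_satisfied, higher_is_better):
    """PASS cut off among participants who reported an acceptable state.

    For a 'higher is better' scale (OES) the 25th percentile is returned, so
    that 75% of satisfied participants score above it; for a 'higher is worse'
    scale (QuickDASH) the 75th percentile is returned.
    """
    q1, _median, q3 = quantiles(scores_satisfied, n=4, method="inclusive")
    return q1 if higher_is_better else q3


# Invented scores of satisfied participants, for illustration only
print(pass_percentile([60, 70, 80, 90, 100], higher_is_better=True))  # 70.0
print(pass_percentile([0, 10, 20, 30, 40], higher_is_better=False))   # 30.0
```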
To assess the signal-to-noise ratio, we calculated the standardized response mean (SRM) for both outcomes as the mean score change (between baseline and the 1-year follow-up) divided by the SD of the change. An SRM of 0.2 can be considered to indicate low responsiveness, 0.5 moderate responsiveness and 0.8 large responsiveness.
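As a worked illustration of this ratio, the sketch below computes the SRM from paired baseline and follow-up scores. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify. The function name and data are hypothetical.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean of the paired change scores divided by their sample SD."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Invented OES scores for four participants, for illustration only
baseline = [40, 50, 60, 45]
follow_up = [70, 60, 90, 65]
srm = standardized_response_mean(baseline, follow_up)
print(round(srm, 2))  # 2.35
```

For the QuickDASH, where improvement gives negative change scores, the SRM is negative and its absolute value is the quantity to compare between instruments.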
We recruited 97 participants with clinically diagnosed tennis elbow (Table 1). At baseline, OES and QuickDASH data were available for 93 and 91 participants, respectively (4 and 6 missing due to too many missing items). Seventy-four (76%) participants returned the outcome questionnaires at six weeks, 72 (74%) at three months, 74 (76%) at six months and 75 (77%) at 12 months.
Most (n = 55; 61%) participants did not receive any specific intervention initially; four participants received surgery during follow-up. Smoking was fairly common (27%) among the participants, and most (71%) had received at least one corticosteroid injection before enrolment (range 1 to 6 injections) (Table 1).
Most participants improved during the follow-up (Additional file 1). Both QuickDASH and OES captured the perceived change adequately, but OES change correlated more strongly with the transition anchor than QuickDASH change did (Additional file 1), and accordingly, OES discriminated better (Table 2). Correlations were: OES Total −0.76 (95% CI −0.80 to −0.72); OES Pain −0.75 (95% CI −0.80 to −0.71); OES Function −0.64 (95% CI −0.70 to −0.58); OES Social-psychological −0.70 (95% CI −0.75 to −0.66); and QuickDASH 0.57 (95% CI 0.50 to 0.65). The OES and its subscales also had a better signal-to-noise ratio (standardized response mean) than QuickDASH (Table 3).
For both OES and QuickDASH change scores, the correlations with the transition anchor (GRC) increased at longer follow-up. After three months, the GRC correlated more strongly with the current state than with the change score (Additional file 1). The change scores correlated with the baseline score only for OES at 12 months (Additional file 1).
MID = minimal clinically important difference, ROC = receiver operating characteristics, OES = Oxford Elbow Score, DASH = Disabilities of the Arm, Shoulder and Hand, CI = confidence interval.
In people with tennis elbow, the elbow-specific OES better captured the change in the perceived health than did the upper extremity-specific QuickDASH. The OES better discriminated people who improved from people who reported no change, and the MID and PASS values are therefore more credible for OES than QuickDASH. OES also had superior signal-to-noise ratio, indicating that smaller sample sizes may be used in clinical trials to identify or exclude clinically-relevant differences when OES is used compared to QuickDASH.
The MID reflects the difference in score that informed patients perceive as important (either beneficial or harmful), and that would lead the patient or clinician to consider a change in management. The choice of method affected the MID values, adding a layer of uncertainty to the interpretation of the results. If we determine the MID using the mean change in people reporting being ‘little better’, approximately half of the participants who reported feeling better will be classified as ‘not improved’, which is a relatively high rate of misclassification. ROC analysis optimizes the cut off, so the MID from ROC analysis is more valid for discrimination purposes. Because it is less affected by outliers, the ROC method is also more robust to skewed distributions than the mean change method or the mean difference of change method. However, ROC analysis is affected by the prevalence of people who report improvement: with larger prevalence, the MID values tend to be larger. There is no consensus about the optimal method, but a recent credibility instrument can be used to assess the credibility of the MID estimates.
We propose that differences smaller than the lowest values from our study (16 for OES pain; 10 for OES function; 14 for OES social-psychological; 14 for OES total; and 7 for QuickDASH) be judged as not meaningful in people with tennis elbow. For example, authors of a trial or meta-analysis should conclude that the intervention does not provide clinically important benefits only if the treatment effect confidence intervals are below the smallest provided values. Conversely, differences higher than the highest values (21 for OES pain; 17 for OES function; 28 for OES social-psychological; 20 for OES total; and 9 for QuickDASH) should be considered to indicate relevant benefits. When the difference is within the range of possible MID or PASS values, some uncertainty regarding the conclusions is appropriate.
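The decision rule described above can be sketched as a small function. This is one possible reading of the paragraph, not a procedure defined by the authors: effects are expressed so that a positive value favours the intervention (QuickDASH differences would need their sign flipped), and the function name, argument names and example estimates are hypothetical.

```python
def interpret_effect(estimate, ci_low, ci_high, mid_low, mid_high):
    """Classify a between-group difference against a range of MID values."""
    if ci_high < mid_low:
        # The whole confidence interval lies below the smallest MID estimate
        return "no clinically important benefit"
    if estimate > mid_high:
        # The difference exceeds even the largest MID estimate
        return "relevant benefit"
    return "uncertain"


# OES total score: MID estimates ranged from 14 to 20 points.
# The effect estimates below are invented for illustration.
print(interpret_effect(5, 1, 9, 14, 20))     # no clinically important benefit
print(interpret_effect(24, 15, 33, 14, 20))  # relevant benefit
print(interpret_effect(12, 6, 18, 14, 20))   # uncertain
```

A stricter variant would require the lower confidence limit, rather than the point estimate, to exceed the highest MID before declaring a relevant benefit.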
Our MID estimates correspond reasonably well with previous studies. The developers of the OES found MID values of 9, 18, and 18 points for the pain, function, and social-psychological scales, respectively, in a heterogeneous population undergoing surgery for elbow complaints. One in ten patients in their sample received surgery for tennis elbow. Another study found lower MID values (7 points for OES pain, 6 points for OES function, 12 points for OES social-psychological and 8 points for OES total score) in people with simple elbow dislocation. This demonstrates how the sample and condition can affect the MID estimates. Further, our MID values for QuickDASH also correspond well with previous studies in various elbow and shoulder conditions [7,8,9]. Recently, a consensus paper recommended the PRTEE for the core outcome set for the disability domain in people with tennis elbow. Although the recommendation did not include the OES or QuickDASH, both scored well in EMPRO scores, supporting our findings.
The Patient-Acceptable Symptom State is a cut off for a PROM beyond which people consider themselves well. In our sample, the cut off was approximately 10–25 points (%) worse than the best possible score. PASS values can be used to calculate the proportion of people achieving a satisfactory state in responder analyses if such a comparison is considered appropriate to interpret the effect (e.g. NNT calculation). PASS may be a more valid cut off than a change of at least the MID because achieving PASS is not as dependent on the baseline symptom burden [4, 15]: an improvement of one MID can be largely meaningless in patients with a high baseline symptom burden, whereas a patient with a low baseline symptom burden can reach complete recovery with the same improvement. Furthermore, due to the risks and high costs related to surgery, reaching PASS may be a more appropriate treatment objective in surgical trials than the MID, as the smallest improvement patients can perceive may not be a sufficient objective for surgery.
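A responder analysis of this kind can be sketched as follows. The example uses 80 points, the lower end of the PASS range reported for the OES total score, as the cut off. The trial scores are invented, and the function names are hypothetical.

```python
def reaches_pass(score, cut_off, higher_is_better=True):
    """True if a score is at or beyond the PASS cut off."""
    return score >= cut_off if higher_is_better else score <= cut_off


def nnt_from_pass(treatment_scores, control_scores, cut_off, higher_is_better=True):
    """Number needed to treat for one additional participant to reach PASS."""
    p_treatment = sum(
        reaches_pass(s, cut_off, higher_is_better) for s in treatment_scores
    ) / len(treatment_scores)
    p_control = sum(
        reaches_pass(s, cut_off, higher_is_better) for s in control_scores
    ) / len(control_scores)
    difference = p_treatment - p_control
    if difference == 0:
        return None  # NNT is undefined when the proportions are equal
    return 1 / difference


# Invented OES total scores at follow-up, for illustration only
treatment = [85, 90, 70, 95, 82]  # 4 of 5 reach the cut off of 80
control = [60, 85, 75, 90, 70]    # 2 of 5 reach the cut off of 80
print(nnt_from_pass(treatment, control, cut_off=80))  # about 2.5
```

A negative result means that fewer participants reached PASS in the treatment group than in the control group.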
Recall bias affected the patient global improvement assessment beyond three months. When participants were asked to report whether they were better compared with baseline, their responses were more closely associated with their current status than with how their condition had changed. This is typical of musculoskeletal conditions. Clinicians and researchers must consider this when interpreting global transition measures. Another limitation was the follow-up rate of just under 80% at every time point, which introduces some uncertainty to the estimates as it is not plausible that the data were missing completely at random. Also, the results are applicable only to the study’s instruments. Despite the similarities, the results may not generalise to the full DASH, and a recent Delphi consensus study did not include the OES or QuickDASH in a core outcome set for this population. The strength of our study is the high correlation between the anchor and target instrument, providing credible MID estimates that discriminate well between people who experienced meaningful improvement and those who did not.
This study determined credible MID and PASS values that can be used to interpret trial results. Comparison of the longitudinal validity of OES and QuickDASH in tennis elbow patients suggests that although the performance of QuickDASH is acceptable, OES has a better signal-to-noise ratio and better longitudinal validity. The OES may be a better option as an outcome measure in clinical trials.
The datasets generated and/or analysed during the current study are not publicly available due to data privacy regulations, but are available from the corresponding author on reasonable request.
1. Beaton DE, Wright JG, Katz JN, Amadio P, Bombardier C, Cole D, et al. Development of the QuickDASH: Comparison of three item-reduction approaches. J Bone Jt Surg - Ser A. 2005 May;87(5):1038–46.
2. Dawson J, Doll H, Boller I, Fitzpatrick R, Little C, Rees J, et al. The development and validation of a patient-reported questionnaire to assess outcomes of elbow surgery. J bone Jt Surg Br Vol [Internet]. 2008 Apr 1 [cited 2018 May 29];90(4):466–73.
3. Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10(4):407–15.
4. Tubach F, Dougados M, Falissard B, Baron G, Logeart I, Ravaud P. Feeling good rather than feeling better matters more to patients. Arthritis Rheum [Internet]. 2006 Aug 15;55(4):526–30.
5. Schünemann HJ, Oxman AD, Vist GE, Higgins JPT, Deeks JJ, Glasziou P GG. Chapter 12: Interpreting results and drawing conclusions. [Internet]. Version 5. Higgins JPT, Green S, editor. The Cochrane Collaboration; 2011.
6. Dawson J, Doll H, Boller I, Fitzpatrick R, Little C, Rees J, et al. Comparative responsiveness and minimal change for the Oxford Elbow Score following surgery. Qual Life Res [Internet]. 2008 Dec 29 [cited 2018 May 29];17(10):1257–67.
7. Franchignoni F, Vercelli S, Giordano A, Sartorio F, Bravini E, Ferriero G. Minimal clinically important difference of the disabilities of the arm, shoulder and hand outcome measure (DASH) and its shortened version (quickDASH). J Orthop Sports Phys Ther. 2014;44(1):30–9.
8. Hao Q, Devji T, Zeraatkar D, Wang Y, Qasim A, Siemieniuk RAC, et al. Minimal important differences for improvement in shoulder condition patient-reported outcomes: a systematic review to inform a BMJ Rapid Recommendation. BMJ Open. 2019 Feb 20;9(2):e028777.
9. Iordens GIT, Den Hartog D, Tuinebreijer WE, Eygendaal D, Schep NWL, Verhofstad MHJ, et al. Minimal important change and other measurement properties of the Oxford Elbow score and the Quick Disabilities of the arm, shoulder, and Hand in patients with a simple elbow dislocation; validation study alongside the multicenter FuncSiE trial. PLoS One. 2017;12(9):e0182557.
10. Turner D, Schünemann HJ, Griffith LE, Beaton DE, Griffiths AM, Critch JN, et al. Using the entire cohort in the receiver operating characteristic analysis maximizes precision of the minimal important difference. J Clin Epidemiol [Internet]. 2009 Apr [cited 2020 Apr 18];62(4):374–9.
11. Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol. 2000;53(5):459–68.
12. Terluin B, Eekhout I, Terwee CB. The anchor-based minimal important change, based on receiver operating characteristic analysis or predictive modeling, may need to be adjusted for the proportion of improved patients. J Clin Epidemiol [Internet]. 2017;83:90–100.
13. Devji T, Carrasco-Labra A, Qasim A, Phillips M, Johnston BC, Devasenapathy N, et al. Evaluating the credibility of anchor based estimates of minimal important differences for patient reported outcomes: instrument development and reliability study. BMJ [Internet]. 2020 Jun 4 [cited 2020 Jun 10];369:m1714.
14. Bateman M, Evans JP, Vuvan V, Jones V, Watts AC, Phadnis J, et al. Development of a core outcome set for lateral elbow tendinopathy (COS-LET) using best available evidence and an international consensus process. Br J Sports Med. 2022;56(12):657–66.
15. Dworkin RH, Turk DC, Mcdermott MP, Peirce-Sandner S, Burke LB, Cowan P, et al. Interpreting the clinical importance of group differences in chronic pain clinical trials: IMMPACT recommendations. Pain [Internet]. 2009 Dec 5;146(3):238–44.
16. Kamper SJ, Ostelo RWJG, Knol DL, Maher CG, de Vet HCW, Hancock MJ. Global Perceived Effect scales provided reliable assessments of health transition in people with musculoskeletal disorders, but ratings are strongly influenced by current status. J Clin Epidemiol [Internet]. 2010;63(7):760–766.
FINITE investigators who recruited and treated the participants but are not authors of this publication were: Toni Luokkala, Olli-Pekka Kangasniemi, Imke Höfling, Olli Leppänen, Matti Juntunen and Markus Pääkkönen.
Toni Luokkala2 Olli-Pekka Kangasniemi2, Imke Höfling4 Olli Leppänen1, Matti Juntunen5 and Markus Pääkkönen6.
4Department of Surgery, Kainuu Central Hospital, Kajaani, Finland
5Department of Orthopedics, Traumatology and Hand Surgery, Kuopio University Hospital, Kuopio, Finland
6Turku University Central Hospital and University of Turku, Turku, Finland
Sigrid Juselius Foundation and the state funding for university level health research (Helsinki University Hospital).
Open access funding provided by Tampere University including Tampere University Hospital, Tampere University of Applied Sciences (TUNI).
Ethics approval and consent to participate
Helsinki University hospital institutional review board approved the study protocol before the commencement of the study on 12th March 2014, amendment 18th March 2015 (65/13/03/02/2014). Informed consent was obtained from all subjects. All methods were performed in accordance with the relevant guidelines and regulations.
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Karjalainen, T., Lähdeoja, T., Salmela, M. et al. Minimal important difference, patient acceptable symptom state and longitudinal validity of oxford elbow score and the quickDASH in patients with tennis elbow. BMC Med Res Methodol 23, 158 (2023). https://doi.org/10.1186/s12874-023-01934-4
- Oxford elbow score
- Tennis elbow
- Minimal clinically important change
- Patient accepted symptom state