Risk of bias of randomized controlled trials published in orthopaedic journals
BMC Medical Research Methodology volume 13, Article number: 76 (2013)
The purpose of this study was to assess the methodological quality of orthopaedics-related randomized controlled trials (RCTs) published from January 2006 to December 2010 in the top orthopaedic journals, ranked by impact scores from the Thomson ISI citation reports (2010).
Journals included American Journal of Sports Medicine; Journal of Orthopaedic Research; Journal of Bone and Joint Surgery, American; Spine Journal; and Osteoarthritis and Cartilage. Each RCT was assessed on ten criteria (randomization method, allocation sequence concealment, participant blinding, outcome assessor blinding, outcome measurement, interventionist training, withdrawals, intent to treat analyses, clustering, and baseline characteristics), each of which has empirical evidence of biasing treatment effect estimates when not performed properly.
A total of 232 RCTs met our inclusion criteria. The proportion of RCTs among all articles published in these journals fell from 6% in 2006 to 4% in 2010. Forty-nine percent of the criteria were fulfilled across these journals, with 42% of the criteria not being amenable to assessment due to inadequate reporting. The results of our regression suggested that a more recent publication year was associated with more fulfilled criteria, although the association fell just short of the 0.05 significance threshold (β = 0.171; CI = −0.00 to 0.342; p = 0.051).
In summary, very few studies met all ten criteria. Thus, many of these studies likely have biased estimates of treatment effects. In addition, these journals had poor reporting of important methodological aspects.
Randomized controlled trials (RCTs) provide strong evidence for the efficacy of healthcare interventions. Carefully planned and well-executed RCTs give us the best estimates of treatment effect and can thus guide clinical decision making [2, 3], whereas trials that lack methodological rigor can over- or underestimate treatment effect sizes due to bias [4–6]. Hence, efforts have been undertaken toward improving the design and reporting of RCTs [1, 6–11].
While RCTs represent a small proportion of original research published in surgical journals [12, 13], they still represent an important component of the literature and a high level of evidence. However, the literature appears to indicate that surgical RCTs lag behind the general literature in terms of methodological quality. Methodological quality mainly refers to the formal aspects of study design, performance and analysis. For example, one study found that only 33% of RCTs published in surgical journals, compared with 75% of those published in general medicine journals, were of high quality. RCTs of orthopaedic surgery appear to be no better, with more than half of the RCTs in one study lacking proper concealment of randomization, blinding of outcome assessors and reporting of reasons for excluding patients. In another study of the quality of RCTs in pediatric orthopaedics, the authors found that only 19% of the included articles met their criteria for high quality. In contrast, RCTs published in general internal medicine journals appear to be generally of higher quality. For example, Moher et al. included 211 reports of RCTs from the top four English-language internal medicine journals and found that more than 60% of RCTs were of high quality. Therefore, RCTs in orthopaedic surgery appear to be in need of improvement.
It is important to note the difference between methodological quality and reporting quality. Our study is designed to evaluate the methodological conduct of studies; however, poor reporting can inherently make this task difficult. While it is imperative to distinguish between reporting and methodology, it can be tempting to draw the same conclusions about both. Doing so would ultimately hamper a true risk of bias assessment and must be avoided.
To our knowledge, there has not been an assessment of the methodological quality, or risk of bias, of RCTs across the top journals in orthopaedics. Nor has there been an effort to characterize the proportions of published papers that represent the highest levels of evidence. The purpose of the present study was to assess the risk of bias of all randomized trials published in the last 5 years of the top five journals in orthopaedics.
We determined the top five journals in orthopaedic surgery by their impact scores from the Thomson ISI citation reports. These journals included the American Journal of Sports Medicine (AJSM), Journal of Orthopaedic Research (JOR), Journal of Bone and Joint Surgery, American (JBJS Am), Spine Journal (SJ) and Osteoarthritis and Cartilage (OC). These journals were hand searched on each journal’s website and reports were assessed for inclusion by one individual (LC). Decisions regarding inclusion of potential studies were based on the following criteria: (1) consisted solely of human subjects, (2) used random subject allocation, (3) had an experimental design that included both treatment and control groups comparing an orthopaedic intervention, and (4) had a publication date between January 2006 and December 2010 in the journals mentioned above. These criteria were used as a measure of methodological quality based on the Cochrane Collaboration’s widely accepted risk of bias tool as well as the risk of bias assessment recommendations in Modern Epidemiology, 3rd Edition. It is important to note that there was no formal protocol for this assessment.
The investigators separately and independently extracted data from each study using preformatted Excel (Microsoft, Redmond, WA) spreadsheets. Extracted data included journal name, journal impact factor, and publication year. All included studies were assessed on ten criteria related to risk of bias (Table 1). The ten criteria required sufficient reporting regarding randomization method, allocation sequence concealment, participant blinding, outcome assessor blinding, outcome measurement, interventionist training, withdrawals, intent to treat analyses, clustering, and baseline characteristics. For each of these criteria the RCT was judged as fulfilling the criterion (indicated as a “Yes”), not fulfilling it (indicated as a “No”) or having insufficient information to determine fulfillment (“Not Reported”) (see Figure 1). To be rated a “Yes”, the paper must have included a complete description of the process and outcome of the criterion. If investigators felt that there was too little information, or that they would be unable to replicate the process because of unclear reporting, the article was designated as “Not Reported” for that criterion. A complete lack of reporting or an erroneous method (e.g., randomization by patient number or date of birth) was marked as “No.” Disagreements were documented and resolved by discussion between the data collectors and the primary investigator.
Statistical analyses included calculating the mean number of criteria that were met (“Yes”), not met (“No”), or of unknown fulfillment (“Not Reported”) within and across all journals. First, we assessed the distribution of Yes/No/Not Reported ratings for each article. Then we calculated the mean proportion of fulfilled items for all the articles from the same journal, stratified by criterion (Table 2). We then compared these mean proportions across journals using an analysis of variance (ANOVA) to test for differences in reporting quality. Note that the more favorable distribution is one with a greater proportion of fulfilled items, indicating that the journal has met more criteria for methodological quality. Linear regression was also applied, with the total number of fulfilled items per trial as the outcome variable and journal impact factor and year of publication as predictor variables. We also performed a sub-analysis of the proportion of met criteria categorized by geographic location, anatomical region, study size and orthopaedic specialty (see Tables 3, 4, 5 and 6). The significance level for all statistical tests was set at α = 0.05.
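The three-level rating scheme described above can be illustrated with a short sketch. This is only one possible way to tabulate the ratings and is not the spreadsheet workflow the investigators used; the names `Rating`, `CRITERIA` and `summarize_trial`, and the example trial, are hypothetical.

```python
from collections import Counter
from enum import Enum


class Rating(Enum):
    """Possible judgments for a single criterion (labels taken from the text)."""
    YES = "Yes"
    NO = "No"
    NOT_REPORTED = "Not Reported"


# The ten criteria named in the text; the identifiers themselves are assumptions.
CRITERIA = (
    "randomization_method",
    "allocation_sequence_concealment",
    "participant_blinding",
    "outcome_assessor_blinding",
    "outcome_measurement",
    "interventionist_training",
    "withdrawals",
    "intent_to_treat_analyses",
    "clustering",
    "baseline_characteristics",
)


def summarize_trial(ratings):
    """Return the proportion of the ten criteria that received each rating."""
    missing = set(CRITERIA) - set(ratings)
    if missing:
        raise ValueError(f"unrated criteria: {sorted(missing)}")
    counts = Counter(ratings[name] for name in CRITERIA)
    return {rating: counts[rating] / len(CRITERIA) for rating in Rating}


# Hypothetical trial: 5 criteria fulfilled, 1 not fulfilled, 4 not reported.
example_ratings = dict(zip(
    CRITERIA,
    [Rating.YES] * 5 + [Rating.NO] + [Rating.NOT_REPORTED] * 4,
))
summary = summarize_trial(example_ratings)
for rating, proportion in summary.items():
    print(f"{rating.value}: {proportion:.0%}")
```

Keeping “Not Reported” as its own category, rather than folding it into “No”, preserves the distinction between poor reporting and poor methodology that the assessment depends on.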
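As a rough illustration of the ANOVA step, the following sketch computes a one-way F statistic from per-trial proportions of fulfilled criteria grouped by journal. The data and journal labels are invented for illustration, the function name `one_way_anova_f` is hypothetical, and the grouping of per-trial proportions by journal is an assumption about how the comparison was set up. Converting F to a p-value requires the F distribution, which the Python standard library does not provide, so that step is omitted.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label to a list of observations.
    """
    all_values = [value for values in groups.values() for value in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (value - mean(values)) ** 2
        for values in groups.values()
        for value in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented proportions of fulfilled criteria per trial, for three made-up journals.
fulfilled_by_journal = {
    "Journal A": [0.4, 0.5, 0.6],
    "Journal B": [0.5, 0.6, 0.7],
    "Journal C": [0.2, 0.3, 0.4],
}
f_stat, df_between, df_within = one_way_anova_f(fulfilled_by_journal)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A larger F indicates that the differences between journal means are large relative to the spread of trials within each journal.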
We identified a total of 261 RCTs, of which 232 met our inclusion criteria. The most common reason for exclusion was the lack of human participants in the RCTs (N = 29). JBJS Am accounted for the largest number of included RCTs (N = 106) followed by AJSM (N = 74), OC (N = 36), SJ (N = 16) and JOR (N = 7). A total of 49% of the criteria were fulfilled across these journals, with 42% of the criteria not being amenable to assessment due to inadequate reporting (Table 7). The RCTs from AJSM had the highest number of fulfilled criteria, and so were at the lowest risk of bias, while RCTs from SJ and JBJS Am had the highest number of unfulfilled criteria, and JOR had the largest number of criteria of unknown fulfillment (Table 7). Fewer than 1% of the included RCTs fulfilled all ten methodological criteria. Results of the ANOVA test revealed that the difference in the proportion of items fulfilled (“Yes”) between journals was statistically significant (p = 0.034) at the alpha = 0.05 level.
OC had the largest proportion of “yes” ratings, or adequate fulfillment, for four of the ten criteria (proper analysis, description of withdrawals/ compliance, subject blinding, outcome assessor blinding), JBJS Am was the leader for three criteria (randomization process, allocation concealment, accounting for clustering), AJSM led for two criteria (baseline characteristics, intervention administration) and JOR led in one category (blinded outcome assessment). Table 2 contains the complete list of all methodological quality criteria ratings within and across journals.
We also found that the total number of RCTs published increased slightly from 54 in 2006 to 61 in 2008 but fell to 57 and 46 in 2009 and 2010, respectively. In addition, the proportion of RCTs per total published articles fell from 6% in 2006 to 4% in 2010. The results of our regression suggested that a later year of publication was associated with more fulfilled criteria, although this association was of borderline significance at the 0.05 level (β = 0.171; CI = −0.00 to 0.342; p = 0.051), whereas the impact factor was not a significant predictor (β = 0.307; CI = −0.221 to 0.836; p = 0.253). Figure 2 contains the ratings across all criteria by year of publication.
We found that only a very small proportion of the analyzed RCTs met all ten methodological quality criteria, indicating that many of these studies are at serious risk of bias, although these trials may be improving with time (Figure 2). In addition, we found that many RCTs did not report sufficient information to judge whether they met many of the included criteria. Overall, the methodological and reporting quality of orthopaedic RCTs has significant room for improvement.
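The relationship between the reported coefficients, confidence intervals and p-values can be checked with a small back-calculation. The sketch below assumes the intervals are symmetric 95% intervals and uses a normal approximation; the article does not state how the intervals were computed, so the results are only approximate, and the function name `p_from_ci` is hypothetical.

```python
from statistics import NormalDist


def p_from_ci(beta, lower, upper, level=0.95):
    """Approximate (standard error, z, two-sided p) from an estimate and its CI.

    Assumes a symmetric interval of the form beta +/- z_crit * SE.
    """
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    standard_error = (upper - lower) / (2 * z_crit)
    z = beta / standard_error
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    return standard_error, z, p_value


# Values as reported in the text (the lower bound for year is printed as -0.00).
se_year, z_year, p_year = p_from_ci(0.171, 0.0, 0.342)
se_if, z_if, p_if = p_from_ci(0.307, -0.221, 0.836)

print(f"year of publication: SE ~ {se_year:.3f}, p ~ {p_year:.3f}")
print(f"impact factor:       SE ~ {se_if:.3f}, p ~ {p_if:.3f}")
```

Under these assumptions the year-of-publication interval has its lower bound at essentially zero, so the back-calculated p-value sits at about 0.05, in line with the reported p = 0.051; the impact factor interval spans zero comfortably, in line with the reported p = 0.253.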
The poor methodological quality of orthopaedic RCTs has been shown in previous literature. Dulai et al. reported that despite increasing numbers of RCTs, only 19% of the pediatric orthopaedic trials evaluated met the standard for methodological acceptability. They found in particular that there was inadequate rigor and reporting of randomization methods, use of inappropriate or poorly described outcome measures, inadequate description of inclusion and exclusion criteria, and inappropriate statistical analysis. In another study, Bhandari and colleagues assessed 72 RCTs from JBJS Am published from January 1988 to the end of 2000 and found that while the number of RCTs increased over the years, their mean overall score was only 68.1% of the total Detsky quality score. Similar to our study, they found that more than half of the RCTs were limited by lack of concealed allocation, lack of blinding of outcome assessors, and failure to report reasons for excluding patients. Furthermore, Herman and colleagues found that only 35% of the RCTs in eight leading orthopaedic journals used an intention-to-treat analysis, which was similar to our finding of 41%. Also, Karanicolas and colleagues found that less than 10% of 171 included orthopaedic trauma RCTs had blinded outcome assessors. This is much lower than our finding of nearly 51%; the difference is most likely due to the broader nature of the trials that we included, which went beyond trauma to include any orthopaedic RCT from a select list of orthopaedic journals.
Beyond methodological deficits in these trials, some evidence suggests, similar to our findings, that RCTs in orthopaedic surgery fail to report much important information [16, 18]. That is, to adequately assess the quality of any methodological component of an RCT, sufficient information must be present in the published report to make that assessment, and it appears many orthopaedic RCTs fall short in this regard. For example, the most recent of these investigations of reporting quality applied the Consolidated Standards of Reporting Trials (CONSORT) statement to a sample of RCTs and the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement to a sample of case–control, cohort and cross-sectional studies, and used a statistical questionnaire to assess all included studies. They found that for the 100 included studies only 58% of the CONSORT items were met on average across the seven included journals. We found inadequate reporting on average for approximately 42% of the items on which the RCTs were assessed. The slight difference in findings between these studies can likely be accounted for by the use of different checklist items and the inclusion of different selections of journals. Either way, research in the area indicates serious inadequacies in reporting in orthopaedic RCTs. This trend of poor reporting has been seen in other fields as well, including internal medicine and general surgery.
Despite the deficits, the RCTs we included did have some common strengths. In general, the intervention and primary outcome were well described in most papers. Also, the proportion of methodological quality items fulfilled increased with publication year, which is consistent with trends in internal medicine journals. This is promising and may suggest that clinical trialists, editors and reviewers are putting more emphasis on proper methodology.
Our study has several strengths. First, we conducted a comprehensive hand search of the tables of contents of the top orthopaedic journals over a recent span of 5 years. Thus, the findings presented here for the included RCTs likely represent the most read and cited RCTs in the orthopaedic community and therefore give a good indication of the quality of the RCTs that might be influencing clinical decision making. If this assumption is true, the trials that are the most influential are at a high risk of having biased estimates of treatment effect. However, due to the limited selection of journals included, it is possible that higher quality and more influential RCTs are being published in other journals. For example, we found that RCTs make up only a very small proportion of all articles published in these five journals and therefore may not be influencing decision making to any high degree. In order to ensure a proper meta-analysis, our paper is in accordance with the PRISMA Statement and meets all criteria. Additionally, we included methodological quality criteria that have been empirically shown to bias estimates of treatment effect when not properly implemented [3, 4, 23–34]. All included criteria have empirical evidence that not using them in RCTs, or not assessing them in systematic reviews, results in bias in the estimates of treatment effect or in misclassification of trials as high or low quality. However, due to the lack of reporting in the included studies, we could not directly test the influence of specific inadequacies in methodology on effect estimates. Therefore, we cannot be certain that the flaws in methodology in these orthopaedic studies actually bias the estimates of treatment effect. We can only extrapolate from the extensive literature that has shown this to be true for RCTs in other clinical areas [3, 4, 23–34].
It is important to note that just because a study did not report a certain methodology does not imply that it was not performed. For example, in this study, subject allocation and cluster analysis had two of the lowest fulfillment proportions. We acknowledge that descriptive reporting of these topics may not have been emphasized despite proper methodology and that poor reporting may not necessarily be a proxy for poor methodology. Thus, this paper fails to account for these underreporting deficiencies and may underestimate the quality of methodology in this literature. In any case, to adequately assess the quality of a reported study, the relevant information must be present so that the reader can assess the potential risk of bias in the estimates of effect and determine the importance of the RCT to clinical decision-making.
In common with other authors, we can make some recommendations on how to improve this literature. First, we suggest that investigators include on their team an epidemiologist, clinical epidemiologist, clinical trial methodologist or someone with experience in conducting RCTs, as well as a statistician or biostatistician, to ensure proper planning and implementation of the trial. There is evidence that including such individuals on the investigative team improves the quality of the resultant RCT. In addition, we suggest that investigators and authors refer to the revised CONSORT statement and the related explanatory paper to guide them on the important information to include when reporting their RCT. The CONSORT statement has been shown to improve the quality of reporting in these studies. In addition to these documents, there are other reporting guidance documents on the EQUATOR Network website that may be of use. Finally, we suggest that journal editors enforce the use of the CONSORT statement so that published reports are completely reported and have the best chance of being interpreted properly for clinical decision making.
There are some obvious flaws in the methodology and reporting of RCTs in the orthopaedic literature. These flaws may cause seriously biased estimates of effect in those studies. We expect that the types of initiatives mentioned above will improve these important types of clinical research, which are integral to improving the empirical base for orthopaedic procedures. Finally, a rating of level I evidence does not imply that a study is without methodological flaws, and such flaws can bias the reported effect estimates.
Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, Gotzsche PC, Lang T: The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med. 2001, 134: 663-694. 10.7326/0003-4819-134-8-200104170-00012.
Mulrow CD, Cook DJ, Davidoff F: Systematic reviews: critical links in the great chain of evidence. Systematic reviews: synthesis of best evidence for health-care decisions. Edited by: Mulrow C, Cook D. 1998, Philadelphia, PA: American College of Physicians
Sackett DL, Richardson WS, Rosenberg W, Haynes B: Evidence-based medicine: how to practice and teach EBM. 1998, New York, NY: Churchill Livingstone
Schulz KF, Chalmers I, Hayes RJ, Altman DG: Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995, 273: 408-412. 10.1001/jama.1995.03520290060030.
Moher D, Jadad AR, Tugwell P: Assessing the quality of randomized controlled trials: current issues and future directions. Int J Technol Assess Health Care. 1996, 12: 195-208. 10.1017/S0266462300009570.
Jadad AR, Moore A, Carroll D, Jenkinson C, Reynolds DJ, Gavaghan DJ, McQuay HJ: Assessing the quality of reports of randomized clinical trials: is blinding necessary?. Control Clin Trials. 1996, 17: 1-12. 10.1016/0197-2456(95)00134-4.
Begg CB, Cho MK, Eastwood S, Horton R, Moher D, Olkin I, Pitkin R, Rennie D, Schulz KF, Simel D, Stroup DF: Improving the quality of reporting of randomized controlled trials: the CONSORT statement. JAMA. 1996, 276: 637-639.
Moher D, Schulz KF, Altman D, for the CONSORT Group (Consolidated Standards of Reporting Trials): The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. JAMA. 2001, 285: 1987-1991. 10.1001/jama.285.15.1987.
The CONSORT group: [http://www.consort-statement.org] (accessed 1 Nov 2004)
Ioannidis JPA, Evans SJW, Gøtzsche PC, O’Neill RT, Altman DG, Schulz K, Moher D, for the CONSORT Group: Better reporting of harms in randomized trials: an extension of the CONSORT statement. Ann Int Med. 2004, 141: 781-788. 10.7326/0003-4819-141-10-200411160-00009.
The equator network: [http://www.equator-network.org/]. Accessed 18 April 2012
Dulai SK, Slobogean BL, Beauchamp RD, Mulpuri K: A quality assessment of randomized clinical trials in pediatric orthopaedics. J Pediatr Orthop. 2007, 27 (5): 573-581. 10.1097/bpo.0b013e3180621f3e.
Bhandari M, Richards RR, Sprague S, Schemitsch EH: The quality of reporting of randomized trials in the journal of bone and joint surgery from 1988 through 2000. J Bone Joint Surg Am. 2002, 84A (3): 388-396.
Herman A, Boster IB, Tenenbaum S, Chechick A: Intention-to-treat analysis and accounting for missing data in orthopaedic randomized clinical trials. J Bone Joint Surg Am. 2009, 91: 2137-2143. 10.2106/JBJS.H.01481.
Karanicolas PJ, Bhandari M, Taromi B, Akl EA, Bassler D, Alonso-Coello P, Rigau D, Bryant D, Smith SE, Walter SD, Guyatt GH: Blinding of outcomes in trials of orthopaedic trauma: an opportunity to enhance the validity of clinical trials. J Bone Joint Surg Am. 2008, 90: 1026-1033. 10.2106/JBJS.G.00963.
Chan S, Bhandari M: The quality of reporting of orthopaedic randomized trials with use of a checklist for non-pharmacological therapies. J Bone Joint Surg Am. 2007, 89: 1970-1978. 10.2106/JBJS.F.01591.
Moher D, Jones A, Lepage L, for the CONSORT Group: Use of the CONSORT statement and quality of reports of randomized trials: a comparative before-and-after evaluation. JAMA. 2001, 285 (15): 1992-1995. 10.1001/jama.285.15.1992.
Montane E, Vallano A, Vidal X, Aguilera C, Laporte J: Reporting randomized clinical trials of analgesics after traumatic or orthopaedic surgery is inadequate: a systematic review. BMC Clin Pharmacol. 2010, 10: 2-
Parsons NR, Hiskens R, Price CL, Achten J, Costa ML: A systematic review of the quality of research reporting in general orthopaedic journals. J Bone Joint Surg Br. 2011, 93B: 1154-1159.
Schulz KF, Altman DG, Moher D: CONSORT 2010 statement: updated guidelines for reporting parallel group randomized trials. Ann Intern Med. 2010, 152: 726-732. 10.7326/0003-4819-152-11-201006010-00232.
von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP, for the STROBE Initiative: The strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007, 370: 1453-1457. 10.1016/S0140-6736(07)61602-X.
Balasubramanian SP, Wiener M, Alshameeri Z, Tiruvoipati R, Elbourne D, Reed MW: Standards of reporting of randomized controlled trials in general surgery: can we do better?. Ann Surg. 2006, 244 (5): 663-667. 10.1097/01.sla.0000217640.11224.05.
Juni P, Witschi A, Bloch R, Egger M: The hazards of scoring the quality of clinical trials for meta-analysis. JAMA. 1999, 282 (11): 1054-1060. 10.1001/jama.282.11.1054.
Juni P, Tallon D, Egger M: Proceedings of the 3rd symposium on systematic reviews: beyond the basics. St. Catherine’s College. “Garbage in – garbage out?” Assessment of the quality of controlled trials in meta-analyses published in leading journals. 2000, Oxford: Centre for Statistics in Medicine, 19-
Kjaergard LI, Villumsen J, Gluud C: Reported methodologic quality and discrepancies between large and small randomized trials in meta-analyses. Ann Int Med. 2001, 135 (11): 982-989. 10.7326/0003-4819-135-11-200112040-00010.
Verhagen AP, de Bie R, Lenssen AF, de Vet HC, Kessels AG, Boers M, van den Brandt PA: Impact of quality items on study outcome. Int J Technol Assess Health Care. 2000, 16 (4): 1136-1146. 10.1017/S0266462300103174.
Ioannidis JP, Polycarpou A, Ntais C, Pavlidis N: Randomised trials comparing chemotherapy regimens for advanced non-small cell lung cancer: biases and evolution over time. Eur J Cancer. 2003, 39: 2278-2287. 10.1016/S0959-8049(03)00571-9.
Chalmers TC, Celano P, Sacks HS, Smith H: Bias in treatment assignment in controlled clinical trials. N Engl J Med. 1983, 309: 1358-1361. 10.1056/NEJM198312013092204.
Kunz R, Oxman AD: The unpredictability paradox: review of empirical comparisons of randomised and non-randomised clinical trials. BMJ. 1998, 317 (7167): 1185-1190. 10.1136/bmj.317.7167.1185.
Ioannidis JP, Haidich AB, Pappa M, Pantazis N, Kokori SI, Tektonidou MG, Contopoulos-Ioannidis DG, Lau J: Comparison of evidence of treatment effects in randomized and nonrandomized studies. JAMA. 2001, 286 (7): 821-830. 10.1001/jama.286.7.821.
Karassa FB, Tatsioni A, Ioannidis JP: Design, quality, and bias in randomized controlled trials of systemic lupus erythematosus. J Rheumatol. 2003, 30 (5): 979-984.
Montori VM, Guyatt GH: Intention-to-treat principle. CMAJ. 2001, 165 (10): 1339-1341.
Turk DC, Rudy TE: Neglected factors in chronic pain treatment outcome studies—referral patterns, failure to enter treatment, and attrition. Pain. 1990, 43: 7-25. 10.1016/0304-3959(90)90046-G.
Turk DC, Rudy TE, Sorkin BA: Neglected factors in chronic pain treatment outcome studies: determination of success. Pain. 1993, 53: 3-16. 10.1016/0304-3959(93)90049-U.
Huwiler-Muntener K, Juni P, Junker C, Egger M: Quality of reporting as a measure of methodology quality. JAMA. 2002, 287 (21): 2801-2804. 10.1001/jama.287.21.2801.
Moher D, Hopewell S, Schulz KF, Montori V, Gotzsche PC, Devereaux PJ, Elbourne D, Egger M, Altman DG: CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomized trials. BMJ. 2010, 340: c869. 10.1136/bmj.c869.
Plint AC, Moher D, Morrison A, Schulz K, Altman DG, Hill C, Gaboury I: Does the CONSORT checklist improve the quality of reports of randomised controlled trials? A systematic review. Med J Aust. 2006, 185 (5): 263-267.
McCulloch P, Altman DG, Campbell WB, Flum DR, Glasziou P, Marshall JC, Nicholl J, for the Balliol Collaboration: No surgical innovation without evaluation: the IDEAL recommendations. Lancet. 2009, 374: 1105-1112. 10.1016/S0140-6736(09)61116-8.
Poolman RW, Struijs PA, Krips R, Sierevelt IN, Lutz KH, Bhandari M: Does a “level I evidence” rating imply high quality of reporting in orthopaedic randomised controlled trials?. BMC Med Res Methodol. 2006, 6: 44. 10.1186/1471-2288-6-44.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/13/76/prepub
We would like to acknowledge Tom Cichonski for his help in manuscript editing, table formatting and technical assistance in this submission.
The authors declare that they have no competing interests.
JG designed the study, performed statistical analysis, and revised the manuscript. LC carried out the data collection, participated in the statistical analysis, drafted the manuscript and designed the figures/tables. Both authors read and approved the final manuscript.