
Table 2 Key findings and authors’ conclusions of inconsistency by main measure of comparison between protocols or registrations and full reports in the included studies

From: A systematic review of comparisons between protocols or registrations and full reports in primary biomedical research

First author, year Key findings of inconsistent reporting Authors’ conclusions
Outcome reporting (n = 17)
Chan, 2004 (from CMAJ) [19] 88% (42/48) and 62% (16/26) trials had at least one unreported efficacy and harm outcome respectively.
96% (46/48) and 81% (21/26) trials had incompletely reported efficacy and harm outcome respectively.
“Selective reporting of outcomes frequently occurs in publications of high-quality government-funded trials.”
Chan, 2004 (from JAMA) [7] 50% (50/99) and 65% (47/72) trials had at least one incompletely reported efficacy and harm outcome respectively.
62% (51/82) trials had at least one primary outcome changed, omitted, or introduced.
“The reporting of trial outcomes in journals is frequently inadequate to provide sufficient data for interpretation and meta-analysis, is biased to favor statistical significance, and is inconsistent with primary outcomes specified in trial protocols. These deficiencies in outcome reporting pose a threat to the reliability of the randomized trial literature.”
Hannink, 2013 [21] 49% (75/152) showed some discrepancies in outcomes, most related to introducing or omitting a primary outcome.
28% (21/75) of these discrepancies favored statistically significant results.
“Comparison of the primary outcomes of surgical RCTs registered with their subsequent publication indicated that selective outcome reporting is (highly) prevalent and appears to be more common in surgical trials than in general medical trials.”
Hartung, 2014 [22] 80% (88/110) trials reported the number of secondary outcome measures inconsistently.
15% (16/110) reported definition of primary outcome measures inconsistently; 20% (22/110) reported results of primary outcome measures inconsistently.
35% (38/110) reported the number of participants with a serious adverse event (SAE) inconsistently; of these, 87% (33/38) reported more SAEs in the ClinicalTrials.gov results.
“Reporting discrepancies between the ClinicalTrials.gov results database and matching publications are common. Which source contains the more accurate account of results is unclear, although ClinicalTrials.gov may provide a more comprehensive description of adverse events than the publication.”
Killeen, 2014 [24] 29% (32/108) registered trials had a discrepancy of primary outcomes between registrations and full reports.
92% of the discrepancies in primary outcomes (in 22 out of 24 full reports) favored a statistically significant finding.
“Less than half of all RCTs published in general surgical journals were adequately registered, and approximately 30% had discrepancies in the registered and published primary outcome with 90% of those assessable favoring a statistically positive result.”
Li, 2013 [26] 14% (22/155) RCTs had discrepancies in primary outcomes between registrations and full reports. “Based on the results of the present study, selective outcome reporting of gastroenterology RCTs published in leading medical journals has been much improved over the past years. However, there might be a sampling bias to say that consistency of registered and published POs of gastroenterology RCTs has been better than before.”
Mathieu, 2009 [27] 31% (46/147) full reports had discrepancies in outcomes compared with registrations.
83% of the discrepancies (in 19 out of 23 full reports) favored a statistically significant result.
“Comparison of the primary outcomes of RCTs registered with their subsequent publication indicated that selective outcome reporting is prevalent.”
Milette, 2011 [31] 21% (13/63) full reports were registered; only one of these 13 trials (8%) provided sufficient information to compare outcomes between the full report and the registration, and a discrepancy in outcomes was found in that trial (100%). “Greater attention to outcome reporting and trial registration by researchers, peer reviewers, and journal editors will increase the likelihood that effective behavioral health interventions are readily identified and made available to patients.”
Nankervis, 2012 [32] 17% (18/109) full reports were properly registered.
72% (13/18) full reports had inconsistencies compared with registrations.
“Adequate trial registration for eczema RCTs is poor. Registration of all trials in a publicly accessible database is a critical step toward ensuring the transparent reporting of clinical trial results that affect health care.”
Redmond, 2013 [34] 29% (870/2966) of outcomes were reported inconsistently.
7% (19/274) primary outcomes in protocols not reported in full reports; 10% (30/288) primary outcomes reported in full reports but not found in protocols.
19% (284/1495) secondary outcomes in protocols not reported in full reports; 14% (334/2375) secondary outcomes reported in full reports but not found in protocols.
“Discrepant reporting was associated with statistical significance of results, type of outcome, and specialty area. Trial protocols should be made freely available, and the publications should describe and justify any changes made to protocol-defined outcomes.”
Riehm, 2015 [35] Only 3 out of 40 studies were registered; discrepant outcomes were found in 1 of these 3 studies (33%). “The quality of published outcome declarations and trial registrations remains largely inadequate. Greater attention to trial registration and outcome definition in published reports is needed.”
Rongen, 2016 [38] 25% (90/362) full reports were registered.
54% (14/26) full reports had one or multiple major discrepancies with registrations, 57% (8/14) of which favored statistically significant findings.
“Although trial registration is now the rule, it is currently far from optimal for orthopaedic surgical RCTs and selective outcome reporting is prevalent. Full involvement of authors, editors, and reviewers is necessary to ensure publication of quality, unbiased results.”
Smith, 2013 [42] 100% (69/69) full reports had discrepancies in primary outcome specifications (POS).
30% (21/69) full reports had unambiguous POS discrepancies, with significantly higher percentages of non-industry-sponsored than industry-sponsored full reports having unambiguous POS discrepancies.
“At best, POS discrepancies may be attributable to insufficient registry requirements, carelessness (eg, failing to report PO assessment timing), or difficulty uploading registry information. At worst, discrepancies could indicate investigator impropriety (eg, registering imprecise PO [“pain”], then publishing whichever pain assessment produced statistically significant results). Improvements in PO registration, as well as journal policies requiring consistency between registered and published PO descriptions, are needed.”
Su, 2015 [43] 19% (17/88) full reports were registered.
45% (32/71) full reports had inconsistency of primary outcomes; 71% (15/21) had discrepancies in primary outcomes that favored significant findings.
“We find that prospective registration for randomized clinical trials on acupuncture is insufficient, selective outcome reporting is prevalent, and the change of primary outcomes is intended to favor statistical significance. These discrepancies in outcome reporting may lead to biased and misleading results of randomized clinical trials on acupuncture. To ensure publication of reliable and unbiased results, further promotion and implementation of trial registration are still needed.”
Vedula, 2009 [45] 67% (8/12) full reports reported primary outcomes differently from internal company documents.
Primary outcomes in internal company documents with nonsignificant results were either unreported in full reports, or were reported with a changed outcome measure.
“Selective outcome reporting was identified for trials of off-label use of gabapentin. This practice threatens the validity of evidence for the effectiveness of off-label interventions.”
Vera-Badillo, 2013 [46] 18% (30/164) full reports were registered; of these, 23% (7/30) had a changed primary outcome measure compared with registrations. “Bias in the reporting of efficacy and toxicity remains prevalent. Clinicians, reviewers, journal editors and regulators should apply a critical eye to trial reports and be wary of the possibility of biased reporting. Guidelines are necessary to improve the reporting of both efficacy and toxicity.”
You, 2012 [47] 14% (19/134) full reports had inconsistency in primary end points (PEPs) compared with registrations. “The rates of trial registration and of trials with clearly defined PEPs have improved over time; however, 14% of these trials reported a different PEP in the final publication. Intrapublication inconsistencies in PEP reporting are frequent.”
Subgroup reporting (n = 3)
Boonacker, 2011 [18] 75% (59/79) full reports had differences in subgroup analyses from grant applications.
69% prespecified subgroup analyses (103/149) were not reported in full reports.
76% subgroup analyses (143/189) were based on post hoc results.
“There is a large discrepancy between the grant applications and the final publications regarding subgroup analyses. Both nonreporting prespecified subgroup analyses and reporting post-hoc subgroup analyses are common. More guidance is clearly needed.”
Hernandez, 2005 100% (6/6) full reports had discrepancies in subgroup analyses from protocols. “The reported covariate adjustment and subgroup analyses from TBI trials had several methodological shortcomings. Appropriate performance and reporting of covariate adjustment and subgroup analysis should be considerably improved in future TBI trials because interpretation of treatment benefits may be misleading otherwise.”
Kasenda, 2014 [9] 26% (132/515) trials reported subgroup analyses that were not mentioned in their protocols.
12% (64/515) trials did not report subgroup analyses that were planned in their protocols.
“Large discrepancies exist between the planning and reporting of subgroup analyses in RCTs. Published statements about subgroup prespecification were not supported by study protocols in about a third of cases. Our results highlight the importance of enhancing the completeness and accuracy of protocols of RCTs and their accessibility to journal editors, reviewers, and readers.”
Statistical analysis reporting (n = 4)
Dekkers, 2015 [11]a Noninferiority margin was inconsistently reported (9%, 5/54 trials), or not reported in the full reports (9%, 5/54), or not defined in the protocol (2%, 1/54).
Reporting of both noninferiority margin and confidence interval (or p-value) was incomplete or inconsistent (28%, 15/54).
54% (29/54) trials were registered, but only one registry record (3%, 1/29) provided information on noninferiority margin.
“The reporting of noninferiority margins was incomplete and inconsistent with study protocols in a substantial proportion of published trials, and margins were rarely reported in trial registries.”
Melander, 2003 [29] 98% (41/42) documents submitted to regulatory authority provided two or more analyses (intention-to-treat, and per-protocol analysis).
7% (2/28) full reports based on a single trial (stand alone publications) provided an intention-to-treat as well as per-protocol analysis; the remaining stand alone publications (93%, 26/28) only provided one analysis that tended to be per-protocol analysis.
20 full reports (15 stand alone publications, and 5 pooled publications that were based on two or more trials) showed difference in participant response rates compared with documents submitted to regulatory authority.
“The degree of multiple publication, selective publication, and selective reporting differed between products. Thus, any attempt to recommend a specific selective serotonin reuptake inhibitor from the publicly available data only is likely to be based on biased evidence.”
Saquib, 2013 [41] 6% (9/162) trials had statistical analyses such as model adjustments described in registrations, 78% (21/27) in design papers, and 74% (40/54) in protocols obtained from authors.
47% (28/60) full reports had discrepancies in analysis plans compared with registrations, protocols or design papers.
“There is large diversity on whether and how analyses of primary outcomes are adjusted in randomized controlled trials and these choices can sometimes change the nominal significance of the results. Registered protocols should explicitly specify adjustments plans for main outcomes and analysis should follow these plans.”
Vedula, 2013 [10] Intention-to-treat analyses were defined differently between internal company documents and full reports, resulting in different number of participants in analyses and different results. “Descriptions of analyses conducted did not agree between internal company documents and what was publicly reported. Internal company documents provide extensive documentation of methods planned and used, and trial findings, and should be publicly accessible. Reporting standards for RCTs should recommend transparent descriptions and definitions of analyses performed and which study participants are excluded.”
Multiple measure comparison b (n = 13)
Al-Marzouki, 2008 [17] 30% (11/37) trials had major discrepancies between protocols and full reports for primary outcomes: 5 had an unreported primary outcome; 8 introduced a new primary outcome; 2 changed a primary outcome to secondary.
49% (18/37) trials mentioned subgroup analyses in the protocols; but 76% (28/37) reported subgroup analyses.
Only one protocol (3%) provided reasons for the subgroup choice.
“Although the solution to the problem of selective reporting requires further discussion, the current system is clearly inadequate.”
Chan, 2008 [8] Unacknowledged differences between protocols and full reports were observed in sample size calculation (53%, 18/34 trials), methods of handling protocol deviation (44%, 19/43), addressing missing data (80%, 39/49), primary outcome analyses (60%, 25/42), subgroup analyses (100%, 25/25), adjusted models (82%, 23/28), and interim analyses (62%, 8/13). “When reported in publications, sample size calculations and statistical methods were often explicitly discrepant with the protocol or not pre-specified. Such amendments were rarely acknowledged in the trial publication. The reliability of trial reports cannot be assessed without having access to the full protocols.”
Hahn, 2002 [20] 60% (9/15) trials did not state primary outcomes.
47% (7/15) did not mention analysis plans. In the 8 trials mentioning analysis plans, 88% (7/8) did not follow the prespecified plans.
“This pilot study has shown that within-study selective reporting may be examined qualitatively by comparing the study report with the study protocol. The results suggest that it might well be substantial; however, the bias can only be broadly identified as protocols are not sufficiently precise.”
Korevaar, 2014 [25] 32% (49/153) full reports had discrepancies compared with registrations: 12% (19/153) had discrepancies in inclusion criteria, 6% (9/153) in result presentations, and 21% (32/153) in outcomes. “Failure to publish and selective reporting are prevalent in test accuracy studies. Their registration should be further promoted among researchers and journal editors.”
Maund, 2014 [28] Minor inconsistencies in population in the primary efficacy analysis found in one trial (out of 7) between protocol and full report and within the full report.
Incomplete reporting of adverse events found in full reports.
“Clinical study reports contained extensive data on major harms that were not available in journal articles and in trial registry reports. There were minor inconsistencies in primary efficacy analysis population between protocols and clinical study reports and within clinical study reports. There were also inconsistencies between different summaries and tabulations of harms data within clinical study reports. Clinical study reports should be used as the data source for systematic reviews of drugs, but they should first be checked against protocols and within themselves for accuracy and consistency.”
Mhaskar, 2012 [30] Overall methodological quality reporting in full reports was poor and did not reflect actual high quality in protocols. “The largest study to date shows that poor quality of reporting does not reflect the actual high methodological quality. Assessment of the impact of quality on the effect size based on reported quality can produce misleading results.”
Norris, 2014 [33] 90% (45/50) full reports had selective outcome reporting (SOR) or selective analysis reporting (SAR) compared with their registrations. “The SOR and SAR were frequent in this pilot study, and the most common type of SOR was the publication of outcomes that were not pre-specified. Trial registries were of little use in identifying SOR and of no use in identifying SAR.”
Rising, 2008 [36] 41 primary outcomes from FDA reviews of applications were omitted from full reports; 15 outcomes were added in full reports that favored the drug tested.
43 outcomes in FDA reviews did not favor the drug tested; of these, 20 (47%) were omitted from full reports; 5 of the remaining 23 outcomes changed in full reports, with 4 (80%, out of 5) changing to favor the drug tested.
99 conclusions were provided in both FDA reviews and full reports; of these, 9% (9/99) changed from FDA reviews to full reports so that they favored the drug tested.
“Discrepancies between the trial information reviewed by the FDA and information found in published trials tended to lead to more favorable presentations of the NDA drugs in the publications. Thus, the information that is readily available in the scientific literature to health care professionals is incomplete and potentially biased.”
Riveros, 2013 [37] More complete reporting was found in the registry than in full reports for selection flow of participants (64% vs 48%), efficacy findings (79% vs 69%), adverse events (73% vs 45%), and serious adverse events (99% vs 63%). “Our results highlight the need to search ClinicalTrials.gov for both unpublished and published trials. Trial results, especially serious adverse events, are more completely reported at ClinicalTrials.gov than in the published article.”
Rosati, 2016 [39] 95% (19/20) full reports had medium or high combined discrepancy scores compared with registrations.
100% (20/20) full reports selectively reported or did not report main outcomes; 45% (9/20) had discrepancies in disclosing funding, 40% (8/20) in sample size, 45% (9/20) in inclusion or exclusion criteria, 55% (11/20) changed primary outcome to secondary (or vice versa), and 65% (13/20) discontinued early with no justifications in full reports.
“Major discrepancies between what clinical trial registrations record and paediatric RCTs publish raise concern about what clinical trials conclude. Our findings should make clinicians, who rely on RCT results for medical decision-making, aware of dissemination or reporting bias. Trialists need to bring CTR data and reported protocols into line with published data.”
Rosenthal, 2013 [40] 22% (11/51) full reports downgraded primary outcomes (defined by registrations) as secondary; 8% (4/51) completely omitted primary outcomes; 8% (4/51) introduced a new primary outcome, and 10% (5/51) defined primary outcome differently.
Few discrepancies in randomization, blinding, intervention and ethical committee approval, and some in sample size and inclusion or exclusion criteria.
45% (23/51) full reports had funding information that was not in registrations.
“When interpreting the results of surgical RCTs, the possibility of selective reporting, and thus outcome reporting bias, has to be kept in mind. For future trials, prospective registration should be strictly respected with the ultimate goal to increase transparency and contribute to high-level evidence reports for optimal patient care in surgery.”
Soares, 2004 [48] The methodological quality in 56 full reports was worse than in protocols.
Only 42% reported allocation concealment (while all protocols achieved allocation concealment);
69% reported intention-to-treat analysis (while 83% protocols did such analysis);
16% reported sample size calculation (while 76% protocols did so);
10% reported endpoints and errors (while 76% and 74% protocols defined endpoints and errors respectively).
“The reporting of methodological aspects of RCTs does not necessarily reflect the conduct of the trial. Reviewing research protocols and contacting trialists for more information may improve quality assessment.”
Turner, 2012 [44] 17% of FDA-registered trials were not published (4 trials out of 24 applications).
25% (5/20) full reports did not have positive findings.
Effect size for unpublished trials (0.23) was significantly less than that for published full reports (effect size: 0.47).
“The magnitude of publication bias found for antipsychotics was less than that found previously for antidepressants, possibly because antipsychotics demonstrate superiority to placebo more consistently. Without increased access to regulatory agency data, publication bias will continue to blur distinctions between effective and ineffective drugs.”
a. This study focused on noninferiority margin reporting.
b. Multiple measure comparison is defined as at least two main measures used for comparisons, including comparisons of participant, outcome, subgroup, analysis, result, effect size, inclusion criteria, sample size, control, randomization, blinding, intervention, funding, ethics, and/or conclusion reporting.