A systematic review of comparisons between protocols or registrations and full reports in primary biomedical research

Background Prospective study protocols and registrations can play a significant role in reducing incomplete or selective reporting of primary biomedical research, because they are pre-specified blueprints which are available for the evaluation of, and comparison with, full reports. However, inconsistencies between protocols or registrations and full reports have been frequently documented. In this systematic review, which forms part of our series on the state of reporting of primary biomedical research, we aimed to survey the existing evidence of inconsistencies between protocols or registrations (i.e., what was planned to be done and/or what was actually done) and full reports (i.e., what was reported in the literature); this was based on findings from systematic reviews and surveys in the literature. Methods Electronic databases, including CINAHL, MEDLINE, Web of Science, and EMBASE, were searched to identify eligible surveys and systematic reviews. Our primary outcome was the level of inconsistency (expressed as a percentage, with higher percentages indicating greater inconsistency) between protocols or registrations and full reports. We summarized the findings from the included systematic reviews and surveys qualitatively. Results There were 37 studies (33 surveys and 4 systematic reviews) included in our analyses. Most studies (n = 36) compared protocols or registrations with full reports in clinical trials, while a single survey focused on primary studies of clinical trials and observational research. High inconsistency levels were found in outcome reporting (ranging from 14% to 100%), subgroup reporting (from 12% to 100%), statistical analyses (from 9% to 47%), and other measure comparisons. Some factors, such as outcomes with significant results, sponsorship, type of outcome and disease speciality were reported to be significantly related to inconsistent reporting.
Conclusions We found that inconsistent reporting between protocols or registrations and full reports of primary biomedical research is frequent, prevalent and suboptimal. We also identified methodological issues such as the need for consensus on measuring inconsistency across sources for trial reports, and more studies evaluating transparency and reproducibility in reporting all aspects of study design and analysis. A joint effort involving authors, journals, sponsors, regulators and research ethics committees is required to solve this problem.


Background
Incomplete or selective reporting in publications is a serious threat to the validity of findings from primary biomedical research, because inadequate reporting may be subject to bias, and it subsequently impairs evidence-based decision-making [1,2]. Prospective study protocols and registrations can play a significant role in reducing incomplete or selective reporting, because they are pre-specified blueprints which are available for the evaluation of, and comparison with, full reports [3,4]. For this reason, in 2004 the International Committee of Medical Journal Editors (ICMJE) stated that all trials must be included in a trial registry before participant enrollment as a compulsory condition of publication [5], because registry records may include information on either what was planned or what was done during a study. Moreover, one recent study reported that primary outcomes were more consistently reported when a trial had been prospectively registered [6]. With wide acceptance of trial registration, many journals started establishing editorial policies to publish protocols. However, inconsistency was found to be strikingly frequent after comparing protocols or registrations with full reports regarding outcome reporting, subgroup selection, sample size, and statistical analysis, among others [7][8][9][10][11]. In this systematic review, which forms part of our series on the state of reporting of primary biomedical research, our objectives were to map the existing evidence of inconsistency between protocols or registrations (i.e., what was planned to be done and/or what was actually done) and full reports (i.e., what was reported in the literature), and to provide recommendations to mitigate such inconsistent reporting, based on findings from systematic reviews and surveys in the literature [12].

Methods
We followed the guidance from the Joanna Briggs Institute [13] and/or the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) [14] to conduct and report our review. Details on the methods have been published in our protocol [12].

Eligibility and search strategy
In brief, in this systematic review, we included systematic reviews or surveys that focused on inconsistent reporting when comparing protocols or registration with full reports. A study protocol is defined as the original research plan with comprehensive description of study participants or subjects, outcomes, objective(s), design, methodology, statistical consideration and other related information that cannot be influenced by the subsequent study results. In this review, we defined full reports as the publications that included findings of any of the study key elements, including participants or subjects, interventions or exposures, controls, outcomes, time frames, study designs, analyses, result interpretations and conclusions, and other study-related information, and that had been published after completion of the studies. Therefore full reports may include full-length articles, research letters, or other published reports without peer review. An eligible systematic review was defined as a study that assessed the comparisons between protocols or registrations and full reports, and that had predefined objectives, specified eligibility criteria, at least one database searched, data extraction and analyses, and at least one study included. All the surveys that included primary studies and that compared protocols or registrations and full reports were eligible for inclusion.
Exclusion criteria were: 1) the study was not a systematic review or survey, 2) the study objective did not include comparison between protocols or registration with full reports, 3) the study could not provide data on such comparisons, 4) the study did not focus on primary biomedical studies, or 5) the study was in duplicate. The search process was completed by one reviewer (GL) with the help of an experienced librarian. It was limited to several databases (CINAHL, EMBASE, Web of Science, and MEDLINE) from 1996 to September 30th 2016, restricted to studies in English. Two reviewers (YJ and IN) independently screened the records retrieved from the search. Reference lists from the included studies were also searched by hand in duplicate by the two reviewers (YJ and IN), to avoid the omission of potentially eligible systematic reviews and surveys. The kappa statistic was used to assess the agreement level between the two reviewers [15].
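The agreement statistic mentioned above can be illustrated with a minimal Python sketch of Cohen's kappa for two reviewers' screening decisions. The function name `cohen_kappa` and the include/exclude decisions are hypothetical and for illustration only; they are not the review's data, and the review does not state which software was used for this calculation.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same records."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of records on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical screening decisions for ten records (not the review's data).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "include", "exclude", "include", "exclude"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters in screening because most records are excluded by both reviewers.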

Outcome and data collection
Our primary outcome was the percentage of primary studies in the included systematic reviews and surveys for which an inconsistency was observed between the protocol or registration and the full report, with higher percentages indicating greater inconsistency [12]. Inconsistencies were recorded between protocols or registrations and full reports with respect to study participants or subjects, interventions or exposures, controls, outcomes, time frames, study designs, analyses, result interpretations and conclusions, and other study-related information. A secondary outcome was the factors reported to be significantly associated with the inconsistency between protocols or registration and full reports.
Two independent reviewers (IN and LA) extracted the data from the included studies. Data collected included the general characteristics of the systematic reviews or surveys (author, year of publication, journal, study area, data sources, search frame, numbers and study designs of included primary studies for each systematic review or survey, measure of comparison, country and sample size of primary studies, and funding information), key findings of inconsistent reporting, authors' conclusions, and the factors reported to be significantly related to inconsistent reporting. The terminologies and their frequency used in the included systematic reviews and surveys to describe the reporting problem were also collected.

Quality assessment and data analyses
Study quality was assessed for the included systematic reviews using the AMSTAR (a measurement tool to assess systematic reviews) [16]; no comparable assessment tool was available for surveys. We excluded two items of the AMSTAR (item 9 "Were the methods used to combine the findings of studies appropriate?" and 10 "Was the likelihood of publication bias assessed?") because they were not relevant to the included systematic reviews.
Inconsistency was analysed descriptively using medians and interquartile ranges (IQRs). Frequencies of the terminologies that were used to describe the inconsistent reporting problem and that were extracted from the included systematic reviews and surveys were calculated and shown by using word clouds. The word clouds were generated using the online program Wordle (www.wordle.net) with the input of the terminologies and their frequencies. The relative size of the terms in the word clouds corresponded to the frequency of their use. We summarized findings from the included systematic reviews and surveys qualitatively. No pooled analyses were performed in this review.
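As a rough illustration of the descriptive summary described above, the following Python sketch converts study-level counts into inconsistency percentages and computes a median and IQR. The counts are a small subset taken from the findings quoted in this review, so the output is not expected to reproduce the reported median. The quartile convention (`method="inclusive"`) and the helper names `percent` and `median_iqr` are assumptions, since the review does not specify how quartiles were calculated.

```python
import statistics

def percent(numerator, denominator):
    """Inconsistency level as a percentage of the studies assessed."""
    return 100.0 * numerator / denominator

def median_iqr(values):
    """Return (median, first quartile, third quartile) of a list of values."""
    # Assumed convention: the "inclusive" quartile method of the statistics module.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3

# (inconsistent, assessed) pairs quoted in this review; an illustrative subset only.
counts = [(22, 155), (32, 108), (46, 147), (75, 152), (13, 18), (69, 69)]
levels = [percent(k, n) for k, n in counts]

med, q1, q3 = median_iqr(levels)
print(f"median {med:.0f}% (IQR: {q1:.0f}% - {q3:.0f}%)")
```

Different quartile conventions can shift the IQR bounds noticeably when only a handful of studies contribute, so the convention is worth stating alongside the summary.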

Results
A total of 9123 records were retrieved. After removing duplicates, 8080 records were screened through their titles and abstracts. There were 108 studies accessed for full-text article evaluation (kappa = 0.81, 95% confidence interval: 0.75-0.86). We included 37 studies (33 surveys and 4 systematic reviews) for analysis [7][8][9][10][11]. Fig. 1 shows the study inclusion process. Eight studies collected protocol data from grant or ethics applications, two from FDA (Food and Drug Administration) reviews, two from internal company documents, one from a journal's website, and nine from other sources, respectively. Regarding the data sources for full reports, most studies (n = 35) searched databases and/or journal websites to access full reports, while one study collected full reports by searching databases and contacting investigators [17] and another study by contacting the lead investigators only [20]. Most systematic reviews or surveys (n = 36) compared protocols or registrations with full reports in clinical trials; there was only one included survey that investigated inconsistent reporting in both clinical trials and observational research [18]. Measures of comparison between protocols or registrations and full reports included outcome reporting, subgroup reporting, statistical analyses, sample size, participant inclusion criteria, randomization, and funding, among others (Table 1). Most primary studies were conducted in North America and Europe. Among the included systematic reviews and surveys that reported information on sample sizes for the primary studies, the median sample sizes in the primary studies ranged from 16 to 463. There were 15 studies that had received academic funding for their conduct of a systematic review or survey, and 2 studies that had received litigation-related consultant fees.
We assessed study quality for the four systematic reviews using AMSTAR [23,30,42,47]. None of them had assessed the quality of their included primary studies, thus receiving no points for items 7 ("Was the scientific quality of the included studies assessed and documented?") and 8 ("Was the scientific quality of the included studies used appropriately in formulating conclusions?"). One review scored 5 (out of 9) on AMSTAR, because it did not provide information on duplicate data collection (AMSTAR item 2) or show the list of included and excluded primary studies (item 5) [23]. No indication of a grey literature search (item 4) was found in one systematic review [42], resulting in its score of 6 (out of 9) on AMSTAR.
Among all the 37 included studies, the terminologies most frequently used to describe the reporting problem included selective reporting (n = 35, 95%), discrepancy (n = 31, 84%), inconsistency (n = 27, 73%), biased reporting (n = 15, 41%), and incomplete reporting (n = 13, 35%). Fig. 2 shows the word clouds of all the terminologies used in the included systematic reviews and surveys. Table 2 presents the key findings and authors' conclusions of inconsistency by their main measure of comparison between protocols or registrations and full reports. Table 3 presents the detailed information of what had been reported in the 37 included studies regarding the inconsistency between protocols or registrations and full reports. There were 17 studies with a focus on outcome reporting problems, including changing, omitting (or not reporting), introducing, incompletely reporting, and selectively reporting outcomes. The median inconsistency of outcome reporting was 54% (IQR: 29%-72%), ranging from 14% (22/155) to 100% (1/1 and 69/69). Six studies found that most inconsistencies (median 71%, IQR: 57%-83%) favoured a statistically significant result in full reports [21,24,27,38,43,45]. Regarding subgroup reporting, inconsistency levels between protocols or registrations and full reports varied from 38% (196/515) to 100% (6/6), with post hoc analyses introduced (ranging from 26% (132/515) to 76% (143/189)) and pre-specified analyses omitted in full reports (from 12% (64/515) to 69% (103/149)). Inconsistencies of statistical analyses were observed, including defining non-inferiority margins, analysis principle selection (intention-to-treat, per-protocol, as-treated), and model adjustment, with an inconsistency level varying from 9% (5/54) to 67% (2/3).
The remaining 13 studies reported frequent inconsistencies in multiple measure comparisons, where the multiple measure comparisons were defined as at least two main measures used for comparison between protocols or registrations and full reports (Tables 2 and 3). For instance, inconsistencies were observed in sample sizes (ranging from 27% (14/51) to 60% (34/56)), inclusion or exclusion criteria (from 12% (19/153) to 45% (9/20)), and conclusions (9%, 9/99). As shown in Table 4, significant factors reported to be related to inconsistent reporting included outcomes with statistically significant results, study sponsorship, type of outcome (efficacy, harm outcome) and disease specialty. Two studies reported higher odds of complete reporting for full reports in primary outcomes with significant results (odds ratios (ORs) ranging from 2.5 to 4.7) [7,19], while one study found that outcomes with significant results were associated with inconsistent reporting in full reports (OR = 1.38) [34]. Other factors related to inconsistent reporting included investigator-sponsored trials, efficacy outcomes, and cardiology and infectious diseases (Table 4).
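To make the odds ratios quoted above easier to interpret, the following Python sketch computes a crude (unadjusted) OR from a 2x2 table. The counts are hypothetical and chosen only for illustration. The included studies may have used adjusted models, so this is not a reconstruction of their analyses, and the function name `odds_ratio` is an assumption.

```python
def odds_ratio(a, b, c, d):
    """Crude odds ratio for a 2x2 table.

    a: outcomes with significant results that were fully reported
    b: outcomes with significant results that were not fully reported
    c: outcomes with nonsignificant results that were fully reported
    d: outcomes with nonsignificant results that were not fully reported
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a crude OR")
    return (a * d) / (b * c)

# Hypothetical counts (not taken from any included study).
crude_or = odds_ratio(a=60, b=20, c=40, d=40)
print(f"OR = {crude_or:.1f}")
```

An OR above 1 in this layout means that outcomes with significant results had higher odds of being fully reported than outcomes with nonsignificant results.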

Discussion
We have presented the mapping of evidence of inconsistent reporting between protocols or registrations (i.e., what was planned to be done and/or what was actually done) and full reports (i.e., what was reported in the literature) in primary biomedical research, based on findings from systematic reviews and surveys in the literature. High levels of inconsistency were found across various areas in biomedicine and in different study aspects, including outcome reporting, subgroup reporting, statistical analyses, and others. Some factors such as outcomes with significant results, sponsorship, type of outcome and disease speciality were reported to be significantly related to inconsistent reporting.
The ICMJE statement that requires all trials to be registered prospectively has been implemented since 2004 [5]. Likewise, the SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials) statement aims to assist in transparent reporting and improve the quality of protocols [49]. However, inconsistent reporting between protocols or registrations and full reports remains a severe problem. In this review, all the included studies revealed that the inconsistent reporting between protocols or registrations and full reports was highly prevalent, common and suboptimal. Inconsistent reporting may impair the evidence's reliability and validity in the literature, potentially resulting in evidence-biased syntheses [29,50,51] and inaccurate decision-making, especially given that most inconsistencies were found to favor statistically significant results (Table 2). One study searched full reports in gastroenterology and hepatology journals published from 2009 to 2012, and concluded that the inconsistent reporting problem had improved; however, as its authors acknowledged, sampling bias might have been involved in reaching this conclusion [26]. More evidence to assess the trend of inconsistent reporting, and more efforts to mitigate it, are needed for the primary biomedical community.
We found that the majority of the evidence for inconsistent reporting between protocols or registrations and full reports came from assessments of outcome reporting. It is not uncommon for authors to change, omit, incompletely report, selectively report, or introduce new outcomes in full reports. The main reason was that they attempted to show statistically significant findings using an approach of selective reporting of outcomes to cater to the journal's choices of publications [52][53][54]. Therefore outcomes with significant results were more likely to be fully and completely reported, compared with those outcomes with nonsignificant results, as observed in two included surveys [7,19]. By contrast, another study found that outcomes with significant results were associated with inconsistent reporting [34]. The conflicting findings may be due to their different inclusion criteria (primary outcomes [7,19] vs. all outcomes [34]), and different definitions of inconsistent reporting (defined as primary outcomes changed, introduced, or omitted [7,19] vs. defined as addition, omission, non-specification, or reclassification of primary and secondary outcomes [34]). Thus more research is needed to further explore and clarify the relationship between outcomes with significant results and inconsistent reporting. Similarly, other factors (study sponsorship, type of outcome and disease speciality) should be considered with caution, because their associations with inconsistent reporting were observed in only one survey (Table 4).
We identified several methodological issues in the included studies. Some studies used multiple sources to locate protocol or registration documents and full reports. However, we could not study the heterogeneity in the sources of protocols or registrations and full reports used for comparisons, because the sources used were substantially various and some of them could not be publicly accessible. Furthermore, although registrations are publicly available, they usually contain incomplete study information [55]. Protocols can provide more transparent and comprehensive details, but they are often not publicly accessible. This is a major limitation that may make it harder to reproduce the findings and conclusions of comparing protocols or registrations and full reports. Likewise, the definitions of inconsistencies and measures of the level of inconsistencies were not fully and explicitly described in the included studies, potentially impacting the reproducibility and validation of the evaluations. This challenge is further exacerbated by [...]

Table 2 Key findings and authors' conclusions of inconsistency by main measure of comparison between protocols or registrations and full reports (study; key findings; authors' conclusions)
"Selective reporting of outcomes frequently occurs in publications of high-quality government-funded trials."
(from JAMA) [7] 50% (50/99) and 65% (47/72) trials had at least one incompletely reported efficacy and harm outcome respectively. 62% (51/82) trials had at least one primary outcome changed, omitted, or introduced. "The reporting of trial outcomes in journals is frequently inadequate to provide sufficient data for interpretation and meta-analysis, is biased to favor statistical significance, and is inconsistent with primary outcomes specified in trial protocols. These deficiencies in outcome reporting pose a threat to the reliability of the randomized trial literature."
Hannink, 2013 [21] 49% (75/152) showed some discrepancies in outcomes, most related to introducing or omitting a primary outcome. 28% (21/75) of these discrepancies favored statistically significant results. "Comparison of the primary outcomes of surgical RCTs registered with their subsequent publication indicated that selective outcome reporting is (highly) prevalent and appears to be more common in surgical trials than in general medical trials."
"Reporting discrepancies between the ClinicalTrials.gov results database and matching publications are common. Which source contains the more accurate account of results is unclear, although ClinicalTrials.gov may provide a more comprehensive description of adverse events than the publication."
Killeen, 2014 [24] 29% (32/108) registered trials had a discrepancy of primary outcomes between registrations and full reports. 92% of the discrepancies in primary outcomes (in 22 out of 24 full reports) favored a statistically significant finding. "Less than half of all RCTs published in general surgical journals were adequately registered, and approximately 30% had discrepancies in the registered and published primary outcome with 90% of those assessable favoring a statistically positive result."
Li, 2013 [26] 14% (22/155) RCTs had discrepancies in primary outcomes between registrations and full reports. "Based on the results of the present study, selective outcome reporting of gastroenterology RCTs published in leading medical journals has been much improved over the past years. However, there might be a sampling bias to say that consistency of registered and published POs of gastroenterology RCTs has been better than before."
Mathieu, 2009 [27] 31% (46/147) full reports had discrepancies in outcomes compared with registrations. 83% of the discrepancies (in 19 out of 23 full reports) favored a statistically significant result. "Comparison of the primary outcomes of RCTs registered with their subsequent publication indicated that selective outcome reporting is prevalent."
Milette, 2011 [31] 21% (13/63) full reports were registered; only one trial (8%, out of 13) could provide sufficient information to compare full reports with registration for outcomes, and discrepancies (100%) in outcome was found in the study. "Greater attention to outcome reporting and trial registration by researchers, peer reviewers, and journal editors will increase the likelihood that effective behavioral health interventions are readily identified and made available to patients."
Nankervis, 2012 [32] 17% (18/109) full reports were properly registered. 72% (13/18) full reports had inconsistencies compared with registrations. "Adequate trial registration for eczema RCTs is poor. Registration of all trials in a publicly accessible database is a critical step toward ensuring the transparent reporting of clinical trial results that affect health care."
Redmond, 2013 [34] 29% outcomes (870/2966) reported inconsistently. 7% (19/274) primary outcomes in protocols not reported in full reports; 10% (30/288) primary outcomes reported in full reports but not found in protocols. 19% (284/1495) secondary outcomes in protocols not reported in full reports; 14% (334/2375) secondary outcomes reported in full reports but not found in protocols. "Discrepant reporting was associated with statistical significance of results, type of outcome, and specialty area. Trial protocols should be made freely available, and the publications should describe and justify any changes made to protocol-defined outcomes."
Riehm, 2015 [35] Only 3 out of 40 studies were registered; discrepant outcomes were found in 1 of these 3 studies (33%). "The quality of published outcome declarations and trial registrations remains largely inadequate. Greater attention to trial registration and outcome definition in published reports is needed."
Rongen, 2016 [38] 25% (90/362) full reports were registered. "Although trial registration is now the rule, it is currently far from optimal for orthopaedic surgical RCTs and selective outcome reporting is prevalent. Full involvement of authors, editors, and [...]
100% (69/69) full reports had discrepancies in primary outcome specifications (POS). 30% (21/69) full reports had unambiguous POS discrepancies, with significantly higher percentages of non-industry-sponsored than industry-sponsored full reports having unambiguous POS discrepancies. "At best, POS discrepancies may be attributable to insufficient registry requirements, carelessness (eg, failing to report PO assessment timing), or difficulty uploading registry information. At worst, discrepancies could indicate investigator impropriety (eg, registering imprecise PO ["pain"], then publishing whichever pain assessment produced statistically significant results). Improvements in PO registration, as well as journal policies requiring consistency between registered and published PO descriptions, are needed."
Su, 2015 [43] 19% (17/88) full reports were registered. 45% (32/71) full reports had inconsistency of primary outcomes; 71% (15/21) had discrepancies in primary outcomes that favored significant findings. "We find that prospective registration for randomized clinical trials on acupuncture is insufficient, selective outcome reporting is prevalent, and the change of primary outcomes is intended to favor statistical significance. These discrepancies in outcome reporting may lead to biased and misleading results of randomized clinical trials on acupuncture. To ensure publication of reliable and unbiased results, further promotion and implementation of trial registration are still needed."
Vedula, 2009 [45] 67% (8/12) full reports reported primary outcomes differently from internal company documents. Primary outcomes in internal company documents with nonsignificant results were either unreported in full reports, or were reported with a changed outcome measure. "Selective outcome reporting was identified for trials of off-label use of gabapentin. This practice threatens the validity of evidence for the effectiveness of off-label interventions."
Vera-Badillo, 2013 [46] 18% (30/164) full reports were registered; of these, 23% (7/30) had a changed primary outcome measure compared with registrations. "Bias in the reporting of efficacy and toxicity remains prevalent. Clinicians, reviewers, journal editors and regulators should apply a critical eye to trial reports and be wary of the possibility of biased reporting. Guidelines are necessary to improve the reporting of both efficacy and toxicity."
You, 2012 [47] 14% (19/134) full reports had inconsistency in primary end points (PEPs) compared with registrations. "The rates of trial registration and of trials with clearly defined PEPs have improved over time; however, 14% of these trials reported a different PEP in the final publication. Intrapublication inconsistencies in PEP reporting are frequent."
"There is a large discrepancy between the grant applications and the final publications regarding subgroup analyses. Both nonreporting prespecified subgroup analyses and reporting post-hoc subgroup analyses are common. More guidance is clearly needed."
Hernandez, 2005 100% (6/6) full reports had discrepancies in subgroup analyses from protocols. "The reported covariate adjustment and subgroup analyses from TBI trials had several methodological shortcomings. Appropriate performance and reporting of covariate adjustment and subgroup analysis should be considerably improved in future TBI trials because interpretation of treatment benefits may be misleading otherwise."
Kasenda, 2014 [9] 26% (132/515) trials reported the subgroup analyses that were not mentioned in their protocols. 12% (64/515) trials did not report subgroup analyses that were planned in their protocols. "Large discrepancies exist between the planning and reporting of subgroup analyses in RCTs. Published statements about subgroup prespecification were not supported by study protocols in about a third of cases. Our results highlight the importance of enhancing the completeness and accuracy of protocols of RCTs and their accessibility to journal editors, reviewers, and readers."
"The reporting of noninferiority margins was incomplete and inconsistent with study protocols in a substantial proportion of published trials, and margins were rarely reported in trial registries."
98% (41/42) documents submitted to regulatory authority provided two or more analyses (intention-to-treat, and per-protocol analysis). 7% (2/28) full reports based on a single trial (stand alone publications) provided an intention-to-treat as well as per-protocol analysis; the remaining stand alone publications (93%, 26/28) only provided one analysis that tended to be per-protocol analysis. 20 full reports (15 stand alone publications, and 5 pooled publications that were based on two or more trials) showed difference in participant response rates compared with documents submitted to regulatory authority. "The degree of multiple publication, selective publication, and selective reporting differed between products. Thus, any attempt to recommend a specific selective serotonin reuptake inhibitor from the publicly available data only is likely to be based on biased evidence."
Saquib, 2013 [41] 6% (9/162) trials had statistical analyses such as model adjustments described in registrations, 78% (21/27) in design papers, and 74% (40/54) in protocols obtained from authors. 47% (28/60) full reports had discrepancies in analyses plans compared with registrations, protocols or design papers. "There is large diversity on whether and how analyses of primary outcomes are adjusted in randomized controlled trials and these choices can sometimes change the nominal significance of the results. Registered protocols should explicitly specify adjustments plans for main outcomes and analysis should follow these plans."
Vedula, 2013 [10] Intention-to-treat analyses were defined differently between internal company documents and full reports, resulting in different number of participants in analyses and different results. "Descriptions of analyses conducted did not agree between internal company documents and what was publicly reported. Internal company documents provide extensive documentation of methods planned and used, and trial findings, and should be publicly accessible. Reporting standards for RCTs should recommend transparent descriptions and definitions of analyses performed and which study participants are excluded."
"When reported in publications, sample size calculations and statistical methods were often explicitly discrepant with the protocol or not pre-specified. Such amendments were rarely acknowledged in the trial publication. The reliability of trial reports cannot be assessed without having access to the full protocols."
Hahn, 2002 [20] 60% (9/15) trials did not state primary outcomes. 47% (7/15) did not mention analysis plans. In the 8 trials mentioning analysis plans, 88% (7/8) did not follow the prespecified plans. "This pilot study has shown that within-study selective reporting may be examined qualitatively by comparing the study report with the study protocol. The results suggest that it might well be substantial; however, the bias can only be broadly identified as protocols are not sufficiently precise."
Korevaar, 2014 [25] 32% (49/153) full reports had discrepancies compared with registrations: 12% (19/153) had discrepancies in inclusion criteria; 6% (9/153) in result presentations, and 21% (32/153) in outcomes. "Failure to publish and selective reporting are prevalent in test accuracy studies. Their registration should be further promoted among researchers and journal editors."
Maund, 2014 [28] Minor inconsistencies in population in the primary efficacy analysis found in one trial (out of 7) between protocol and full report and within the full report. Incomplete reporting of adverse events found in full reports. "Clinical study reports contained extensive data on major harms that were not available in journal articles and in trial registry reports. There were minor inconsistencies in primary efficacy analysis population between protocols and clinical study reports and within clinical study reports. There were also inconsistencies between different summaries and tabulations of harms data within clinical study reports. Clinical study reports should be used as the data source for systematic reviews of drugs, but they should first be checked against protocols and within themselves for accuracy and consistency."
Mhaskar, 2012 [30] Overall methodological quality reporting in full reports was poor and did not reflect actual high quality in protocols. "The largest study to date shows that poor quality of reporting does not reflect the actual high methodological quality. Assessment of the impact of quality on the effect size based on reported quality can produce misleading results."
Norris, 2014 [33] "The SOR and SAR were frequent in this pilot study, and the most common type of SOR was the publication of outcomes that were not pre-specified. Trial registries were of little use in identifying SOR and of no use in identifying SAR."
Rising, 2008 [36] 41 primary outcomes from FDA reviews of applications were omitted from full reports; 15 outcomes were added in full reports that favored the drug tested. 43 outcomes in FDA reviews that did not favor the drug tested; of these, 20 (47%) were omitted from full reports; 5 of the remaining 23 outcomes changed in full reports, with 4 (80%, out of 5) changing to favor the drug tested in full reports. 99 conclusions provided in both FDA reviews and full reports; of these, 9% conclusions (9/99) changed from FDA reviews to full reports so that they favored the drug tested in full reports. "Discrepancies between the trial information reviewed by the FDA and information found in published trials tended to lead to more favorable presentations of the NDA drugs in the publications. Thus, the information that is readily available in the scientific literature to health care professionals is incomplete and potentially biased."
Riveros, 2013 [37] More complete reporting was found in registry than in full reports for selection flow of participants (64% vs 48%), efficacy findings (79% vs 69%), adverse events (73% vs 45%), and serious adverse events (99% vs 63%). 22% (11/51) full reports downgraded primary outcomes (defined by registrations) as secondary; 8% (4/51) completely omitted primary outcomes; 8% (4/51) introduced a new primary outcome, and 10% (5/51) defined primary outcome differently. Few discrepancies in randomization, blinding, intervention and ethical committee approval, and some in sample size and inclusion or exclusion criteria. 45% (23/51) full reports had funding information that was not in registrations. "When interpreting the results of surgical RCTs, the possibility of selective reporting, and thus outcome reporting bias, has to be kept in mind. For future trials, prospective registration should be strictly respected with the ultimate goal to increase transparency and contribute to high-level evidence reports for optimal patient care in surgery."
Soares, 2004 [48] The methodological quality in 56 full reports was worse than in protocols. Only 42% reported allocation concealment (while all protocols achieved allocation concealment); 69% reported intention-to-treat analysis (while 83% protocols did such analysis); 16% reported sample size calculation (while 76% protocols did so); 10% reported endpoints and errors (while 76% and 74% protocols defined endpoints and errors respectively).
"The reporting of methodological aspects of RCTs does not necessarily reflect the conduct of the trial. Reviewing research protocols and contacting trialists for more information may improve quality assessment." Turner, 2012 [44] 17% FDA-registered trials not published (4 trials out of 24 applications). 25% (5/20) full reports did not have positive findings Effect size for unpublished trials (0.23) was significantly less than that for published full reports (effect size: 0.47).
"The magnitude of publication bias found for antipsychotics was less than that found previously for antidepressants, possibly because antipsychotics demonstrate superiority to placebo more consistently. Without increased access to regulatory agency data, publication bias will continue to blur distinctions between effective and ineffective drugs." a This study focused on noninferiority margin reporting b Multiple measure comparison defined as at least two main measures used for comparisons, including comparisons of participant, outcome, subgroup, analysis, result, effect size, inclusion criteria, sample size, control, randomization, blinding, intervention, funding, ethics, and/or conclusion reporting    Other inconsistency measures including comparisons of effect size, sample size, control, ethics, key finding reporting, and/or conclusion reporting lack of detailed and transparent reporting of the data collection methods in some studies. For example, some surveys only contacted authors for access to full reports, rather than systematically searching the database(s). Such heterogeneity and disagreements across data sources would potentially affect the statistical significances, effect sizes, interpretations, and conclusions of trial results and their subsequent meta-analyses [56]. Also, there were no explanations provided regarding the inconsistencies found between documents. One study conducted a telephone interview with trialists who were identified to experience inconsistent reporting [57]. It was found that most trialists were not aware of the implications for the evidence base of inconsistent reporting in full trial reports. 
Thus, providing researchers with support to help them recognize the importance of consistent reporting would be a worthwhile endeavour; examples include requiring a list of trial modifications at journal submission and offering training sessions that use different inconsistent-reporting scenarios to show how they could drive different conclusions. Taken together, these issues raise the importance of establishing appropriate standards for, and consensus on, conducting studies that compare the reporting of key trial aspects across documents, so as to enhance the reproducibility of such comparison studies.

Several systematic reviews have assessed cohort studies that compared protocols or registrations with full reports [52,53,58,59]. However, three of them focused either on outcome reporting [52,53] or on statistical analysis reporting [58], and therefore did not summarize all the inconsistencies between protocols or registrations and full reports in the primary research literature. One Cochrane review published in 2011 included 16 studies and assessed all aspects of inconsistencies throughout the full reports [59]. Our current review included more up-to-date studies and thus provided more information for the biomedical community. Moreover, whereas all of these reviews restricted their inclusion to clinical trials [52,53,58,59], our review aimed to include all biomedical areas and to map the existing evidence across primary biomedical research. Furthermore, our study identified several methodological issues in the included systematic reviews and surveys regarding design, conduct and reproducibility, which could help make future comparison studies on this topic more transparent and standardized.
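The call for an explicit measure of the level of inconsistency can be illustrated with a minimal Python sketch. It assumes the simplest possible definition, the percentage of full reports showing a given discrepancy, and reuses the counts reported for Korevaar, 2014 [25]. The function and variable names are illustrative assumptions and do not come from any included study.

```python
def inconsistency_level(n_inconsistent: int, n_total: int) -> float:
    """Return the percentage of full reports showing a given discrepancy."""
    if n_total <= 0:
        raise ValueError("n_total must be a positive count")
    if not 0 <= n_inconsistent <= n_total:
        raise ValueError("n_inconsistent must lie between 0 and n_total")
    return 100.0 * n_inconsistent / n_total


# Counts as reported for Korevaar, 2014 [25]: 153 full reports were
# compared with their registrations.
N_REPORTS = 153
discrepancy_counts = {
    "any discrepancy": 49,
    "inclusion criteria": 19,
    "result presentation": 9,
    "outcomes": 32,
}

# Percentage per aspect, rounded to whole numbers as in the source.
levels = {
    aspect: round(inconsistency_level(count, N_REPORTS))
    for aspect, count in discrepancy_counts.items()
}

# Aspect-level counts can overlap: one report may differ from its
# registration in several aspects, so they need not sum to the overall count.
aspect_total = sum(
    count
    for aspect, count in discrepancy_counts.items()
    if aspect != "any discrepancy"
)
reports_may_overlap = aspect_total > discrepancy_counts["any discrepancy"]

if __name__ == "__main__":
    for aspect, pct in levels.items():
        print(f"{aspect}: {pct}%")
```

Stating the numerator and denominator this explicitly is one way a comparison study could make its inconsistency measure reproducible; other definitions (for example, per outcome rather than per report) are equally possible and would give different percentages.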
With a high prevalence of inconsistent reporting highlighted in this review, efforts are needed from authors, journals, sponsors, regulators and research ethics committees to reverse this situation. For instance, authors are expected to fully explain any necessary modifications made to protocols or registrations, while journal staff and reviewers should check protocols or registrations as part of rigorous peer review. Moreover, investigators who publicly share their protocols, full reports, and data should be rewarded, because this practice can mitigate the inconsistent reporting problem and increase the scientific value of research [54]. For instance, institutions and funders might consider using performance metrics to give credit or promotion to investigators who are willing to share and disseminate their research publicly [54]. The impact of the ICMJE and SPIRIT statements on inconsistent reporting remains largely unexplored because the available evidence is sparse. However, such standards for protocol or registration reporting should be strictly adopted in all biomedical areas, because they provide a platform for easy evaluation of, and comparison with, full reports. For example, some studies found that prospective registrations for trials were inadequate and incomplete [24,25,32,35,43], which left the comparison between registrations and full reports questionable and made inconsistencies hard to identify. Therefore, the possibility of inconsistent reporting remained largely unknown for trials with inadequate and incomplete registrations, which would exert an unclear impact on the findings of this review. On the other hand, two studies demonstrated that the methodological quality reported in full reports was poor and did not reflect the actual high quality in protocols [30,48].
Therefore, to improve the quality of reporting and reduce inconsistent reporting, full reports should rigorously follow reporting guidelines, including CONSORT (Consolidated Standards of Reporting Trials) for clinical trials, ARRIVE (Animal Research: Reporting In Vivo Experiments) for animal studies, and STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) for observational studies, among others. For instance, one study comparing the quality of trial reporting in 2006 between CONSORT-endorsing and non-endorsing journals found significantly better reporting quality for trials published in CONSORT-endorsing journals, especially for trial registration (risk ratio = 5.33; 95% confidence interval: 2.82 to 10.08) [60]. In addition, guidance and/or checklists are needed to help authors, editorial staff, reviewers, sponsors, regulators and research ethics committees assess inconsistency between protocols or registrations and full reports easily and promptly.

Some limitations exist in this review. We limited our search to English-language publications, which may restrict the generalizability of our findings to studies published in other languages. We did not search the grey literature for unpublished systematic reviews or surveys, so we may have missed data from studies that were in progress or yet to be published. We included only one study exploring non-trial research (Table 1); therefore, inconsistent reporting in non-trial areas remains largely unknown. A possible explanation is that, compared with trials, non-trial or observational studies continue to receive less scrutiny: there is no requirement for their registration, and there is less emphasis on publication of their protocols.
We could not evaluate the quality of the surveys because no quality assessment guidance for surveys is available; this weakens the strength of the evidence presented in our review, because most included studies were surveys.

Conclusion
In this systematic review comparing protocols or registrations with full reports, we highlight that reporting in primary biomedical research is suboptimal and that inconsistencies in different study aspects are frequent, based on findings from systematic reviews and surveys in the literature. We also identify methodological issues, such as the need for consensus on how to measure inconsistency across sources for trial reports and the need for more studies evaluating transparency and reproducibility in reporting all aspects of study design and analysis. Efforts from authors, journals, sponsors, regulators and research ethics committees are urgently required to reverse the inconsistent reporting problem.