Key findings
In 98 (95%) of the 103 test-treatment RCTs, descriptions of interventions did not mention all the components necessary to characterize test-treatment strategies. Only five trials (5%) described the tests and test methods, treatment methods and decision-making across all study groups, and none of these also provided a complete care pathway diagram. We noted that test-treatment interventions for control groups were particularly poorly reported. Descriptions of experimental interventions most often provided details of tests, but fewer than half gave details of diagnostic and management decision-making, and fewer than a quarter mentioned which treatments were subsequently used.
In many of these circumstances, failure to describe the test-treatment interventions will make it impossible for clinicians to assess whether the trial is applicable to their practice, or to implement the test-treatment intervention should they choose to. The potential implication of such poor reporting is that the time and resources invested in these trials may be largely wasted [37].
Interpretation of findings
Our study is the first to analyze the reporting quality of interventions in RCTs of test-treatment strategies. Other studies evaluating the reporting quality of complex interventions have noted inadequate descriptions of interventions in 87% of back-pain RCTs [38], 59% of surgical treatment trials [5] and 61% of non-pharmacological intervention RCTs [4]. Our finding that test-treatment strategies are inadequately described by 95% of trials suggests that there are additional challenges to describing these interventions. We present three explanations for our findings.
First, many trials specified only a new test in the study arm, compared against a control arm of standard care, without providing any further specification of the test-treatment intervention in either arm. Overall, 24% (25/103) of trials reported only the tests used in the different arms, while 8 of the 10 trials comparing new tests against a control arm of standard care failed to specify the tests, or any subsequent details of management and treatment, in the control arm. For example, a trial of routine lumbar X-ray in patients with acute lower back pain (LBP) simply described the interventions as: “In addition to receiving the usual care for patients with LBP, the intervention group patients had lumbar spine radiography at the baseline interview. The control group received the usual care without lumbar spine radiography” [39]. These trials typically neither specified nor recorded how test results were used for decision-making, what pathologies were diagnosed, or what treatments were actually used.
Second, test-treatment pathways may be difficult to describe, or even enumerate, because of the myriad possible downstream actions that can follow a test. We know from trials of complex therapeutic interventions that decision-making processes are very difficult to codify into standardized, rigid protocols, and these problems could account for the poor descriptions we observed of clinical examinations, tele-medical consultations and multiple-test interventions (Table 5). On the other hand, we found complex endoscopic techniques were often standardized and well described, despite frequently being part of multistage diagnostic processes.
Finally, the fact that experimental and control interventions were both well reported in only a minority of studies suggests a lack of awareness amongst trialists and investigators of the level of detail that needs to be included in a trial report. Journal instructions to authors of trial reports have been found lacking, with only 14% (15/106) providing specific directives on the reporting of interventions [40]. The need to describe the several components of a multistage complex intervention is thus likely to be even more poorly recognized.
Requirements for pragmatic RCTs of test-treatment interventions
There is general acceptance that results of pragmatic trials are more applicable to standard practice than those of explanatory trials [41]. We did not formally assess the position of our trials on the pragmatic-explanatory continuum [42]; however, all the studies we examined were undertaken in the ‘real world’ and evaluated the impact of new testing strategies, alongside current practice, on patient health. We would argue that the notion of ‘explanatory’ trials does not apply well to test-treatment RCTs seeking to evaluate downstream health consequences, but rather best describes studies such as those evaluating diagnostic accuracy (‘does the new test correctly discriminate between diseased and non-diseased patients?’). For a test-treatment RCT, taking a pragmatic approach involves recruiting patients as they present in standard care, using tests and treatments as they would be provided in the health service, and allowing flexibility to tailor interventions to individual patients as would occur in practice (including allowance for non-compliance, cross-overs and drop-outs). Nevertheless, guidance for pragmatic trials clearly states that the intended interventions in all arms should be defined precisely [43, 44].
Findings from a new test are most likely to be used effectively if information on its diagnostic value, and guidance on how its results should influence management, are provided. This is particularly important for new test technologies, where clinicians may be unsure about basing management decisions on new diagnostic information and, in the absence of guidance, could ignore results of the new test or respond to them in inconsistent ways; both would bias studies towards finding no difference.
The absence of a complete description of the interventions in the majority of studies might reflect poor reporting, but might also have arisen because the trial protocols never specified how test results were to be interpreted and used to determine treatment. Such an approach could have arisen if trialists wrongly considered it appropriate to a pragmatic trial design. Other possible explanations include challenges in documenting the components of the test-treatment intervention; concern that specifying a particular care pathway might limit recruitment; or, in some circumstances, uncertainty about how the results of a test could best be used to determine treatment. Greater preparatory work to fully develop and specify the test-treatment intervention, and to obtain buy-in from the clinicians involved in the trial, might resolve these issues. An RCT of a test strategy that commences before it has been determined how test results should be used could ultimately reflect variation in clinician behavior more than the potential value of the diagnostic technology.
There are arguments against specifying control test-treatment strategies in trials whose explicit purpose is to compare outcomes of organized diagnostic services with those of unstructured care (we found 14 examples in our cohort). Since this comparison pits a formalized diagnostic strategy against an approach that allows clinicians to operate without guidance, introducing a protocol for decision-making in the control arm would eliminate the very contrast under evaluation.
Good pragmatic trials of test-treatment interventions should also report the diagnoses made and treatments undertaken in each arm of the trial, delineated by test result, so that readers can assess the degree of adherence to the recommended test-treatment protocols. Whilst several studies reported measures of diagnostic and therapeutic impact aggregated within each trial arm, it was very rare for these to be presented according to test result.
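To illustrate what reporting “delineated by test result” rather than “aggregated by arm” could look like, the following minimal sketch (in Python) cross-tabulates treatments received against test results within each arm and flags adherence to a recommended protocol. All record fields, result categories and treatment names here are hypothetical, invented purely for illustration; they are not drawn from any trial in our cohort.

```python
from collections import Counter

# Hypothetical per-patient trial records: arm, test result, treatment received.
# All field names and categories are invented for illustration.
records = [
    {"arm": "experimental", "test_result": "positive", "treatment": "surgery"},
    {"arm": "experimental", "test_result": "positive", "treatment": "conservative"},
    {"arm": "experimental", "test_result": "negative", "treatment": "conservative"},
    {"arm": "control", "test_result": "positive", "treatment": "surgery"},
]

# Hypothetical recommended protocol: the treatment each test result should trigger.
protocol = {"positive": "surgery", "negative": "conservative"}

# Tabulate treatments by (arm, test result), not aggregated by arm alone,
# so adherence can be assessed for each branch of the care pathway.
table = Counter((r["arm"], r["test_result"], r["treatment"]) for r in records)
for (arm, result, treatment), n in sorted(table.items()):
    adherent = treatment == protocol[result]
    print(f"{arm:12} {result:8} {treatment:12} n={n} adherent={adherent}")
```

Presented this way, a reader can see at a glance how often a given test result led to the recommended management, which aggregation by arm alone conceals.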
Recommendations
We make three recommendations. First, reporting can be improved by providing guidance for authors. The TIDieR checklist provides a useful tool to assist in describing interventions, but does not explicitly consider complex “staged” interventions such as test-treatment strategies. These require further development of Item 9 in TIDieR (handling the tailoring of interventions that are not identical for all participants, which requires descriptions of “how, why, what and when interventions are personalized, titrated or adapted”) [11]. The current tool does not highlight that, for test-treatment comparisons, this should include a full delineation of all management pathways according to test results. These pathways might best be summarized graphically in a decision tree that outlines the different sets of test results, the diagnoses which can be made from them, and the possible actions which could occur at each step, as illustrated in Fig. 4. In future, additional detail might be provided in ‘diagnostic intervention manuals’, mirroring complex intervention practice, although such an approach should first be investigated to ensure it provides a useful addition to intervention description.
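One way such a decision tree could also be captured in a structured, machine-readable form is sketched below in Python. The pathway shown is hypothetical (loosely echoing the lumbar X-ray example above); the node names, test results, diagnoses and actions are our own invented placeholders, not a prescription from TIDieR or from any trial.

```python
from dataclasses import dataclass, field

@dataclass
class PathwayNode:
    """One step in a test-treatment care pathway."""
    test: str                                     # test performed at this step
    actions: dict = field(default_factory=dict)   # test result -> next step or final action

# Hypothetical pathway: every test result maps to a diagnosis and a management
# action (or a further test), mirroring the decision-tree format of Fig. 4.
pathway = PathwayNode(
    test="lumbar spine radiography",
    actions={
        "fracture seen": "diagnose fracture; refer to orthopaedics",
        "no abnormality": PathwayNode(
            test="clinical review at 6 weeks",
            actions={
                "symptoms resolved": "discharge",
                "symptoms persist": "diagnose non-specific LBP; physiotherapy",
            },
        ),
    },
)

def enumerate_branches(node, trail=()):
    """List every complete route through the pathway, e.g. for a trial report."""
    for result, nxt in node.actions.items():
        step = trail + (f"{node.test} -> {result}",)
        if isinstance(nxt, PathwayNode):
            yield from enumerate_branches(nxt, step)
        else:
            yield step + (nxt,)

for branch in enumerate_branches(pathway):
    print(" => ".join(branch))
```

A representation of this kind makes the full set of pathways explicitly enumerable, so that no test result is left without a specified diagnosis and management action.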
Second, all trials must aim to prescribe the diagnostic strategies and care pathways that should be followed, and describe them in adequate detail to allow replication. Whilst trials evaluating highly flexible interventions allow deviation from planned care pathways as clinically required, the pathways should be specified in as much detail as possible in the study protocol and report. For highly variable diagnostic strategies that are difficult to translate into a prescriptive format, such as clinical consultations, trialists should, as a minimum, aim to standardize the intended function of the tests [45] by pre-specifying diagnostic goals that can be modified at a local level to suit organizational differences.
Third, reports should describe the care pathways actually followed during the trial, delineating decisions according to test results rather than simply aggregating them by study arm. In pragmatic trials this will be particularly important for describing any deviations from the intended pathway. Embedding process evaluations within a trial to measure how interventions are actually administered [32] may help conceptualize the way in which tests are being used.
We would highlight the need for research on methods of reporting care pathways, on the variability of care pathways across settings, and on barriers to changing pathways. In particular, there needs to be an assessment of the information a care pathway description must contain to enable its replication in a new setting, and of the degree of variability that would be acceptable in pragmatic trials.
Strengths and limitations
Our study is the first to systematically identify an unselected group of test-treatment RCTs and to assess the quality of intervention reporting. Our cohort includes diverse test-treatment interventions, conducted in a wide range of medical settings and specialties. Reporting judgements were made in duplicate using a standardized extraction tool, with disagreements resolved by discussion.
We chose to assess whether any detail was mentioned, not whether the description was adequate to allow the intervention to be replicated (as recommended by the TIDieR checklist), or whether specific features (such as care-provider skill and experience [46]) were detailed. This decision was made to keep our assessments objective, and because of the challenge of identifying experts able to make judgements across the wide range of settings and specialties in the cohort. Consequently, even fewer studies than we report are likely to have described their interventions in sufficient detail to allow replication, including enough information to establish the appropriateness of the care-provider skill involved.
The scant detail we found certainly reflects inadequate reporting of test-treatment interventions; however, it is important that future research investigates the extent to which it may also be caused by inadequate trial conduct. This could be achieved by contacting trial authors.
Our trials are from a cohort we have previously reported [23, 29] and were published from 2004 to 2007, after publication of the CONSORT 2001 statement but before the CONSORT 2010 and SPIRIT guidelines. It is possible that the reporting of some aspects of trial methodology has improved in recent years, particularly given the publication of guidelines for reporting and conducting diagnostic accuracy studies; however, any such improvement in describing test-treatment interventions is unlikely to be dramatic, since neither CONSORT 2010, STARD 2015, nor any other published standard has addressed test-treatment strategies. Since our sample of trials was ascertained by searching CENTRAL, which indexes relevant studies found in MEDLINE, EMBASE and other specialist registers, it is possible that we have missed eligible test-treatment trials not indexed in these resources. It is unlikely, however, that such trials would have changed our findings considerably, since in order to have escaped detection by our search of the major databases they are likely to have been even more poorly reported [29].