Systematic Reviews (SRs) of experimental animal studies are not yet common practice, but awareness of the merits of conducting such SRs is steadily increasing. As animal intervention studies differ from randomized clinical trials (RCTs) in many aspects, the methodology for SRs of clinical trials needs to be adapted and optimized for animal intervention studies. The Cochrane Collaboration developed a Risk of Bias (RoB) tool to establish consistency and avoid discrepancies in assessing the methodological quality of RCTs. A similar initiative is warranted in the field of animal experimentation.
We provide an RoB tool for animal intervention studies (SYRCLE’s RoB tool). This tool is based on the Cochrane RoB tool and has been adjusted for aspects of bias that play a specific role in animal intervention studies. To enhance transparency and applicability, we formulated signaling questions to facilitate judgment.
The resulting RoB tool for animal studies contains 10 entries. These entries are related to selection bias, performance bias, detection bias, attrition bias, reporting bias and other biases. Half of these items are in agreement with the items in the Cochrane RoB tool. Most of the variations between the two tools are due to differences in design between RCTs and animal studies. Shortcomings in, or unfamiliarity with, specific aspects of experimental design of animal studies compared to clinical studies also play a role.
SYRCLE’s RoB tool is an adapted version of the Cochrane RoB tool. Widespread adoption and implementation of this tool will facilitate and improve critical appraisal of evidence from animal studies. This may subsequently enhance the efficiency of translating animal research into clinical practice and increase awareness of the necessity of improving the methodological quality of animal studies.
The use of systematic reviews (SRs) for making evidence-based decisions on healthcare is common practice in the clinical setting. Although most experimental animal studies aim to test the safety and/or efficacy of treatments to be used for human healthcare, summarizing the available evidence in an SR is far less common in the field of laboratory animal experiments. Fortunately, since an influential commentary was published in the Lancet (2002), first setting out the scientific rationale for SRs of animal studies, awareness of the merits of SRs of experimental animal studies has been steadily increasing. The methodology for conducting SRs of animal intervention studies is currently evolving but not yet as advanced as for clinical studies. In the clinical field, the randomized controlled trial (RCT) is considered the paradigm for evaluating the effectiveness of interventions. Animal intervention studies, like RCTs, are experimental studies, but they differ from RCTs in many respects (Table 1, supporting information in Additional file 1). This means that some aspects of the systematic review process need to be adapted to the characteristics of animal intervention studies. In this paper, we focus on the methodology for assessing the risk of bias in animal intervention studies.
The extent to which an SR can draw reliable conclusions depends on the validity of the data and the results of the included studies [4–8]. Assessing the risk of bias of the individual studies, therefore, is a key feature of an SR. To assess the risk of bias of RCTs, the Cochrane Collaboration developed the Cochrane RoB Tool. Such a general tool is not yet available for animal intervention studies. The checklists and scales currently used for assessing study validity of animal studies [10–14] vary greatly, are sometimes designed for a specific field (e.g., toxicology) and often assess reporting quality and internal and external validity simultaneously. We believe that, although it is important to assess all aspects of study quality in an SR, the assessment and interpretation of these aspects should be conducted separately. After all, the consequences of poor reporting, poor methodological quality and limited generalizability of the results are very different. Here, the SYstematic Review Centre for Laboratory animal Experimentation (SYRCLE) presents an RoB tool for animal intervention studies: SYRCLE’s RoB tool. This tool, based on the Cochrane Collaboration RoB Tool, aims to assess methodological quality and has been adapted to aspects of bias that play a role in animal experiments.
Development of SYRCLE’s RoB tool
The Cochrane RoB Tool was the starting point for developing an RoB tool for experimental animal studies. The Cochrane RoB Tool assesses the risk of bias of RCTs and addresses the following types of biases: selection bias, performance bias, attrition bias, detection bias and reporting bias. The items in the Cochrane RoB Tool that were directly applicable to animal experiments were adopted (Table 2: items 1, 3, 8, 9 and 10).
To investigate which items in the tool might require adaptation, the differences between randomized clinical trials and animal intervention studies were set out (Table 1). Then we checked whether aspects of animal studies that differed from RCTs could cause bias in ways that had not yet been taken into account in the Cochrane RoB tool. Finally, the quality assessments of recent systematic reviews of experimental animal studies were examined to confirm that all aspects of internal validity had been taken into consideration in SYRCLE’s RoB tool.
To enhance transparency and applicability, we formulated signaling questions (as used in the QUADAS tool, a tool to assess the quality of diagnostic accuracy studies [15, 16]) to facilitate judgment. In order to obtain a preliminary idea of inter-observer agreement for each item in the RoB tool, Kappa statistics were determined on the basis of 1 systematic review including 32 papers.
SYRCLE’s RoB tool
The resulting RoB tool for animal studies contains 10 entries (Table 2). These entries are related to 6 types of bias: selection bias, performance bias, detection bias, attrition bias, reporting bias and other biases. Items 1, 3, 8, 9 and 10 are in agreement with the items in the Cochrane RoB tool. The other items have either been revised or are completely new and will be discussed in greater detail below. Most of the variations between the two tools are a consequence of the differences in design between RCTs and animal studies (see also Table 1). Shortcomings in, or unfamiliarity with, specific aspects of the experimental design of animal studies compared to clinical studies also play a role.
Bias due to inadequate randomization and lack of blinding
Firstly, random allocation of animals to the experimental and control groups is not yet standard practice in animal experiments. Furthermore, as the sample size of most animal experiments is relatively small, important baseline differences may be present. Therefore, we propose to include the assessment of similarity in baseline characteristics between the experimental and control groups as a standard item. The number and type of baseline characteristics depend on the review question. Before launching a risk of bias assessment, therefore, reviewers need to discuss which baseline characteristics need to be comparable between the groups.
Secondly, we slightly adjusted the sequence allocation item, specifying that the allocation sequence should not only be adequately generated but also be adequately applied. We decided to do so because, in animal studies, diseases are often induced rather than naturally present. The timing of randomization, therefore, is more important than in a patient setting: it needs to be assessed whether the disease was induced before actual randomization and whether the order of induction was randomly allocated. The signaling questions for judging this entry are presented in Table 3.
Thirdly, a new item pertains to randomizing the housing conditions of animals during the experiment. In animal studies, the investigators are responsible for the way the animals are housed. They determine, for example, the location of the cage in the room. As housing conditions (such as lighting, humidity and temperature) are known to influence study outcomes (such as certain biochemical parameters and behavior), it is important that the housing of these animals is randomized or, in other words, comparable between the experimental groups in order to reduce bias. Animals from different treatment groups, for example, should not be housed per group on different shelves or in different rooms, as the animals on the top shelf experience a higher room temperature than animals on the lowest shelf, and the temperature of the room may influence the toxicity of pharmacological agents (Table 4). Moreover, when cages are not placed randomly (e.g., when animals are housed per group on different shelves), it is possible for the investigator to foresee or predict the allocation of the animals to the various groups, which might result in performance bias. Randomizing the housing conditions is therefore also a requisite for adequately blinding the animal caregivers and investigators, and it has been included as a signaling question in Table 3.
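To illustrate what random cage placement could look like in practice, the following minimal Python sketch assigns cages to rack positions at random. It is not part of SYRCLE’s RoB tool: the function name `randomize_cage_positions`, the cage labels and the shelf positions are hypothetical, and the fixed seed is an assumption made only so that the placement can be reproduced and audited.

```python
import random
from typing import Dict, Optional, Sequence


def randomize_cage_positions(
    cage_ids: Sequence[str],
    positions: Sequence[str],
    seed: Optional[int] = None,
) -> Dict[str, str]:
    """Assign each cage to a distinct, randomly chosen rack position."""
    if len(positions) < len(cage_ids):
        raise ValueError("need at least one position per cage")
    # A local generator keeps the draw independent of global random state;
    # passing a seed makes the placement reproducible.
    rng = random.Random(seed)
    # Sampling without replacement guarantees that no position is used twice.
    chosen = rng.sample(list(positions), k=len(cage_ids))
    return dict(zip(cage_ids, chosen))


# Hypothetical example: two cages per group, four positions on two shelves.
cages = ["treatment-1", "treatment-2", "control-1", "control-2"]
slots = ["top-left", "top-right", "bottom-left", "bottom-right"]
placement = randomize_cage_positions(cages, slots, seed=2014)
for cage, slot in placement.items():
    print(f"{cage} -> {slot}")
```

Because positions are drawn without regard to group membership, neither group is systematically placed on a particular shelf. With very few cages a simple random draw can still leave the groups unevenly spread across shelves, so a blocked scheme (for example, randomizing within each shelf) may be preferable.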
Fourthly, in a recent update of the Cochrane RoB tool (http://www.cochrane.org/sites/default/files/uploads/handbook/Whats%20new%20in%20Handbook%205_1_0.pdf), bias related to blinding of participants and personnel (performance bias) is assessed separately from bias related to blinding of outcome assessment (detection bias). In our tool, we followed this approach, although animals do not need to be blinded to the intervention as they do not have any expectations about it. In addition, it is important to emphasize that personnel involved in experimental animal studies should be taken to include animal caregivers. In animal studies, this group is often not taken into account when blinding the allocation of animals to various groups. If animal caregivers know that a drug might cause epileptic seizures or increase urine production, for example, they might handle the animals or clean the cages in the group receiving this drug more often, which could cause behavioral changes influencing the study results.
With regard to adequately blinding outcome assessment (entry 7), possible differences between the experimental and control groups in methods used for outcome assessment should be described and judged. It should also be determined whether or not animals were selected at random for outcome assessment, regardless of the allocation to the experimental or control group. For instance, when animals are sacrificed per group at various time points during the day, the scientist concerned might interpret the results of the groups differently because she or he can foresee or predict the allocation.
Another reason to select animals at random for outcome assessment is the presence of circadian rhythms in many biological processes (Table 4). Not selecting the animals for outcome assessment at random might influence the direction and magnitude of the effect. For example, the results of a variety of blood tests depend on their timing during the day: cholesterol levels in mice may be much higher in the morning after a meal than in the afternoon. Because of these effects, assessing whether or not animals were selected at random for outcome assessment has also been presented as a separate entry.
As mentioned before, assessing reporting bias is in agreement with the Cochrane RoB tool. It is important to mention, however, that this item is quite difficult to assess in animal intervention studies at present because protocols for animal studies are not yet registered in a central, publicly accessible database. Nevertheless, many have called for registration of all animal experiments at inception [19, 20], so we expect that registration of animal studies will be more common within a few years. For this reason, we already decided to include it in SYRCLE’s RoB tool. Furthermore, protocols of animal studies, like those of clinical studies, can already be published in various (open access) journals, which will also help to improve the standard of research in animal sciences.
Beyond the above-mentioned types of bias, there might be further issues that may raise concerns about the possibility of bias. These issues have been summarized in the other bias domain. The relevance of the signaling questions (Table 3) depends on the experiment. Review authors need to judge for themselves which of the items could cause bias in their results and should be assessed. In assessing entry 10 (“Was the study apparently free of other risks of bias?”), it is important to pay extra attention to the presence of unit-of-analysis errors. In animal studies, the experimental unit is often not clear, and as a consequence statistical measures are often inaccurately calculated. For example, if mice in a cage are given a treatment in their diet, it is the cage of animals rather than the individual animal that is the experimental unit. After all, the mice in the cage cannot have different treatments, and they may be more similar than mice in different cages.
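The cage example can be made concrete with a short Python sketch. It is an illustration only, not a procedure prescribed by the tool: the function name `cage_means` and the measurements are hypothetical. Per-animal values are collapsed to one mean per cage, so that the sample size used in any later statistical test equals the number of cages rather than the number of animals.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def cage_means(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Collapse (cage_id, value) measurements to one mean value per cage."""
    by_cage = defaultdict(list)
    for cage_id, value in records:
        by_cage[cage_id].append(value)
    # One summary value per cage: the cage, not the animal, is the unit.
    return {cage_id: mean(values) for cage_id, values in by_cage.items()}


# Hypothetical data: five mice housed in two cages receiving the same diet.
measurements = [
    ("cage-A", 1.0), ("cage-A", 3.0),
    ("cage-B", 4.0), ("cage-B", 5.0), ("cage-B", 6.0),
]
per_cage = cage_means(measurements)
print(len(measurements), "animals, but n =", len(per_cage), "experimental units")
```

Analyzing the five animals as if they were independent would overstate the precision of the estimate; analyzing the two cage means does not. Mixed-effects models with cage as a random effect are an alternative that this sketch does not cover.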
Use of SYRCLE’s RoB tool
In order to assign a judgment of low, high or unclear risk of bias to each item mentioned in the tool, we have produced a detailed list with signaling questions to aid the judgment process (Table 3). It is important to emphasize that this list is not exhaustive. We recommend that people assessing the risk of bias of the included studies discuss and adapt this list to the specific needs of their review in advance. A “yes” judgment indicates a low risk of bias; a “no” judgment indicates a high risk of bias; the judgment will be “unclear” if insufficient details have been reported to assess the risk of bias properly.
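The mapping from answers to judgments can be sketched in Python as follows. The yes/no/unclear mapping is taken from the text above; the rule for combining several signaling questions into one item-level judgment (any “no” gives high risk, all “yes” gives low risk, anything else is unclear) is an assumption made for this illustration, and reviewers may well agree on a different rule in advance. The function name `judge_item` is hypothetical.

```python
from typing import Iterable

ALLOWED_ANSWERS = {"yes", "no", "unclear"}


def judge_item(answers: Iterable[str]) -> str:
    """Return 'low', 'high' or 'unclear' risk of bias for one tool item."""
    normalized = [a.strip().lower() for a in answers]
    unknown = set(normalized) - ALLOWED_ANSWERS
    if unknown:
        raise ValueError(f"unexpected answers: {sorted(unknown)}")
    if not normalized:
        return "unclear"  # nothing reported, so nothing can be judged
    if "no" in normalized:
        return "high"     # assumed rule: a single "no" is enough for high risk
    if all(a == "yes" for a in normalized):
        return "low"
    return "unclear"      # some "unclear" answers and no "no"


# Hypothetical answers to the signaling questions of one item.
print(judge_item(["yes", "yes"]))      # low
print(judge_item(["yes", "unclear"]))  # unclear
print(judge_item(["yes", "no"]))       # high
```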
As a rule, assessments should be done by at least two independent reviewers, and disagreements should be resolved through consensus-oriented discussion or by consulting a third person.
We recommend that risk of bias assessment is presented in a table or figure. The investigators can present either the summary results of the risk of bias assessment or the results of all individual studies. Finally, the results of the risk of bias assessment could be used when interpreting the results of the review or a meta-analysis. For instance, sensitivity analysis can be used to show how the conclusions of the review might be affected if studies with a high risk of bias were excluded from the analysis [8, 9].
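A sensitivity analysis of this kind can be sketched in a few lines of Python. The sketch is illustrative only: the study data are invented, the field names (`effect`, `se`, `risk`) are hypothetical, and fixed-effect inverse-variance pooling is an assumption chosen for brevity. A real review would normally use a dedicated meta-analysis package and, in many cases, a random-effects model.

```python
from typing import Dict, List


def pooled_effect(studies: List[Dict]) -> float:
    """Fixed-effect, inverse-variance weighted mean of the study effects."""
    if not studies:
        raise ValueError("no studies to pool")
    weights = [1.0 / (s["se"] ** 2) for s in studies]
    weighted_sum = sum(w * s["effect"] for w, s in zip(weights, studies))
    return weighted_sum / sum(weights)


def sensitivity_analysis(studies: List[Dict]) -> Dict[str, float]:
    """Pooled effect with all studies and with high-risk studies excluded."""
    kept = [s for s in studies if s["risk"] != "high"]
    return {
        "all_studies": pooled_effect(studies),
        "without_high_risk": pooled_effect(kept),
    }


# Invented data: effect estimates, standard errors and overall risk judgments.
studies = [
    {"id": "study-1", "effect": 1.0, "se": 1.0, "risk": "low"},
    {"id": "study-2", "effect": 3.0, "se": 1.0, "risk": "high"},
    {"id": "study-3", "effect": 1.0, "se": 1.0, "risk": "unclear"},
]
result = sensitivity_analysis(studies)
print(result)
```

A marked difference between the two pooled estimates suggests that the conclusions of the review depend on the studies judged to be at high risk of bias.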
We do not recommend calculating a summary score for each individual study when using this tool. A summary score inevitably involves assigning “weights” to specific domains in the tool, and it is difficult to justify the weights assigned. In addition, these weights might differ per outcome and per review.
Inter-observer agreement was evaluated using Kappa statistics. At the time of writing, the Kappa statistics could only be determined for items 1, 6, 7, 8, 9 and 10 and were based on 2 raters in one systematic review including 32 papers. For these items, Kappa varied between 0.59 and 1.0 (item 1: 0.87; item 6: 0.74; item 7: 0.59; item 8: 1.0; item 9: 0.62; item 10: 1.0). Kappa could not be calculated for items 2, 3, 4 and 5, as Kappa is defined for situations with at least two raters and two outcomes, and for these items we had only 1 outcome (unclear risk of bias) as a result of poor reporting.
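For readers unfamiliar with the statistic, the sketch below computes Cohen's Kappa for two raters, defined as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance. We assume here that the two-rater Cohen variant is meant; the function name `cohen_kappa` and the ratings are hypothetical and are not the data from the 32 papers. The sketch also shows why Kappa is undefined when both raters use a single category throughout: p_e then equals 1 and the denominator is zero.

```python
from collections import Counter
from typing import Optional, Sequence


def cohen_kappa(rater1: Sequence[str], rater2: Sequence[str]) -> Optional[float]:
    """Cohen's Kappa for two raters; None when it is undefined."""
    if len(rater1) != len(rater2) or len(rater1) == 0:
        raise ValueError("both raters must judge the same non-empty set of studies")
    n = len(rater1)
    # Observed agreement: proportion of studies with identical judgments.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = set(counts1) | set(counts2)
    p_e = sum(counts1[c] * counts2[c] for c in categories) / (n * n)
    if p_e == 1.0:
        return None  # both raters used one and the same category throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical judgments by two raters for four studies on one item.
rater_a = ["low", "low", "high", "high"]
rater_b = ["low", "low", "high", "low"]
print(cohen_kappa(rater_a, rater_b))  # 0.5

# All judgments "unclear": Kappa cannot be calculated.
print(cohen_kappa(["unclear"] * 5, ["unclear"] * 5))  # None
```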
Discussion and conclusion
In animal studies, a large variety of tools to assess study quality is currently used, but none of the tools identified so far focused on internal validity only. Most instruments assess reporting quality and internal and external validity simultaneously, although the consequences of poor reporting, risk of bias and limited generalizability of the results are very different.
Therefore, we developed SYRCLE’s RoB tool to establish consistency and avoid discrepancies in assessing risk of bias in SRs of animal intervention studies. SYRCLE’s RoB tool is based on the Cochrane RoB tool and has been adjusted for particular aspects of bias that play a role in animal intervention studies. All items in our RoB tool can be justified from a theoretical perspective, but not all items have been validated by empirical research. However, the same holds for the original QUADAS tool (to assess the quality of diagnostic accuracy studies) and the Cochrane RoB tool [8, 16]. For example, in the Cochrane RoB tool, the item on “inadequately addressing incomplete outcome data” is mainly driven by theoretical considerations. In QUADAS, no empirical or theoretical evidence was available for 2 out of the 9 risk of bias items.
Although validation is important, providing empirical evidence for all items in this tool is not to be expected in the near future as this would require major comparative studies, which, to our knowledge, are not currently being undertaken or scheduled. Using the existing animal experimental literature is also challenging because the current reporting quality of animal studies is poor; many details regarding housing conditions or the timing of outcome assessment are often unreported. However, we feel that publishing this tool is necessary to increase awareness of the importance of improving the internal validity of animal studies and to gather practical experience of authors using this tool.
We started to use this tool in our own SRs and hands-on training courses on conducting SRs in laboratory animal experimentation, funded by The Netherlands Organization for Health Research and Development (ZonMW). The first experiences with this tool were positive, and users found SYRCLE’s RoB tool very useful. Inter-rater agreement (Kappa) varied between 0.6 and 1. Users also indicated that they had to judge many entries as “unclear risk of bias”. Although most users did not expect this finding, it is not altogether surprising [21, 22], as a recent survey of 271 animal studies revealed that reporting of experimental details on animals, methods and materials is very poor. We hope and expect, therefore, that use of this tool will improve the reporting quality of essential experimental details in animal studies [23, 24].
Widespread adoption and implementation of this tool will facilitate and improve critical appraisal of evidence from animal studies. This may subsequently enhance the efficiency of translating animal research results into clinical practice. Furthermore, this tool should be tested by authors of SRs of animal intervention studies to assess its applicability and validity in practice. We invite users of SYRCLE’s RoB tool, therefore, to provide comments and feedback via the SYRCLE LinkedIn group (risk of bias subgroup) http://www.linkedin.com/groups?gid=4301693&trk=hb_side_g. As with the QUADAS, CONSORT and PRISMA statements [15, 16, 25, 26], we expect that user feedback and developments in this relatively new field of evidence-based animal experimentation will allow us to update this tool within a few years.
Sandercock P, Roberts I: Systematic reviews of animal experiments. Lancet. 2002, 360 (9333): 586. 10.1016/S0140-6736(02)09812-4.
Hooijmans CR, Rovers M, de Vries RB, Leenaars M, Ritskes-Hoitinga M: An initiative to facilitate well-informed decision-making in laboratory animal research: report of the First International Symposium on Systematic Reviews in Laboratory Animal Science. Lab Anim. 2012, 46 (4): 356-357. 10.1258/la.2012.012052.
Macleod MR, Fisher M, O’Collins V, Sena ES, Dirnagl U, Bath PM, Buchan A, van der Worp HB, Traystman R, Minematsu K, Donnan GA, Howells DW: Good laboratory practice: preventing introduction of bias at the bench. Stroke. 2009, 40 (3): e50-e52. 10.1161/STROKEAHA.108.525386.
Moher D, Cook DJ, Jadad AR, Tugwell P, Moher M, Jones A, Pham B, Klassen TP: Assessing the quality of reports of randomised trials: implications for the conduct of meta-analyses. Health Technol Assess. 1999, 3 (12): i-iv, 1-98.
Schulz KF, Chalmers I, Hayes RJ, Altman DG: Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995, 273 (5): 408-412. 10.1001/jama.1995.03520290060030.
Hooijmans CR, Pasker-de Jong PC, de Vries RB, Ritskes-Hoitinga M: The effects of long-term omega-3 fatty acid supplementation on cognition and Alzheimer’s pathology in animal models of Alzheimer’s disease: a systematic review and meta-analysis. J Alzheimers Dis. 2012, 28 (1): 191-209.
Wever KE, Menting TP, Rovers M, van der Vliet JA, Rongen GA, Masereeuw R, Ritskes-Hoitinga M, Hooijmans CR, Warle M: Ischemic preconditioning in the animal kidney, a systematic review and meta-analysis. PLoS One. 2012, 7 (2): e32296. 10.1371/journal.pone.0032296.
Thayer K, Rooney A, Boyles A, Holmgren S, Walker V, Kissling G: Draft protocol for systematic review to evaluate the evidence for an association between bisphenol A (BPA) exposure and obesity. National Toxicology Program. 2013, U.S. Department of Health and Human Services.
Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J: The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol. 2003, 3: 25. 10.1186/1471-2288-3-25.
Kilkenny C, Parsons N, Kadyszewski E, Festing MF, Cuthill IC, Fry D, Hutton J, Altman DG: Survey of the quality of experimental design, statistical analysis and reporting of research using animals. PLoS One. 2009, 4 (11): e7824. 10.1371/journal.pone.0007824.
Beynen ACG,K, van Zutphen LFM: Standardization of the animal and its environment. In Principles of Laboratory Animal Science, Revised Edition. Edited by van Zutphen LFMB V, Beynen AC. Amsterdam and New York: Elsevier B.V.; 2001.
Perel P, Roberts I, Sena E, Wheble P, Briscoe C, Sandercock P, Macleod M, Mignini LE, Jayaram P, Khan KS: Comparison of treatment effects between animal experiments and clinical trials: systematic review. BMJ. 2007, 334 (7586): 197. 10.1136/bmj.39048.407928.BE.
Roberts I, Kwan I, Evans P, Haig S: Does animal experimentation inform human healthcare? Observations from a systematic review of international animal experiments on fluid resuscitation. BMJ. 2002, 324 (7335): 474-476. 10.1136/bmj.324.7335.474.
Faggion CM, Giannakopoulos NN, Listl S: Risk of bias of animal studies on regenerative procedures for periodontal and peri-implant bone defects - a systematic review. J Clin Periodontol. 2011, 38 (12): 1154-1160. 10.1111/j.1600-051X.2011.01783.x.
Hooijmans CR, de Vries RB, Rovers MM, Gooszen HG, Ritskes-Hoitinga M: The effects of probiotic supplementation on experimental acute pancreatitis: a systematic review and meta-analysis. PLoS One. 2012, 7 (11): e48811. 10.1371/journal.pone.0048811.
Hooijmans CR, Leenaars M, Ritskes-Hoitinga M: A gold standard publication checklist to improve the quality of animal studies, to fully integrate the Three Rs, and to make systematic reviews more feasible. Altern Lab Anim. 2010, 38 (2): 167-182.
Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I, Pitkin R, Rennie D, Schulz KF, Simel D, Stroup DF: Improving the quality of reporting of randomized controlled trials. The CONSORT statement. JAMA. 1996, 276 (8): 637-639. 10.1001/jama.1996.03540080059030.
Clough G: Environmental factors in relation to the comfort and well-being of laboratory rats and mice. Standards in Laboratory Animal Management. 1984, Wheathampstead: Universities Federation for Animal Welfare (UFAW), 1: 7-24.
Bruguerolle B, Valli M, Bouyard L, Jadot G, Bouyard P: Effect of the hour of administration on the pharmacokinetics of lidocaine in the rat. Eur J Drug Metab Pharmacokinet. 1983, 8 (3): 233-238. 10.1007/BF03188753.
Marrino P, Gavish D, Shafrir E, Eisenberg S: Diurnal-variations of plasma-lipids, tissue and plasma-lipoprotein lipase, and VLDL secretion rates in the rat - a model for studies of VLDL metabolism. Biochim Biophys Acta. 1987, 920 (3): 277-284. 10.1016/0005-2760(87)90105-6.
The development of SYRCLE’s RoB tool was partly funded by the Ministry of Health, Welfare and Sport of the government of the Netherlands (grant nr: 321200). The views expressed in this article are those of the authors and not necessarily those of the funder.
Authors and Affiliations
SYRCLE at Central Animal Laboratory, Radboud University Medical Center, Nijmegen, The Netherlands
Carlijn R Hooijmans, Rob BM de Vries, Marlies Leenaars & Merel Ritskes-Hoitinga
Centre of Evidence-based Surgery, Radboud University Medical Center, Nijmegen, The Netherlands
Maroeska M Rovers
Dutch Cochrane Centre, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
All authors declare that none of the authors (Hooijmans, Rovers, de Vries, Leenaars, Ritskes-Hoitinga, Langendam) have anything to disclose or any competing interests. The authors had no support from any organization for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous three years, no other relationships or activities that could appear to have influenced the submitted work.
CRH coordinated the project. CRH, MWL and MMR have made substantial contributions to the design of the RoB tool, article conception and wrote the first draft of the paper. MRH, ML and RdV provided advice on bias in animal studies (part of the discussion group). MRH and RdV revised the manuscript. All authors read and approved the final manuscript.
Additional file 1: A pilot survey to provide some supportive information for some of the statements made in Table 1. (DOCX 48 KB)
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.