Quality assessment tools used in systematic reviews of in vitro studies: A systematic review
BMC Medical Research Methodology volume 21, Article number: 101 (2021)
Systematic reviews (SRs) and meta-analyses (MAs) are commonly conducted to evaluate and summarize medical literature. This is especially useful in assessing in vitro studies for consistency. Our study aims to systematically review all available quality assessment (QA) tools employed on in vitro SRs/MAs.
A search of four databases (PubMed, Scopus, Virtual Health Library, and Web of Science) was conducted, covering 2006 to 2020. The available SRs/MAs of in vitro studies were evaluated. The DARE tool was applied to assess the risk of bias of the included articles. Our protocol was developed and uploaded to ResearchGate in June 2016.
Our findings showed an increasing trend in the publication of in vitro SRs/MAs from 2007 to 2020. Among the 244 included SRs/MAs, 126 articles (51.6%) had conducted the QA procedure. Overall, 51 QA tools were identified; 26 of them (51%) were developed specifically by the authors, whereas 25 (49%) were pre-constructed tools. SRs/MAs in dentistry frequently used a QA tool developed by their own authors, while SRs/MAs on other topics applied various QA tools. Many pre-structured tools in these in vitro SRs/MAs were modified from QA tools for in vivo studies or clinical trials; therefore, their criteria varied.
Many different QA tools currently exist in the literature; however, none cover all critical aspects of in vitro SRs/MAs. There is a need for a comprehensive guideline to ensure the quality of SR/MA due to their precise nature.
Evidence-based medicine (EBM) is a reliable and accurate approach based on existing evidence in healthcare-related research. Systematic reviews (SRs) and meta-analyses (MAs) are crucial methods of EBM that assess the findings of different works in the medical literature on related topics. The data and conclusions of each work are synthesized to present a comprehensive summary and conclusion based on the findings [2, 3]. The primary medical value of conducting such studies is to improve healthcare delivery and outcomes in the clinical setting. Researchers can utilize these tools to summarize clinical research with a non-biased approach to even the most controversial topics [4, 5]. For in vitro studies, being able to translate and keep track of numerous research projects that address the same topic increases transparency and addresses the significance of clinical translation. They eventually enhance the safety and efficacy of treatments in clinical practice [6, 7]. The current discrepancy between SRs/MAs on preclinical studies and SRs/MAs on clinical studies suggests a potential gap in the assessment and evaluation of preclinical evidence. This may lead to inadequate translation to clinical evidence [6, 8]. Therefore, further research is justified to address any possible shortfalls in the methodology of performing such study types.
The precise nature of scientific discoveries, combined with the increasing influx of research papers, highlights the importance of QA tools for all published articles. The development of QA tools aims to improve the quality of scientific reports by addressing unethical and misconducted research studies [9,10,11]. There are several methods to assess the quality of SRs/MAs, including Assessment of Multiple Systematic Reviews (AMSTAR) and Critical Appraisal Skills Programme (CASP). The majority of SRs/MAs use one of the many available approaches. However, the content and weights of different tools are variable and inconsistent, which raises the question of whether a universally accepted tool should be developed. ToxRTool and Oral Health Assessment Tool (OHAT) are two examples of the National Health and Medical Research Council’s recommendation for SRs/MAs of in vitro studies. Both tools cover different aspects of risk of bias, providing researchers with a partial guide while conducting or assessing these studies.
In vitro SRs/MAs are important to the advancement of research, and their results support work in both experimental and clinical settings. Further investigations are required to bridge any potential inadequacies in the methods of synthesizing preclinical evidence. The selection of high-quality QA tools for SRs/MAs on preclinical studies can improve research quality by directly addressing methodology. Therefore, our study aimed to evaluate all QA tools used in SRs/MAs of in vitro studies.
Our study followed the steps recommended in the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) checklist (Table S1) [13, 14]. The protocol was published on ResearchGate in June 2016 (DOI: https://doi.org/10.13140/RG.2.1.1515.9925). Because this work concerns in vitro studies, the protocol could not be registered on the International Prospective Register of Systematic Reviews (PROSPERO).
Our search comprised two phases. The first phase was performed in October 2016 to identify the SRs/MAs of in vitro studies using four electronic databases: PubMed, Web of Science (ISI), Virtual Health Library (VHL), and Scopus. This search was later updated in September 2020. The entire search strategy for these databases is provided in Table S2. No restriction was applied within the publication date range from 2006 to 2020. This procedure was also based on a previously performed search.
The second phase was conducted to identify potential tools using the Google search engine (www.google.com). The automatic Google filter was switched off, and the first 300 links of each search term were screened for relevant tools. In addition, the bibliographic references of the first 300 Google links, and of the publications eventually included from them, were searched to find additional tools not identified by the database search. The search strategy is described in Table S2. We used this search model as described before in Nolger et al. Several webpages were also used to search for and identify the relevant QA tools (Table S2).
The inclusion criteria of our study were: (1) the SRs and/or MAs must cover purely in vitro research; (2) the publication year must fall between 2006 and 2020; (3) an original in vitro study was defined as one using a technique conducted in a controlled condition outside the living organism, without re-implantation into the living body or organism. The exclusion criteria were: (1) all SRs and/or MAs that involved in vivo studies; (2) combined in vivo and in vitro studies. After duplicate removal using the EndNote X7 program (Thomson Reuters, USA), the titles and abstracts were screened by two reviewers (DNHT and TD) independently, following the inclusion and exclusion criteria. Afterward, full texts of the selected articles were divided into several clusters, and each one was evaluated by another two reviewers (TD and AE) working independently. Results were then gathered, and in the case of inconsistency, a final decision was reached following discussion with the supervisor (NTH). We included all QA tools that were used in the included articles. For the second phase, all potential QA tools were included if addressed or proposed in any in vitro studies.
Two independent reviewers (DNHT and TD) extracted the data from included articles into a specifically designed template using the same method as in the screening phase. The extracted items included the names of authors, year and region of publication, the involvement of a methodologist or statistician, whether a meta-analysis was conducted, and whether the risk of bias assessment was undertaken. To extract the QA tool used in each included study, we used the name of the tool to find its original paper, from which the tool's details were then extracted. The QA tools included in our study were all tools that the authors applied to the included articles in their respective SRs/MAs.
For each confirmed relevant tool, we collected the following components: type of the tool (scale, checklist, or item), number of items and main contents of the tool, the scoring system, description of formulation, whether the tool was developed for generic use in SRs/MAs or for single use in a specific SR/MA (in a particular type of in vitro studies), and whether the tool was developed by the authors themselves or was pre-structured. Reviewers resolved any disagreement via discussion. If a decision could not be reached, the supervisor (NTH) was consulted to reach a consensus.
Quality assessment (QA)
The QA of the included publications was carried out using the Database of Abstract of Reviews of Effects (DARE) tool. It consists of five criteria: (i) were inclusion/exclusion criteria reported; (ii) was the search adequate; (iii) was the quality of the included studies assessed; (iv) are sufficient details about the individual included studies present; and (v) were the included studies synthesized. The interpretation for assigning a “yes,” “partial,” or “no” score is described in Figure S1. The DARE tool has been used in a tertiary study to evaluate the included SRs/MAs. Two independent reviewers (LT and TD) performed the QA process, and any dispute was resolved via discussion. If a decision could not be reached, the supervisor (NTH) was consulted to reach a consensus. A Kappa coefficient was calculated to determine the agreement between the examiners in each process.
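As an illustration of the agreement statistic mentioned above, the following is a minimal sketch of Cohen's kappa for two reviewers who rate the same items as "yes", "partial", or "no". The function name and the example ratings are hypothetical and are not taken from the study data; the study does not state which software or kappa variant was used, so the unweighted form is assumed here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: for each label, the product of the two raters' marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of six reviews on one DARE criterion (not study data).
reviewer_1 = ["yes", "yes", "partial", "no", "yes", "no"]
reviewer_2 = ["yes", "partial", "partial", "no", "yes", "yes"]
print(round(cohen_kappa(reviewer_1, reviewer_2), 3))
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement when one category (for example "yes") dominates.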
The first phase retrieved 11,757 initial reports from four electronic databases, including PubMed, ISI, Scopus, and VHL, as shown in Fig. 1. After duplicate removal, the titles and abstracts of 11,640 publications were screened. From this, only 343 studies were included for full-text screening, with 244 articles reaching final eligibility. The list of 99 excluded reports with exclusion reasoning was provided in Table S3. The publication date of all included SRs/MAs ranged from 2007 to 2020. The second phase search retrieved 3000 links in the Google search engine. We screened the first 300 links for each one of ten search terms used. We included 32 links for full-text screening, of which 29 were deemed irrelevant links and were excluded. This left three tools eligible for inclusion in phase two in our study.
Characteristics of included in vitro SRs/MAs
Among the 244 in vitro SRs/MAs included in the analysis, 150 articles (60.7%) employed guidelines for SRs/MAs. Of these, 146 articles used the PRISMA checklist, a single study followed the Quality of Reporting of Meta-analyses (QUOROM) checklist, and only one study followed Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) and the Oral Health Assessment Tool (OHAT) (Table 1). Only 100 of the articles that followed guidelines for SRs/MAs reported their QA results. The list of the 244 included in vitro SRs/MAs and the QA tools they used is provided in Table S4.
Among the 244 included studies, 126 articles (51.6%) performed QA. Only 26 of the 126 articles developed their own QA tools while conducting their reviews, whereas 100 articles employed available tools. Of these, 34 studies followed a QA checklist previously developed by other authors. The others assessed the risk of bias following pre-structured QA tools.
Regarding the distribution of the included studies by continent, Europe had the largest representation with 99 studies (40.7%), while 65 (26.6%), 14 (5.7%), 27 (11.1%), and 31 (12.7%) were from South America, North America, the Middle East, and Asia, respectively. Three studies (1.2%) were from Africa, and five studies (2%) were from Australia. Table S3 provides the characteristics of all included SRs/MAs.
The number of published in vitro SRs/MAs rose slowly from 2007 to 2014 and then increased rapidly in the following years until 2020. Of the 244 included articles, 126 (51.6%) conducted methodological QA. Although no SR/MA performed QA in 2007 and 2008, the proportion of included in vitro SRs/MAs performing QA increased steadily during the search period.
QA results of included studies using the DARE tool
The DARE tool was used to evaluate the 244 included studies; its five criteria are presented in Table 2 along with the Kappa index and the level of agreement. The inclusion and exclusion criteria were reported in 220 of the included studies (90.2%), while 22 studies reported them partially (9%), and two papers did not report this criterion (0.8%). Search coverage was reported in 122 of the included studies (50%), while 115 studies reported it partially (47.1%), and seven studies did not report this criterion (2.9%). Among the 244 included studies, QA tools were reported in 126 studies (51.6%); meanwhile, three articles partially performed QA (1.2%), and 115 studies did not assess the quality of their included studies (47.2%). The study description and study synthesis criteria were also evaluated, and the level of agreement was critical for both. The QA results for the five DARE assessment criteria are provided in Fig. 2 and Table S4.
Summary of QA tools
We identified 51 different available QA tools. Of these, 48 tools from the first phase were retrieved from the included studies, and three tools from the second phase were found through the Google search engine: IVD (in vitro diagnosis), artificial rumen system, and OHAT. We found that 26 tools (51%), all from the first phase, were developed by the authors [20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45], while the other 25 tools (49%) were pre-structured, comprising 22 tools from the first phase and three tools from the second phase. Among the 26 QA tools developed by the authors, 20 tools (76.9%) specialized in dentistry studies, whereas two tools (7.7%) were applied in methodology, two tools (7.7%) in bioactivity studies, and two tools (7.7%) in biology studies. Among tools developed by the authors, 17 tools (65.38%) [20, 21, 23, 26,27,28,29,30,31,32,33,34, 39, 40, 42, 44, 45] contained items that could be used only in specific fields (mainly dentistry), while nine tools (34.62%) [22, 24, 25, 35,36,37,38, 41, 43] provided criteria for general reviews, as shown in Table 3. Tools used for a specific study often contained unique factors directly relating to the test materials and outcomes in the reviews. Examples include teeth free of caries, specimen preparation, specimen dimension, enamel antagonist, specimen shape, concentration of enzymes, storage condition of the sample, and the devices used. The authors were also highly concerned with methodological bias that could affect the reliability of outcomes, namely sample size calculation, randomization of samples, blinding of the examiner, and the appropriate form of statistical analysis.
In contrast, tools used for general SRs/MAs evaluated the reliability of the methodology for reporting results in general, or consisted of items assessing each step of a study (objective, sequence generation, blinding, selection bias, detection bias, performance bias, reporting bias). The largest group of these tools (11 tools, 42.3%) were simple checklists. These tools only contained questions requiring the answers “yes”, “no” or “not reported.” The overall bias could be decided by the number of “yes” or “no” answers. Seven of the tools developed by the authors (26.9%) were checklists with judgment, containing multiple items that required the assessors to provide their assessment in detail and compare it between studies. Finally, eight scale tools (30.8%) rated the quality of each item at varied levels by assigning points; for instance, reported answer = 1 point, not reported answer = 0 points. There was also a tool that rated the quality of each domain at different levels (0–4 points). The summary score of each study was then classified as high, low, or unclear risk of bias.
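The difference between a simple checklist and a scale can be sketched in a few lines. The example below is illustrative only: the function names, the example answers, and in particular the cut-offs used to turn a summary score into "low", "unclear", or "high" risk of bias are assumptions, since each reviewed tool defined its own rules and no single threshold is given in the text.

```python
from collections import Counter

ALLOWED_ANSWERS = ("yes", "no", "not reported")


def score_checklist(answers):
    """Simple checklist: tally the 'yes', 'no' and 'not reported' answers."""
    for answer in answers:
        if answer not in ALLOWED_ANSWERS:
            raise ValueError(f"Unexpected answer: {answer!r}")
    tally = Counter(answers)
    return {answer: tally[answer] for answer in ALLOWED_ANSWERS}


def score_scale(points, max_per_item=1):
    """Scale: sum the points per item and return (total, maximum possible)."""
    if any(p < 0 or p > max_per_item for p in points):
        raise ValueError("Each item must score between 0 and max_per_item")
    return sum(points), max_per_item * len(points)


def classify_risk(fraction, low_cutoff=0.7, high_cutoff=0.4):
    """Map the achieved fraction of points to a risk-of-bias label.

    The cut-offs are hypothetical placeholders, not values from any reviewed tool.
    """
    if fraction >= low_cutoff:
        return "low"
    if fraction < high_cutoff:
        return "high"
    return "unclear"


# Hypothetical study assessed with a four-item tool (not study data).
print(score_checklist(["yes", "no", "yes", "not reported"]))
total, maximum = score_scale([1, 0, 1, 1])  # reported = 1 point, not reported = 0
print(total, maximum, classify_risk(total / maximum))
```

A tool that rates each domain from 0 to 4 points can be handled by the same scale function with `max_per_item=4`.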
In contrast, of the 25 pre-structured tools, 20 tools (80%) were used for general SRs/MAs. The exceptions were QA for IVD, a tool for in vitro studies using artificial rumen, a tool that specialized in studies on cell lines, and two tools for the evaluation of toxicological/ecotoxicological data [48, 49]. Overall, there were four simple checklists, six checklists with judgment, and 15 scales (Table 3). The IVD and artificial rumen tools are checklists with conclusions. These assessments relied entirely on the examiners' checks against the available criteria. The tool for IVDs suggested validations relating to their technical characteristics, namely technical specifications acceptable for registry purposes, the format of the technical file, manufacturers, proper distribution, and cost-effectiveness.
Similarly, the validation established for experiments with artificial rumen focused on specific criteria via the assessment of microorganisms, dividing protozoa, incubation periods, digestion, and the interaction between the chemicals used. Meanwhile, the tool specializing in cell lines (World Cancer Research Fund, University of Bristol) also highlighted the cell line characteristics, the number of experimental repetitions, and selective reporting of outcomes. The first tool for toxicological/ecotoxicological information was developed by Klimisch et al. The criteria focused entirely on factors affecting the results, namely the test substances (their purity/origin/composition, their concentration/doses), the test systems (their suitability, the physical and chemical characteristics of the medium, negative/positive controls), and the method used to measure the results (appropriate statistical method). These authors suggested four levels of quality: reliable without restriction, reliable with restriction, not reliable, and not assignable. However, this approach did not include specific guidance for the quality evaluation. In 2009, Schneider et al. developed a more detailed tool named ToxRTool, based on the suggestion of Klimisch et al., to address this flaw. The ToxRTool for in vitro SRs/MAs included 18 questions evaluating the test substances, test system, study design description, study results documentation, and plausibility of study design and data. For each criterion reported, the study gets one point. The summary score initially determines its level of quality. However, Schneider et al. indicated that some critical criteria would downgrade the overall level if the study did not report them. The evaluators give their decision after considering both the summary score and the answers to the critical questions.
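The two-step logic described for ToxRTool (a summary score, followed by a check of the critical criteria) can be sketched as follows. This is a hedged illustration rather than the published tool: the score cut-offs, the positions of the critical questions, and the rule that any unmet critical criterion moves the study to the lowest level are all assumptions made for the example, and the function name is hypothetical.

```python
LEVELS = {
    1: "reliable without restriction",
    2: "reliable with restriction",
    3: "not reliable",
}


def rate_study(reported, critical_items, cutoffs=(15, 11)):
    """Rate a study from 18 yes/no criteria, one point per reported criterion.

    reported       -- list of 18 booleans, True if the criterion is reported
    critical_items -- indices of the critical questions (hypothetical positions)
    cutoffs        -- (minimum score for level 1, minimum score for level 2);
                      illustrative values, not taken from the source text
    """
    if len(reported) != 18:
        raise ValueError("Expected answers to 18 questions")
    score = sum(1 for answer in reported if answer)
    # Step 1: the summary score gives the initial level.
    if score >= cutoffs[0]:
        level = 1
    elif score >= cutoffs[1]:
        level = 2
    else:
        level = 3
    # Step 2: an unmet critical criterion downgrades the study
    # (assumed here to mean the lowest level, whatever the score).
    if not all(reported[i] for i in critical_items):
        level = 3
    return score, LEVELS[level]


# Hypothetical study: 17 of 18 criteria reported, but one critical item missing.
answers = [False] + [True] * 17
print(rate_study(answers, critical_items=[0, 1]))
```

The point of the second step is that a high total cannot compensate for a missing essential item, which is why evaluators consider the critical questions separately from the sum.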
The 20 pre-structured tools for general reviews emphasized bias arising from the detection or selection of samples, the balance of baseline characteristics, the completeness of reported outcomes, and the sequence generation. Two of these tools (EBM Evidence Pyramid and GRADE tool) were wrongly used as assessment tools of methodological quality or risk of bias. Specifically, Xiao et al. used the EBM Evidence Pyramid to evaluate methodological quality, while Pavan et al. used the GRADE tool to assess the risk of bias of their included studies. However, we still retained them as exceptional cases of QA tools applied by other authors of SRs/MAs in our research. Among these 20 pre-structured tools, the QA tool referring to CRH and the EBM Evidence Pyramid might be classified as the most straightforward checklist. This tool has four levels and defines the grades of quality based on the study design (SRs/MAs of in vitro studies = A) and baseline characteristics (comparable baseline = B, unknown baseline = C, no similar baseline = D). However, it is inappropriate to evaluate the methodology of an SR/MA based only on the baseline characteristics. The GRADE tool is a tool to grade the quality of evidence (strong to low quality), which consists of six other domains (study design, inconsistency, indirectness, limitations, imprecision, and publication bias) used to adjust (downward or upward) the initial assessment of quality. Therefore, the GRADE tool instructs the authors on defining the critical outcomes and evaluating the quality of such results rather than assessing the study’s risk of bias.
The remaining 18 tools included both checklists with questions needing yes/no answers and lists of domains requiring the assessor’s opinion; their questions fall into the following domains: rationale of the study, samples, randomization, blinding, procedures, reported outcomes, discussion evaluation, and other bias (Table 4). The criteria were highly varied. The most popular criterion, appropriate analysis, was mentioned in Cochrane Collaboration, Joanna Briggs Institute Clinical Appraisal Checklist for Experimental Studies, Timmer’s Analysis Tool, and OHAT. Other principal criteria are the description of data collection, the blinding of samples and investigators/assessors, appropriate method, reporting of all outcomes mentioned in the method, and reporting of missing data. These criteria were each mentioned by three tools in Table 4. The less highlighted criteria include the reasonable sample size, the appropriate method of data collection, the representative samples, the balanced baseline characteristics between intervention groups, the detailed sample data, the randomization of allocation sequence, the assurance that samples received the proper procedure, the appropriate control/reference standards, and the adjusted confounders. Finally, the criteria rated by only one tool are the rationale of the study, the description of the sample collection tool, the description of controls/reference standards, the adequate randomization, the blinding of allocation sequence, the full description of procedures, the identical approach between groups, the description of control/reference standard, the replication, the justification of the analysis method, the similar research between groups, the report of complete data, no selection of reported results, the report of intermediate results, and the requirement of reflection in a clinical trial.
The publication of in vitro SRs/MAs has recently been increasing. This was demonstrated by the number of SRs/MAs found in our study along with the recent surge of novel QA tools for in vitro studies [48, 53]. QA plays a critical role in every SR/MA to judge the methodology and reliability, reduce the risk of bias, and strengthen the evidence and recommendations taken from such reviews. However, although the percentage of papers in our sample that reported QA procedures followed an increasing trend, it reached only 42% in 2016. This was relatively low compared with other areas; for randomized clinical trials, for example, the Cochrane tool was reported as used in 100% of Cochrane reviews.
In our study, a total of 51 tools were identified from the two phases, of which 48 tools were used by authors of the included studies and three tools were found via the Google search engine. There were 26 articles in which the authors developed their own tools for methodological assessment, and the other tools were pre-structured. Almost every study used a different QA tool, which imposed several challenges that might restrict the consistent, reliable, and integral appraisal of SRs/MAs. Most QA tools in SRs/MAs on dentistry topics were developed by the authors. These were mainly methodological QA tools that focused primarily on materials and standard procedures in dentistry. Also, the majority of dentistry SRs/MAs followed the criteria previously proposed by Onofre et al. This implied that the criteria for assessing the methodology in dental procedures were standard and could be applied widely in different SRs/MAs. Only a few dental SRs/MAs followed the QA process as instructed in pre-structured tools such as Cochrane, CASP, and MINORS [65,66,67,68,69].
Regarding SRs/MAs relating to toxicology and diagnosis, the applied tools were consistent, since these QA tools specialized in each specific topic. For instance, ToxRTool was applied in toxicological studies and QUADAS-2 was used in diagnosis studies. We also recommend these tools for SRs/MAs of these particular topics because their criteria mainly cover the essential aspects of the included studies. SRs/MAs of other subjects (biology, bioactivity, and materials) assessed the quality of included studies primarily by pre-structured tools. However, the QA tools they selected varied considerably. Both methodological and reporting QA tools were applied. The authors had to modify these tools for their in vitro SRs/MAs since the tools were initially designed for in vivo studies (SYRCLE tool) or clinical trials (MINORS, CONSORT checklist, Cochrane risk of bias tool). As a result, several inappropriate criteria that were applicable only to clinical trials or in vivo studies were included in the quality assessment. Only a few SRs/MAs developed their own criteria for these assessments [23, 70, 71]. This variety of QA tools makes it difficult for researchers to select the most suitable tool. Also, the inconsistency of items covered by each tool will affect the review process, especially when a specific tool omits items that reflect essential information on the study under assessment. This mostly occurred in methodological QA tools since no current tool covers all methodological aspects for all topics. Therefore, this emphasizes the importance of reaching a consensus among researchers regarding QA tools for in vitro SRs/MAs. Because a tool may include several inappropriate criteria while lacking several critical ones, there is a need for consensus on the essential aspects that should be included in a QA tool for in vitro SRs/MAs, as well as on the removal of unnecessary criteria.
Given the lack of a standard QA tool for in vitro studies, several authors tend to develop a checklist that suits the needs of their project. Passos et al., Altmann et al., and Goldbach et al. have developed different sets of QA tools that are either generic or serve specific purposes. Sarkis-Onofre et al. [26, 72,73,74,75,76,77] also developed several scales for single use in a particular context. Their tool consisted of seven items and was used by other in vitro studies of the same topic in our sample. The items most commonly used among the seven studies, including sample size calculation and randomization of ceramic specimens/teeth, pointed to a specific field of in vitro research. A highlight of the tools developed by authors while conducting their studies is their technical character. These assessments help the authors evaluate the exact level of quality of included studies through relevant factors that affect the reliability of these studies (including the devices used, the appropriate medium, or specimens). However, the tools missed non-technical factors causing bias, such as the integrity of reported data or the appropriate analysis method. Three other tools [22, 24, 25], which were developed by authors but covered different aspects of a study, appeared to evaluate their included studies comprehensively across reporting, performance, selection, and detection criteria. However, the guidance for using these tools was not clarified in these reviews, as they rated these criteria only on the basis of their own research question, which limits their usefulness for other reviews aiming to adopt their approaches. The specificity of these tools limits their use on a broader scale, and their variability makes it easy for a researcher to overlook specific tools when searching the literature.
ToxRTool, which was known as criteria for reporting and evaluating ecotoxicity data, and other standardized tools were developed to provide more detailed and transparent evaluation systems [78,79,80]. ToxRTool has become widely used in toxicological research, and there are similar tools adapted from it. However, the consistency of ToxRTool has some limitations and requires some refinements. ToxRTool was designed to evaluate toxicological data. Therefore, it contains many questions relating to the test substance and test systems that might not be important for other in vitro studies, for instance, the purity and source of the test substance.
Moreover, the lack of a positive control would reduce a study's quality score. However, not all in vitro studies have a positive control. This decision depends on the research questions and the purposes of the study. Several other crucial factors for general in vitro studies were neglected in this tool, such as the description of sample collection or a suitable sample size. For these reasons, ToxRTool might be the best choice for toxicological studies. Nevertheless, if it were used for reviews of in vitro studies in other fields, it could not assess the appropriate levels of quality for included studies.
Amongst the pre-structured tools for general reviews included in our study, the EBM Evidence Pyramid and GRADE tool were wrongly applied to assess the methodology and risk of bias. We do not recommend that other authors use these tools for this purpose in their SRs/MAs in the future. On the other hand, it seems that no tool covers nearly all the essential criteria, such as sample size, the procedure of sample collection, negative/positive controls, randomization, blinding, analysis, data completeness, and missing data. For instance, the justification of sample size, the full description of experimental procedures, and the appropriate positive/negative controls were included for assessment in only a limited number of tools.
Similarly, the report of missing data and the completeness of data reporting are also important since they relate to reporting bias, which leads to an inaccurate reflection of in vitro results in clinical trials. But these criteria are mentioned in only one tool. In contrast, for in vitro studies, the blinding of the sample (participant) is unnecessary since this kind of sample could not cause reporting bias. However, many tools request this criterion because they were modified from QA tools for in vivo studies or clinical trials. Because of this gap, whichever tool the authors use for their reviews, the assessment might not reflect the proper level of quality of included studies. It depends on the authors’ perspective to determine which tool is more relevant to their research questions. Frequently, the authors need to combine several tools to obtain the most suitable one for their study. But this could not be applied widely if no detailed guidance is developed. Three tools mentioned randomization: Cochrane Collaboration and Timmer’s Analysis Tool suggested the randomization of allocation sequence, while OHAT recommended assessing whether the randomization was adequate. This is a challenge for in vitro reviews since randomization is difficult to apply to in vitro materials. Unfortunately, no in vitro review in our study clarified how it determined randomization in its included studies. We do not deny the critical role of randomization in reducing selection bias and the erroneous results caused by different characteristics of samples. However, we suggest that additional QA tools for in vitro studies should focus on criteria that assure identical attributes in the studied samples. This is easier to achieve than specifying a method of randomization. Finally, we agree that a QA tool for in vitro studies should include the blinding of investigators and assessors. This can reduce the bias caused by the researchers' ability to predict the results.
Our study had certain limitations. Despite low probability, we applied the restrictions to our search during 14 years, excluding in vivo studies and combined studies. In addition, 300 links screening from Google search engine might result in a missed tool or guideline. Also, we included all tools applied by the authors in their SRs/MAs that led to forming some inappropriate QA tools (GRADE tool and EBM Evidence Pyramid). However, we discussed this problem and did not recommend it in other SRs/MAs of in vitro studies.
Many different QA tools are currently available in the literature. However, none covers all critical aspects of in vitro SRs/MAs. Thus, a comprehensive guide should be developed to address all significant concerns and aspects of this field. Such a guide could increase the transparency and reproducibility of scientific work, boosting the reliability and validity of available in vitro findings. This study serves as an initial step towards these targets by summarizing the QA tools that are commonly used in the literature and by pointing out potential improvements to be adopted in the future.
Availability of data and materials
The corresponding authors will provide the datasets used in this study upon reasonable request.
Abbreviations
AMSTAR: Assessment of Multiple Systematic Reviews
CASP: Critical Appraisal Skills Programme
CRD: Centre for Reviews and Dissemination
DARE: Database of Abstracts of Reviews of Effects
IVD: In vitro diagnosis
JBI: Joanna Briggs Institute
NICE: National Institute for Clinical Excellence
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
QUOROM: Quality of Reporting of Meta-analyses
SIGN: Scottish Intercollegiate Guidelines Network
VHL: Virtual Health Library
Manchikanti L. Evidence-based medicine, systematic reviews, and guidelines in interventional pain management, part I: introduction and general considerations. Pain Physician. 2008;11(2):161–86.
Lau J, Ioannidis JP, Schmid CH. Summing up evidence: one answer is not always enough. Lancet. 1998;351(9096):123–7. https://doi.org/10.1016/S0140-6736(97)08468-7.
Oxman AD, Schünemann HJ, Fretheim A. Improving the use of research evidence in guideline development: 8. Synthesis and presentation of evidence. Health Res Policy Syst. 2006;4(20). https://doi.org/10.1186/1478-4505-4-20.
Swennen MH, van der Heijden GJ, Boeije HR, van Rheenen N, Verheul FJ, van der Graaf Y, et al. Doctors' perceptions and use of evidence-based medicine: a systematic review and thematic synthesis of qualitative studies. Acad Med. 2013;88(9):1384–96. https://doi.org/10.1097/ACM.0b013e31829ed3cc.
Gallagher EJ. Systematic reviews: a logical methodological extension of evidence-based medicine. Acad Emerg Med. 1999;6(12):1255–60. https://doi.org/10.1111/j.1553-2712.1999.tb00142.x.
Ritskes-Hoitinga M, Leenaars M, Avey M, Rovers M, Scholten R. Systematic reviews of preclinical animal studies can make significant contributions to health care and more transparent translational medicine. Cochrane Database Syst Rev. 2014;3:ED000078.
Sena ES, Currie GL, McCann SK, Macleod MR, Howells DW. Systematic reviews and meta-analysis of preclinical studies: why perform them and how to appraise them critically. J Cereb Blood Flow Metab. 2014;34(5):737–42. https://doi.org/10.1038/jcbfm.2014.28.
Howells DW, Sena ES, Macleod MR. Bringing rigour to translational medicine. Nat Rev Neurol. 2014;10(1):37–43. https://doi.org/10.1038/nrneurol.2013.232.
Brouwers MC, Johnston ME, Charette ML, Hanna SE, Jadad AR, Browman GP. Evaluating the role of quality assessment of primary studies in systematic reviews of cancer practice guidelines. BMC Med Res Methodol. 2005;5(1):8. https://doi.org/10.1186/1471-2288-5-8.
Shamliyan T, Kane RL, Jansen S. Quality of systematic reviews of observational nontherapeutic studies. Prev Chronic Dis. 2010;7(6):A133. PMID: 20950540; PMCID: PMC2995597.
Wong WC, Cheung CS, Hart GJ. Development of a quality assessment tool for systematic reviews of observational studies (QATSO) of HIV prevalence in men having sex with men and associated risk behaviours. Emerg Themes Epidemiol. 2008;5(23). https://doi.org/10.1186/1742-7622-5-23.
National Health and Medical Research Council. Assessing risk of bias. Available from: https://www.nhmrc.gov.au/guidelinesforguidelines/develop/assessing-risk-bias. Accessed 20 Aug 2020.
Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol. 2009;62(10):e1–34. https://doi.org/10.1016/j.jclinepi.2009.06.006.
Tawfik GM, Dila KAS, Mohamed MYF, Tam DNH, Kien ND, Ahmed AM, et al. A step by step guide for conducting a systematic review and meta-analysis with simulation data. Tropical Med Health. 2019;47(1):46. https://doi.org/10.1186/s41182-019-0165-6.
Elshafay A, Omran ES, Abdelkhalek M, El-Badry MO, Eisa HG, Fala SY, et al. Reporting quality in systematic reviews of in vitro studies: a systematic review. Curr Med Res Opin. 2019;35(9):1631–41. https://doi.org/10.1080/03007995.2019.1607270.
Madelain V, Nguyen TH, Olivo A, de Lamballerie X, Guedj J, Taburet AM, et al. Ebola virus infection: review of the pharmacokinetic and Pharmacodynamic properties of drugs considered for testing in human efficacy trials. Clin Pharmacokinet. 2016;55(8):907–23. https://doi.org/10.1007/s40262-015-0364-1.
Nogler M, Wimmer C, Mayr E, Öfner D. The efficacy of using search engines in procuring information about orthopaedic foot and ankle problems from the World Wide Web. Foot Ankle Int. 1999;20(5):322–5. https://doi.org/10.1177/107110079902000511.
Database of Abstracts of Reviews of Effects (DARE): Quality-assessed Reviews. York (UK): Centre for Reviews and Dissemination (UK); 1995 Available from: https://www.ncbi.nlm.nih.gov/books/NBK285222/. Accessed 20 Aug 2020.
Budgen D, Brereton P, Drummond S, Williams N. Reporting systematic reviews: some lessons from a tertiary study. Inf Softw Technol. 2018;95:62–74. https://doi.org/10.1016/j.infsof.2017.10.017.
Passos SP, Torrealba Y, Major P, Linke B, Flores-Mir C, Nychka JA. In vitro wear behavior of zirconia opposing enamel: a systematic review. J Prosthodont. 2014;23(8):593–601. https://doi.org/10.1111/jopr.12167.
Altmann AS, Collares FM, Leitune VC, Samuel SM. The effect of antimicrobial agents on bond strength of orthodontic adhesives: a meta-analysis of in vitro studies. Orthod Craniofac Res. 2016;19(1):1–9. https://doi.org/10.1111/ocr.12100.
Louropoulou A, Slot DE, Van der Weijden F. Influence of mechanical instruments on the biocompatibility of titanium dental implants surfaces: a systematic review. Clin Oral Implants Res. 2015;26(7):841–50. https://doi.org/10.1111/clr.12365.
Arilla FV, Yeung M, Bell K, Rahnemai-Azar AA, Rothrauff BB, Fu FH, et al. Experimental Execution of the Simulated Pivot-Shift Test: A Systematic Review of Techniques. Arthroscopy. 2015;31(12):2445–54 e2.
Ehsani S, Mandich MA, El-Bialy TH, Flores-Mir C. Frictional resistance in self-ligating orthodontic brackets and conventionally ligated brackets. A systematic review. Angle Orthod. 2009;79(3):592–601. https://doi.org/10.2319/060208-288.1.
Golbach LA, Portelli LA, Savelkoul HF, Terwel SR, Kuster N, de Vries RB, et al. Calcium homeostasis and low-frequency magnetic and electric field exposure: A systematic review and meta-analysis of in vitro studies. Environ Int. 2016;92–93:695–706.
Sarkis-Onofre R, Skupien JA, Cenci MS, Moraes RR, Pereira-Cenci T. The role of resin cement on bond strength of glass-fiber posts luted into root canals: a systematic review and meta-analysis of in vitro studies. Oper Dent. 2014;39(1):E31–44. https://doi.org/10.2341/13-070-LIT.
Możyńska JM, Lipski M, Nowicka A. Tooth discoloration induced by different calcium silicate-based cements: A systematic review of in vitro studies. J Endod. 2017;43(10):1593–601. https://doi.org/10.1016/j.joen.2017.04.002.
Reis AF, Vestphal M, Amaral RC, Rodrigues JA, Roulet JF, Roscoe MG. Efficiency of polymerization of bulk-fill composite resins: a systematic review. Braz Oral Res. 2017;31(suppl 1):e59. https://doi.org/10.1590/1807-3107BOR-2017.vol31.0059.
Ferrúa CP, Centeno EG, Rosa LC, Amaral CC, Severo RF, Sarkis-Onofre R, et al. How has dental pulp stem cells isolation been conducted? A scoping review. Braz Oral Res. 2017;31:e87.
Marchionatti AM, Aurélio IL, May LG. Does veneering technique affect the flexural strength or load-to-failure of bilayer Y-TZP? A systematic review and meta-analysis. J Prosthet Dent. 2018;119(6):916–24. https://doi.org/10.1016/j.prosdent.2017.11.013.
Martins FV, Vasques WF, Fonseca EM. Evaluation of the efficiency of fluoride-releasing adhesives for preventing secondary caries in-vitro: a systematic review and meta-analysis. Eur Arch Paediatr Dent. 2019;20(1):1–8. https://doi.org/10.1007/s40368-018-0388-y.
Elkaffas AA, Eltoukhy RI, Elnegoly SA, Mahmoud SH. The effect of preheating resin composites on surface hardness: a systematic review and meta-analysis. Restor Dent Endod. 2019;44(4):e41. https://doi.org/10.5395/rde.2019.44.e41.
Pourhajibagher M, Sodagar A, Bahador A. An in vitro evaluation of the effects of nanoparticles on shear bond strength and antimicrobial properties of orthodontic adhesives: A systematic review and meta-analysis study. Int Orthod. 2020;18(2):203–13. https://doi.org/10.1016/j.ortho.2020.01.011.
AlFawaz YF, Alonaizan FA. Efficacy of phototherapy in the adhesive bonding of different dental posts to root dentin: A systematic review. Photodiagn Photodyn Ther. 2019;27:111–6. https://doi.org/10.1016/j.pdpdt.2019.05.024.
Samiei M, Shirazi S, Azar FP, Fathifar Z, Ghojazadeh M, Alipour M. The Effect of Different Mixing Methods on the Properties of Calcium-enriched Mixture Cement: A Systematic Review of in Vitro Studies. Iran Endod J. 2019;14(4):240–6.
Kuik K, De Ruiter MH, De Lange J, Hoekema A. Fixation methods in sagittal split ramus osteotomy: a systematic review on in vitro biomechanical assessments. Int J Oral Maxillofac Surg. 2019;48(1):56–70. https://doi.org/10.1016/j.ijom.2018.06.013.
Parikh M, Kishan KV, Solanki NP, Parikh M, Savaliya K, Bindu VH, et al. Efficacy of removal of calcium hydroxide medicament from root canals by Endoactivator and Endovac irrigation techniques: A systematic review of in vitro studies. Contemp Clin Dent. 2019;10(1):135–42. https://doi.org/10.4103/ccd.ccd_335_18.
Silveira FM, de Pauli Paglioni M, Marques MM, Santos-Silva AR, Migliorati CA, Arany P, et al. Examining tumor modulating effects of photobiomodulation therapy on head and neck squamous cell carcinomas. Photochem Photobiol Sci. 2019;18(7):1621–37. https://doi.org/10.1039/C9PP00120D.
Ajay R, Suma K, Ali SA. Monomer modifications of Denture Base acrylic resin: A systematic review and meta-analysis. J Pharm Bioallied Sci. 2019;11(Suppl 2):S112–S25. https://doi.org/10.4103/JPBS.JPBS_34_19.
Özcan M, Höhn J, de Araújo GM, Moura DD, Souza R. Influence of testing parameters on the load-bearing capacity of prosthetic materials used for fixed dental prosthesis: A systematic review and meta-analysis. Braz Dental Sci. 2018;21(4):470–90.
Khaledi A, Meskini M. A systematic review of the effects of Satureja khuzestanica Jamzad and Zataria multiflora Boiss against Pseudomonas aeruginosa. Iran J Med Sci. 2020;45(2):83.
Zhao S, Arnold M, Ma S, Abel R, Cobb J, Hansen U, et al. Standardizing compression testing for measuring the stiffness of human bone. Bone Joint Res. 2018;7(8):524–38.
Hindy A, Farahmand F, sadat Tabatabaei F. In vitro biological outcome of laser application for modification or processing of titanium dental implants. Lasers Med Sci. 2017;32(5):1197–206. https://doi.org/10.1007/s10103-017-2217-7.
Wehner C, Lettner S, Moritz A, Andrukhov O, Rausch-Fan X. Effect of bisphosphonate treatment of titanium surfaces on alkaline phosphatase activity in osteoblasts: a systematic review and meta-analysis. BMC Oral Health. 2020;20:1–13.
Tan MC, Chai Z, Sun C, Hu B, Gao X, Chen Y, et al. Comparative evaluation of the vertical fracture resistance of endodontically treated roots filled with Gutta-percha and Resilon: a meta-analysis of in vitro studies. BMC Oral Health. 2018;18(1):107. https://doi.org/10.1186/s12903-018-0571-x.
Mirab Samiee S, Rahnomaye Farzami M, Aliasgharpour M, Rafie M, Entekhabie B, Sabzavie F. An overview of a new approach in evaluation, verification and validity of in vitro diagnosis (IVDs) performance in Iran. Iran J Public Health. 2013;42(1):107–9.
Warner ACI. Criteria for establishing the validity of in vitro studies with rumen micro-organisms in so-called artificial rumen systems. Microbiology. 1956;14:733–48.
Schneider K, Schwarz M, Burkholder I, Kopp-Schneider A, Edler L, Kinsner-Ovaskainen A, et al. "ToxRTool", a new tool to assess the reliability of toxicological data. Toxicol Lett. 2009;189(2):138–44. https://doi.org/10.1016/j.toxlet.2009.05.013.
Klimisch HJ, Andreae M, Tillmann U. A systematic approach for evaluating the quality of experimental toxicological and ecotoxicological data. Regul Toxicol Pharmacol. 1997;25(1):1–5. https://doi.org/10.1006/rtph.1996.1076.
SDM C. EBM Evidence Pyramid. 2001; Available from: http://library.downstate.edu/ebmdos/2100.htm. Accessed 20 Aug 2020.
Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, Brozek J, et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. J Clin Epidemiol. 2011;64(4):383–94. https://doi.org/10.1016/j.jclinepi.2010.04.026.
Timmer A, Sutherland LR, Hilsden RJ. Development and evaluation of a quality score for abstracts. BMC Med Res Methodol. 2003;3:2. https://doi.org/10.1186/1471-2288-3-2.
Tsouh Fokou PV, Nyarko AK, Appiah-Opong R, Tchokouaha Yamthe LR, Addo P, Asante IK, et al. Ethnopharmacological reports on anti-Buruli ulcer medicinal plants in three west African countries. J Ethnopharmacol. 2015;172:297–311. https://doi.org/10.1016/j.jep.2015.06.024.
Higgins JPT, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions. 1st ed. Chichester, England, Hoboken: Wiley-Blackwell; 2008.
Aminoshariae A, Kulild J. Master apical file size - smaller or larger: a systematic review of microbial reduction. Int Endod J. 2015;48(11):1007–22. https://doi.org/10.1111/iej.12410.
Group DTAW. Available from: http://srdta.cochrane.org. Accessed 20 Aug 2020.
Sirriyeh R, Lawton R, Gardner P, Armitage G. Reviewing studies with diverse designs: the development and evaluation of a new tool. J Eval Clin Pract. 2012;18(4):746–52. https://doi.org/10.1111/j.1365-2753.2011.01662.x.
Xiao Z, Li C, Shan J, Luo L, Feng L, Lu J, et al. Mechanisms of renal cell apoptosis induced by cyclosporine A: a systematic review of in vitro studies. Am J Nephrol. 2011;33(6):558–66. https://doi.org/10.1159/000328584.
Pavan LM, Rêgo DF, Elias ST, De Luca CG, Guerra EN. In vitro anti-tumor effects of statins on head and neck squamous cell carcinoma: A systematic review. PLoS One. 2015;10(6):e0130476. https://doi.org/10.1371/journal.pone.0130476.
Maina S, Misinzo G, Bakari G, Kim HY. Human, Animal and Plant Health Benefits of Glucosinolates and Strategies for Enhanced Bioactivity: A Systematic Review. Molecules. 2020;25(16):3682. https://doi.org/10.3390/molecules25163682.
Nazeam J, Mohammed EZ, Raafat M, Houssein M, Elkafoury A, Hamdy D, et al. Based on principles and insights of COVID-19 epidemiology, genome sequencing, and pathogenesis: retrospective analysis of Sinigrin and Prolixin (RX) (Fluphenazine) provides off-label drug candidates. SLAS Discovery. 2020;25(10):1123–40. https://doi.org/10.1177/2472555220950236.
Hooijmans C, Ritskes-Hoitinga M. Progress in using systematic reviews of animal studies to improve translational research. PLoS Med. 2013;10(7):e1001482. https://doi.org/10.1371/journal.pmed.1001482.
Shea BJ, Hamel C, Wells GA, Bouter LM, Kristjansson E, Grimshaw J, et al. AMSTAR is a reliable and valid measurement tool to assess the methodological quality of systematic reviews. J Clin Epidemiol. 2009;62(10):1013–20. https://doi.org/10.1016/j.jclinepi.2008.10.009.
Jørgensen L, Paludan-Müller AS, Laursen DR, Savović J, Boutron I, Sterne JA, et al. Evaluation of the Cochrane tool for assessing risk of bias in randomized clinical trials: overview of published comments and analysis of user practice in Cochrane and non-Cochrane reviews. Syst Rev. 2016;5(1):80. https://doi.org/10.1186/s13643-016-0259-8.
Baumgartner S, Koletsi D, Verna C, Eliades T. The effect of enamel sandblasting on enhancing bond strength of orthodontic brackets: A systematic review and meta-analysis. J Adhes Dent. 2017;19(6):463–73.
Kulkarni S, Meer M, George R. The effect of photobiomodulation on human dental pulp-derived stem cells: systematic review. Lasers Med Sci. 2020;35(9):1889–97. https://doi.org/10.1007/s10103-020-03071-6.
Dumbryte I, Vebriene J, Linkeviciene L, Malinauskas M. Enamel microcracks in the form of tooth damage during orthodontic debonding: a systematic review and meta-analysis of in vitro studies. Eur J Orthod. 2018;40(6):636–48. https://doi.org/10.1093/ejo/cjx102.
Iliadi A, Koletsi D, Eliades T. Forces and moments generated by aligner-type appliances for orthodontic tooth movement: A systematic review and meta-analysis. Orthod Craniofac Res. 2019;22(4):248–58. https://doi.org/10.1111/ocr.12333.
Mello CC, Lemos CA, de Luna Gomes JM, Verri FR, Pellizzer EP. CAD/CAM vs conventional technique for fabrication of implant-supported frameworks: A systematic review and meta-analysis of in vitro studies. Int J Prosthodont. 2019;32(2):182–92. https://doi.org/10.11607/ijp.5616.
Silveira RG, Ferrúa CP, do Amaral CC, Garcia TF, de Souza KB, Nedel F. MicroRNAs expressed in neuronal differentiation and their associated pathways: Systematic review and bioinformatics analysis. Brain Res Bull. 2020;157:140–8.
García-Sanz V, Paredes-Gallardo V, Mendoza-Yero O, Carbonell-Leal M, Albaladejo A, Montiel-Company JM, et al. The effects of lasers on bond strength to ceramic materials: A systematic review and meta-analysis. PLoS One. 2018;13(1):e0190736. https://doi.org/10.1371/journal.pone.0190736.
Lenzi TL, Gimenez T, Tedesco TK, Mendes FM, Rocha RO, Raggio DP. Adhesive systems for restoring primary teeth: a systematic review and meta-analysis of in vitro studies. Int J Paediatr Dent. 2016;26(5):364–75. https://doi.org/10.1111/ipd.12210.
Rosa WL, Piva E, Silva AF. Bond strength of universal adhesives: A systematic review and meta-analysis. J Dent. 2015;43(7):765–76. https://doi.org/10.1016/j.jdent.2015.04.003.
Moraes AP, Sarkis-Onofre R, Moraes RR, Cenci MS, Soares CJ, Pereira-Cenci T. Can Silanization increase the retention of glass-fiber posts? A systematic review and meta-analysis of in vitro studies. Oper Dent. 2015;40(6):567–80. https://doi.org/10.2341/14-330-O.
Aurélio IL, Marchionatti AM, Montagner AF, May LG, Soares FZ. Does air particle abrasion affect the flexural strength and phase transformation of Y-TZP? A systematic review and meta-analysis. Dent Mater. 2016;32(6):827–45. https://doi.org/10.1016/j.dental.2016.03.021.
AlShwaimi E, Bogari D, Ajaj R, Al-Shahrani S, Almas K, Majeed A. In vitro antimicrobial effectiveness of root canal sealers against enterococcus faecalis: A systematic review. J Endod. 2016;42(11):1588–97. https://doi.org/10.1016/j.joen.2016.08.001.
Pereira GK, Venturini AB, Silvestri T, Dapieve KS, Montagner AF, Soares FZ, et al. Low-temperature degradation of Y-TZP ceramics: A systematic review and meta-analysis. J Mech Behav Biomed Mater. 2015;55:151–63. https://doi.org/10.1016/j.jmbbm.2015.10.017.
Money CD, Tomenson JA, Penman MG, Boogaard PJ, Jeffrey LR. A systematic approach for evaluating and scoring human data. Regul Toxicol Pharmacol. 2013;66(2):241–7. https://doi.org/10.1016/j.yrtph.2013.03.011.
Samuel GO, Hoffmann S, Wright RA, Lalu MM, Patlewicz G, Becker RA, et al. Guidance on assessing the methodological and reporting quality of toxicologically relevant studies: A scoping review. Environ Int. 2016;92–93:630–46.
Lynch HN, Goodman JE, Tabony JA, Rhomberg LR. Systematic comparison of study quality criteria. Regul Toxicol Pharmacol. 2016;76:187–98. https://doi.org/10.1016/j.yrtph.2015.12.017.
Koch MS, DeSesso JM, Williams AL, Michalek S, Hammond B. Adaptation of the ToxRTool to assess the reliability of toxicology studies conducted with genetically modified crops and implications for future safety testing. Crit Rev Food Sci Nutr. 2016;56(3):512–26. https://doi.org/10.1080/10408398.2013.788994.
Segal D, Makris SL, Kraft AD, Bale AS, Fox J, Gilbert M, et al. Evaluation of the ToxRTool's ability to rate the reliability of toxicological data for human health hazard assessments. Regul Toxicol Pharmacol. 2015;72(1):94–101. https://doi.org/10.1016/j.yrtph.2015.03.005.
We would like to thank Mohamed Omar El-Badry, Faculty of Medicine, Al-Azhar University, Cairo, 11884, Egypt; Ashraf Ismail Sayed, Faculty of Medicine, Al-Azhar University, Assuit, Egypt; Ahmed Magdey Sayed, Faculty of Pharmacy, Al-Azhar University, Cairo, Egypt; Salma Y Fala, Faculty of Medicine, Suez Canal University, Ismailia, Egypt; Thi Nguyen Minh, Department of Radiology, Go Vap hospital, Ho Chi Minh, 70000, Viet Nam; Maha Elbadawy, Ministry of Health, Egypt; Mohamed Tamer Elhady, Department of Pediatrics, Zagazig University Hospitals, Faculty of Medicine, Sharkia, 44511, Egypt for their initial contribution to the study. We also thank Thuan Tieu, Faculty of Health Sciences, McMaster University, Hamilton, Canada, for checking English for the manuscript. We would like to thank Dr. Joseph Varney (American University of Caribbean School of Medicine, Cupecoy, Sint Maarten), a native English speaker, for proofreading the manuscript.
Detailed search strategy for each database search.
List of excluded studies.
List of included articles using QA tools in in vitro SRs/MAs.
Detailed result of DARE assessment criteria of included studies.
The interpretation for fulfilling a “yes”, “partial” and “no” score.
Cite this article
Tran, L., Tam, D.N.H., Elshafay, A. et al. Quality assessment tools used in systematic reviews of in vitro studies: A systematic review. BMC Med Res Methodol 21, 101 (2021). https://doi.org/10.1186/s12874-021-01295-w