Abstract analysis method facilitates filtering low-methodological quality and high-bias risk systematic reviews on psoriasis interventions

Background Article summaries’ information and structure may influence researchers/clinicians’ decisions to conduct deeper full-text analyses. Specifically, abstracts of systematic reviews (SRs) and meta-analyses (MA) should provide structured summaries for quick assessment. This study explored a method for determining the methodological quality and bias risk of full-text reviews using abstract information alone. Methods Systematic literature searches for SRs and/or MA about psoriasis were undertaken on MEDLINE, EMBASE, and Cochrane database. For each review, quality, abstract-reporting completeness, full-text methodological quality, and bias risk were evaluated using Preferred Reporting Items for Systematic Reviews and Meta-analyses for abstracts (PRISMA-A), Assessing the Methodological Quality of Systematic Reviews (AMSTAR), and ROBIS tools, respectively. Article-, author-, and journal-derived metadata were systematically extracted from eligible studies using a piloted template, and explanatory variables concerning abstract-reporting quality were assessed using univariate and multivariate-regression models. Two classification models concerning SRs’ methodological quality and bias risk were developed based on per-item and total PRISMA-A scores and decision-tree algorithms. This work was supported, in part, by project ICI1400136 (JR). No funding was received from any pharmaceutical company. Results This study analysed 139 SRs on psoriasis interventions. On average, they featured 56.7% of PRISMA-A items. The mean total PRISMA-A score was significantly higher for high-methodological-quality SRs than for moderate- and low-methodological-quality reviews. SRs with low-bias risk showed higher total PRISMA-A values than reviews with high-bias risk. 
In the final model, only ’authors per review > 6’ (OR: 1.098; 95%CI: 1.012-1.194), ’academic source of funding’ (OR: 3.630; 95%CI: 1.788-7.542), and ’PRISMA-endorsed journal’ (OR: 4.370; 95%CI: 1.785-10.98) predicted PRISMA-A variability. Reviews with a total PRISMA-A score < 6, lacking identification as SR or MA in the title, and lacking an explanation of bias risk assessment methods were classified as having low methodological quality. Abstracts with a total PRISMA-A score ≥ 9 that included main outcomes results and an explanation of the bias risk assessment method were classified as having low-bias risk. Conclusions The methodological quality and bias risk of SRs may be determined by analysing the quality and completeness of their abstracts. Our proposal aims to help clinical professionals lacking methodological skills to evaluate evidence syntheses. External validation is necessary. Electronic supplementary material The online version of this article (doi:10.1186/s12874-017-0460-z) contains supplementary material, which is available to authorized users.

Keywords: Systematic review, Methodological quality, Quality of reporting, AMSTAR, PRISMA for abstracts, Abstract readability, Psoriasis, Decision trees

Background
Therapeutic decision-making processes should be based on the best available evidence [1]. Documents that synthesise evidence concerning a particular subject facilitate access to such information for the consumers of the product in question (physicians, pharmacists, hospital committees, regulatory organisations). Systematic reviews (SRs) are the standard documents that provide syntheses of evidence. Their conclusions are often used as a starting point for the development of clinical practice guidelines, and also for establishing recommendations concerning diagnostic, prognostic, and/or therapeutic interventions [2]. However, applying the information contained within these documents requires authors to follow rigorous procedures to ensure adequate methodological quality is present, minimise the risk of bias, and facilitate reporting and dissemination. A large number of primary studies and evidence-synthesis documents have been published to date, but many are redundant, do not reach the necessary methodological quality, or have a high risk of bias [3]. Considering this situation, it is not easy for consumers to identify synthesis documents that are of good quality and have a low risk of bias.
Psoriasis is a chronic disease, with moderate and severe forms associated with significant comorbidity, impaired quality of life, and high direct and indirect costs [4]. An increasing number of elective therapies have been developed during the last decade, but these usually have potentially significant adverse side effects and high costs, which puts patients at risk and brings the sustainability of the health systems into question [5,6]. Assessing full-text documents using Assessing the Methodological Quality of Systematic Reviews (AMSTAR) and Risk of Bias in Systematic Reviews (ROBIS) tools, we recently observed that many SRs relating to interventions in psoriasis are of low methodological quality (28.8%) and that most have a high bias risk (86%) [7]. However, it is impractical to suggest that interested parties apply this same method to assess the methodological quality and the risk of bias of SRs, as it is a time-consuming process that requires systematic literature searching, abstract screening, and full, in-depth manuscript assessment; further, two or more evaluators are required to control for rating discrepancies [8]. In recent years, efforts have been made to automate some steps of SR development. Machine learning resources have been evaluated both to assist in conducting SRs [9] and to assess the risk of bias of SRs [10].
In 2013, the Preferred Reporting Items for Systematic Reviews and Meta-analyses for Abstracts (PRISMA-A) was published, featuring guidelines concerning methods of writing and presenting abstracts for systematic reviews and meta-analyses [10]. PRISMA-A is a checklist developed to help authors report all types of SRs, although it mainly relates to SRs concerning evaluations of interventions in which one or more meta-analyses are conducted. This tool features 12 items related to information that should be provided in order to present the methods, results, and conclusions in a manner that accurately reflects the core components of the full review. However, the relationship between the reporting quality of such abstracts, the methodological quality of the full texts, and the risk of bias in these texts is still unknown.
Thus, the primary objective of our study is to apply PRISMA-A to evaluate the reporting quality of SR abstracts relating to psoriasis interventions. Our secondary objective is to determine if this instrument indirectly captures the methodological quality of and the risk of bias in the full reviews, which we measured using AMSTAR and ROBIS instruments. Finally, we discuss our attempt to develop classification algorithms using PRISMA-A that can provide deeper analysis of reviews based only on abstract data.

Protocol and eligibility criteria
To begin, we established an a priori protocol to compare AMSTAR and ROBIS, which also planned the measurement of compliance with PRISMA-A, and published it in the PROSPERO International Prospective Register of Systematic Reviews (PROSPERO 2016: CRD42016053181). In this protocol, we included SRs or MAs published in scientific journals that related to interventions in skin psoriasis. Historical articles, conference abstracts, case reports, surveys, narrative reviews, narrative reports (i.e., reports that have a particular focus on understanding a concept), clinical practice guidelines, consensus documents, MAs performed without a systematic literature search, and reviews titled as literature reviews or integrative reviews were not included. Further, because of time constraints on the project, the documents retrieved were restricted to English-language reviews. There was no limitation on the year of publication or study population.

Search and selection methods
A systematic literature search had been conducted in a previous study [7]; we took the results listed there and filtered them to include only those published by July 5th 2016. Then, new SRs and MAs published by January 2017 were identified using MEDLINE, EMBASE, and the Cochrane Database. Details regarding the search methods applied for identifying and selecting these documents are provided in Additional file 1.

Quality assessment of abstract reporting
Two investigators (JL-HR and JL-SC) independently assessed the abstract-reporting quality of each review; they used the same data abstraction forms for each review and were blinded to the names of the journals, the authors, and the authors' affiliations. As mentioned above, we applied PRISMA-A, a checklist designed to determine if the content of an SR abstract is truthful, to assess reviews of psoriasis interventions [11]. PRISMA-A features a 12-item checklist concerning information that should be provided in SR abstracts; specifically, these are: title; objectives; the eligibility criteria of included studies; information sources, including key databases and dates of searches; methods of assessing bias risk; number and type of included studies; synthesis of results for main outcomes; description and direction of the effect; summary of the strengths and limitations of the evidence; general interpretation of results; funding sources, and registration number.
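As an illustration of how a total PRISMA-A score can be derived from the 12 items listed above, the following minimal Python sketch counts the items rated 'yes' and expresses them as a percentage of achievement. The item identifiers, the dictionary layout, and the function names are assumptions made for this example; they are not taken from the study's data abstraction forms.

```python
# The 12 PRISMA-A items, in the order given in the text.
# The snake_case identifiers are hypothetical labels for this sketch.
PRISMA_A_ITEMS = [
    "title", "objectives", "eligibility_criteria", "information_sources",
    "risk_of_bias", "included_studies", "synthesis_of_results",
    "description_of_effect", "strengths_and_limitations",
    "interpretation", "funding", "registration",
]


def prisma_a_total(ratings):
    """Return the number of items rated 'yes'.

    `ratings` maps each item identifier to True ('yes') or False ('no').
    """
    missing = set(PRISMA_A_ITEMS) - set(ratings)
    if missing:
        raise ValueError(f"unrated items: {sorted(missing)}")
    return sum(1 for item in PRISMA_A_ITEMS if ratings[item])


def prisma_a_percent(ratings):
    """Return the percentage of the 12 items that were fulfilled."""
    return 100.0 * prisma_a_total(ratings) / len(PRISMA_A_ITEMS)


# Invented example: an abstract that fulfils the first six items only.
example = {item: (i < 6) for i, item in enumerate(PRISMA_A_ITEMS)}
print(prisma_a_total(example), prisma_a_percent(example))
```

Each item is weighted equally here, which matches the way the total score is described in this article (a count of fulfilled items out of 12).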

Methodological quality of SRs
Two investigators (FG-G and JG-M) independently assessed the methodological quality of each review using the AMSTAR tool; again, these investigators were blinded to the names of the journals, names of the authors, and authors' criteria. In the case of a disagreement, an independent researcher (JR) was consulted. Review quality was classified by total AMSTAR score following one of the most used cutoff points for AMSTAR levels: low (0-4), moderate (5-8), and high (9-11) methodological quality [12]. Detailed information about the AMSTAR checklist and the system of rating the articles is presented in Additional file 2.
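The cutoff points above amount to a simple mapping from a total AMSTAR score (0 to 11) to a quality level. A minimal Python sketch of that mapping follows; the function name and the returned labels are illustrative choices, not part of the AMSTAR instrument.

```python
def amstar_level(total_score):
    """Map a total AMSTAR score (0-11) to a quality level.

    Cutoffs follow the text: low (0-4), moderate (5-8), high (9-11).
    """
    if not isinstance(total_score, int) or not 0 <= total_score <= 11:
        raise ValueError("total AMSTAR score must be an integer from 0 to 11")
    if total_score <= 4:
        return "low"
    if total_score <= 8:
        return "moderate"
    return "high"


# Show the level assigned at each boundary of the three bands.
for score in (0, 4, 5, 8, 9, 11):
    print(score, amstar_level(score))
```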

Bias risk of SRs
Two investigators (FG-G and MA-L) independently assessed the bias risk of each review, using the same data abstraction forms for each review while blinded to the names of the journals, the names of the authors, and the authors' affiliations; specifically, we used ROBIS to assess this bias risk [11]. ROBIS is conducted over three phases. Phase 1 involves assessing the relevance of the review, and is considered optional. Phase 2 includes four domains: 1) study eligibility criteria, 2) identification and selection of studies, 3) data collection and study appraisal, and 4) synthesis and findings. Finally, phase 3 assesses the overall risk of bias in the interpretation of the review findings and whether limitations identified in any of the phase 2 domains have been considered. To simplify analyses, SRs rated as having an unclear risk of bias using the ROBIS tool were discussed with a third evaluator to reach a final decision on whether to categorise them as having a high or low risk of bias. Recently, good validity, reliability, and applicability of the ROBIS tool have been demonstrated [13]. Detailed information about the ROBIS tool and the rating system is presented in Additional file 3.

Data extraction and statistical analysis
For studies that fulfilled the inclusion criteria, five investigators (FG-G, JG-M, PA-M, JLS-C, and MG-P) independently obtained metadata from each. Studies were then classified as Cochrane or non-Cochrane reviews. Cochrane affiliation was defined for authors of Cochrane Reviews published at the Cochrane Database of Systematic Reviews (CDSR) and authors using a Cochrane group name even if the paper was not published at CDSR. PRISMA-A results are represented on Likert scales as percentages of achievement per item. PRISMA-A results are also summarised on Likert scales in regard to methodological quality and risk of bias. Total and by-item interrater reliability (IRR) of PRISMA-A was assessed using the irr R package. Differences in the mean total of PRISMA-A scores when comparing methodological quality and risk of bias levels were assessed using the Kruskal-Wallis and Wilcoxon tests, respectively. Evidence against the null hypothesis was considered for a two-tailed p value of < 0.05. Further, generalised linear models were obtained using the median total PRISMA-A score as the dependent variable. Adjustments were made for several metadata: actual observed 'abstract word count' (≤ 300 versus > 300), 'abstract format' (8-headings, IMRAD, and free format), 'Cochrane affiliation authors', 'number of authors' (≤ 6 versus > 6), 'number of authors with conflict of interest', 'source of funding' (pharma, academic or none/UNK), 'PRISMA endorser journal' ('yes' versus 'no'), 'PRISMA-A statement' (review published before or after 2013), and 'journal impact factor'. The 'IMRAD' format includes: introduction, methods, results, and discussion. The '8-headings abstract' format includes: background, objectives, search methods, selection criteria, data collection, analysis, main results, and authors' conclusions. We checked the list of journals endorsing PRISMA on the PRISMA website (URL: http://www.prisma-statement.org/Endorsement/PRISMAEndorsers.aspx).
A multivariate predictive model was created including those variables that were statistically significant in the univariate predictive models (p < 0.05). Recursive partitioning of our dataset helped us to develop easily visualised decision rules for predicting the methodological quality of SRs based on abstract analysis. Next, two classification trees were created, one for methodological quality ('high' and 'moderate' levels were recoded as 'high-moderate' in order to produce a simpler model with a binary response) and one for risk of bias. Decision trees were obtained using the rpart R package, which implements several algorithms. Cut-off points were produced by the internal processes of these algorithms and were therefore not selected by the authors. We used a cross-validation method to evaluate the predictive accuracy of our model as compared with the rest of the tree models. We performed sensitivity analyses for both the AMSTAR and ROBIS classification trees by randomly selecting the training dataset to build 2,000 models in each case. Values of the 'variable importance' parameter obtained for every node and model were plotted. Graphs were produced and statistics were analysed using several packages of the R language (R Development Core Team).
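To make concrete how a recursive-partitioning algorithm can arrive at a cut-off point without the analyst choosing it, the sketch below searches for the threshold on a single numeric variable that minimises the weighted Gini impurity of the two resulting groups. This is a simplified Python illustration, not the rpart implementation: the text does not state which splitting criterion or settings were used, Gini impurity is assumed here because it is a common choice for classification trees, and the scores and labels are invented toy data rather than study data.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(scores, labels):
    """Find the split `score < threshold` with the lowest weighted Gini.

    Returns (threshold, weighted_gini). If no split improves on the
    unsplit data, the threshold is None.
    """
    n = len(scores)
    best = (None, gini(labels))
    # Candidate thresholds: every distinct value except the smallest,
    # so that both sides of the split are non-empty.
    for t in sorted(set(scores))[1:]:
        left = [y for s, y in zip(scores, labels) if s < t]
        right = [y for s, y in zip(scores, labels) if s >= t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (t, weighted)
    return best


# Invented toy data: total abstract scores and a binary quality label.
toy_scores = [3, 4, 5, 5, 6, 7, 8, 9]
toy_labels = ["low", "low", "low", "low", "high", "high", "high", "high"]
print(best_threshold(toy_scores, toy_labels))
```

A full tree repeats this search over every candidate variable and then recursively within each resulting subgroup; pruning and cross-validation are then used to limit overfitting.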

Protocol vs. overview
Our planned search strategy was recorded in PROSPERO and was compared with the final reported review methods. We decided to use the machine learning classification procedure to obtain classification trees based on PRISMA-A after our protocol was published.

Review selection
Our new database search (from July 5th 2016 to January 1st 2017) yielded 161 titles with potential relevance (125 from EMBASE & MEDLINE, 10 from EMBASE only, three from MEDLINE only, and 23 from the Cochrane Database). After excluding duplicated articles and screening titles and abstracts, 44 new studies were judged to be potentially eligible for full-text review, and after assessment, 20 final reviews were added to the previously obtained 119 reviews (Fig. 1). Thus, 139 reviews comprising 4357 primary studies about interventions in psoriasis were published by 62 journals from 1997 to 2017. Lists of included and excluded articles are shown in Additional files 4 and 5.

Reporting characteristics of SRs
The interrater reliability (IRR) between the two raters for the total score was substantial (κ = 0.77; 95% CI, 0.59-0.88). IRR was highest for question PEA1 (κ = 0.86) and lowest for question PEA8 (κ = 0.08) (Additional file 6). As shown in Fig. 2, of the 12 PRISMA-A items, there were three items for which more than 90% of the included reviews received a 'yes' rating: item 2 (objectives; 94.9%), item 10 (interpretation of results; 94.1%), and item 8 (description of the effect; 93.4%). However, less than 50% of the SRs fulfilled the criteria for item 5 (risk of bias; 23.3%) and item 9 (strengths and limitations of evidence; 27%). Finally, almost none of the SR abstracts fulfilled item 12 (registration; 1.4%) or item 11 (funding; 0.7%). Considering item ratings for each SR, six of the 139 reviews received a 'yes' rating for 10 or 11 of the 12 PRISMA-A items. The median number of fulfilled items for each review was six (range: 2-11).
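For readers unfamiliar with the κ statistic reported above, the following Python sketch computes unweighted Cohen's kappa for two raters from first principles. It is an illustration only: the text does not specify which kappa variant was computed with the irr package, and the ratings below are invented, not the study's data.

```python
def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters rating the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the agreement expected by chance from each rater's
    marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty item set")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Invented 'yes' (1) / 'no' (0) ratings of one checklist item in ten abstracts.
ratings_a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
ratings_b = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
print(round(cohen_kappa(ratings_a, ratings_b), 3))
```

In this toy example the raters agree on 8 of 10 abstracts, but kappa is lower than 0.8 because part of that agreement would be expected by chance. The same effect can yield a very low kappa for items on which almost all abstracts receive the same rating.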

Reporting quality and risk of bias
For reviews with a high risk of bias, the median number of PRISMA-A items with a 'yes' rating was six (2-10). Interestingly, for reviews with low bias risk, the minimum number of items with a 'yes' rating was also six (Table 1). Fig. 3a-b shows PRISMA-A Likert scales in which the percentage of achievement per item for high-bias-risk SRs was compared with reviews that had low bias risk; this was performed using the ROBIS tool. Overall, the response profiles are quite similar, with only a slight increase of compliance found in the low-bias-risk subgroup for the 'interpretation', 'funding', and 'registration' items. Lastly, SRs with a low risk of bias showed higher total PRISMA-A values than reviews with high bias risk (7.7 ± 1.26 vs 6.75 ± 1.59, p = 0.012) (Additional file 7).

Reporting quality and methodological quality
Figure 3c-e presents the percentage of achievement per PRISMA-A item, comparing SRs classified using the AMSTAR instrument (as high, moderate, or low methodological quality). In this case, unlike the findings concerning the bias-risk subgroups, there are different patterns for each level of methodological quality. For high-methodological-quality reviews, the median number of items with a 'yes' rating was eight (6-11), with six (4-10) and five (2-8) for moderate- and low-quality reviews, respectively (Table 1). Item 5, 'risk of bias', showed the widest variation between the subgroups, and items 11 ('funding') and 12 ('registration') displayed minimal variation. Lastly, the mean total PRISMA-A score was significantly higher for SRs with high methodological quality than for moderate (7.73 ± 0.13 vs 7.05 ± 0.13, p = 0.031) and low methodological quality (7.73 ± 0.13 vs 5.77 ± 0.13, p = 0.001) (Additional file 8).

Factors influencing reporting quality
Univariable and multivariable ordinal logistic regressions were performed in order to predict PRISMA-A results.

Classification trees for SRs methodological quality prediction based on abstract reporting assessment
We used classification trees as a visual tool with which to gain an idea of the abstract-related variables that are important for predicting SRs with low methodological quality. Essentially, Fig. 4 shows that abstracts that had a total PRISMA-A score of less than six, lacked any identification in the title as an SR or MA, and lacked an explanation of the methods applied for assessing bias risk were classified using AMSTAR as having low methodological quality, with a root node error of 0.15 and a misclassification rate of 22.6% in the cross-validation.
In Fig. 5, abstracts with a total PRISMA-A score equal to or higher than nine that included the results of the main outcomes and an explanation concerning the methods used for assessing bias risk were classified as having low-bias risk, with a root node error of 0.14 and a misclassification rate of 20.6% in the cross-validation. We found that the nodes included in our tree models were also among the top-ranked nodes when ordered by median importance after sensitivity analysis (Additional files 9 and 10). Overall, the higher dispersion of 'variable importance' values for AMSTAR-derived trees as compared with ROBIS trees suggests that the AMSTAR classification tree is less robust than the ROBIS classification tree.
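The two rules described above can be written as short predicates, which may help when screening a list of abstracts by hand. The Python sketch below is a literal reading of the sentences in this section, treating each rule as a conjunction of three conditions; the branch order and any further nodes of the trees in Figs. 4 and 5 are not reproduced, and the function and parameter names are hypothetical.

```python
def predicts_low_methodological_quality(
    total_score, title_identifies_sr_or_ma, reports_bias_methods
):
    """Rule read from the AMSTAR-based tree (Fig. 4).

    Total PRISMA-A score below six, no SR/MA identification in the
    title, and no explanation of the risk-of-bias assessment method.
    """
    return (
        total_score < 6
        and not title_identifies_sr_or_ma
        and not reports_bias_methods
    )


def predicts_low_bias_risk(
    total_score, reports_main_outcome_results, reports_bias_methods
):
    """Rule read from the ROBIS-based tree (Fig. 5).

    Total PRISMA-A score of nine or more, results for the main outcomes
    reported, and the risk-of-bias assessment method explained.
    """
    return (
        total_score >= 9
        and reports_main_outcome_results
        and reports_bias_methods
    )


# Invented examples of two abstracts.
print(predicts_low_methodological_quality(4, False, False))
print(predicts_low_bias_risk(9, True, True))
```

Given the cross-validated misclassification rates of 22.6% and 20.6% reported above, such rules are best treated as a first-pass filter rather than a substitute for full-text appraisal.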

Main findings
To the best of our knowledge, this is the first study to evaluate the capacity of PRISMA-A to determine the methodological quality and bias risk of SRs or MAs relating to psoriasis interventions. In short, this study suggests that the reporting quality of abstracts of reviews published concerning psoriasis interventions is suboptimal. Overall, the average percentage of PRISMA-A items featured in each abstract was 56.7%. While 'objectives', 'interpretation of results', and 'description of effect' were included in almost all abstracts, the majority failed to adequately report 'strengths and limitations' and 'risk of bias'; furthermore, registration numbers and disclosures of sources of funding were almost universally absent. We found that methodological quality and risk of bias, assessed using AMSTAR and ROBIS instruments, correlated positively with the PRISMA-A evaluations of the quality and completeness of abstract reporting. Previous studies have supported the theory that improving the abstract quality of SRs may provide a more accurate reflection of their methodological quality. Earlier work applying AMSTAR evaluated the quality of SRs with regard to adherence to PRISMA statements, and found that PRISMA endorsement enhanced compliance with AMSTAR scale items in gastroenterology/hepatology and surgical journals [14,15]. Further, using a masked randomised trial, Cobo et al. analysed the feasibility of using CONSORT- and STROBE-reporting guidelines to support the peer-review process performed by a general medicine journal editorial team [16]. Moreover, Rice et al., using AMSTAR, found a positive correlation between the overall quality ratings of SRs with MAs and the number of PRISMA-A items adequately reported [17].
The above findings are similar to our own, as we also found that the methodological quality of reviews assessed using the AMSTAR instrument correlated positively with PRISMA-A evaluations of the quality and completeness of abstract reporting. However, no study has yet been published presenting a significant correlation between PRISMA-A compliance and risk of bias; in our study, significant differences in terms of abstract quality were observed between SRs with high and low bias risk.

Strengths and limitations
In this study, we explored, for the first time, the capacity of PRISMA-A to determine both the methodological quality and the bias risk of full-text reviews using ROBIS and AMSTAR tools. Our study includes a large sample of over 15 years of reviews (n = 139) concerning interventions in psoriasis. The study was performed using a systematic search strategy and following an a priori protocol published in PROSPERO; the AMSTAR and ROBIS assessments were performed independently by two authors, and there were few disagreements during the process, all of which were solved through discussion. Nevertheless, our study has some limitations. First, this study only featured SRs and MAs relating to interventions in psoriasis, so there is a limitation in terms of the generalisability of the data, as we did not compare our results to reviews conducted in relation to other diseases or areas of healthcare. Second, the search was restricted to MEDLINE, EMBASE, and the Cochrane database; this was because our intention was to obtain a representative sample of published systematic reviews concerning psoriasis interventions, rather than cover all such reviews. We did not search for SRs in grey literature databases, and, therefore, we cannot establish differences in terms of methodological quality and risk of bias with respect to those that were examined. Third, during the cross-validation, we found a misclassification rate of 20-22%; this means that for one in five abstracts, the methodological quality and risk of bias are mistakenly classified. To rectify this, we would require external validation to test the performance of our models with other datasets. In any case, a desirable improvement in the quality of reporting could result in the disambiguation of many SRs classified as having moderate quality and causing level overlapping during the cross-validation. Fourth, a limitation of this work is that different reviewers applied PRISMA-A, AMSTAR and ROBIS. 
Only one of the three raters carried out the evaluations with both the AMSTAR and ROBIS tools. Although their results were compared in pairs and discrepancies were discussed with a fourth rater, there is a risk that this issue will affect the validity of our results. Finally, we did not consider the year in which journals endorsed PRISMA-A, which introduces a risk of bias in this regard.

Our findings in context
Our findings were similar to those of a previous study conducted by Bigna et al., in which the authors found that the quality of reporting was declining in terms of the 'strength and limitations of evidence' and 'funding' of reviews [18]. Further, Tsou et al. used PRISMA-A to analyse 200 randomly selected abstracts of SRs relating to health interventions and found that less than 50% of the abstracts contained information concerning the 'risk of bias assessment' (23%), 'study protocol registration' (2%), and 'funding source' (1%) [19]. Moreover, Seehra et al. studied the reporting completeness of abstracts of SRs published in dental speciality journals [20]. They developed a checklist that included several items from different sources: PRISMA statement guidelines [21], the Cochrane Handbook for Systematic Reviews of Interventions [22], and the paper by Beller et al. [14]. We did not find differences in reporting quality between reviews published before and after the PRISMA-A statement. Our results are similar to those found by Panic et al. [23].
These authors demonstrated that the quality of reporting improved only sub-optimally in the years following the publication of PRISMA.
The capacity of abstract length to predict PRISMA-A variability has also been addressed in other studies. Interestingly, the number of words per abstract explains only a very small part of it [17], and better reporting was even observed for abstracts with < 300 words [16]. In the latter study, a better abstract structure (8-headings vs IMRAD formats) also predicted improved reporting quality. These results are similar to ours and suggest that the structure and concision of an abstract matter more than its length in defining the quality of the summary report.

Implications of results
Motivated by the possibility of capturing through an abstract, at least in part, the methodological quality and the risk of bias of a study, which are normally evaluated using information contained in the full text of the document, we explored the possibility of obtaining simple and feasible decision models that are easy to interpret and intuitive to follow. Our method is offered as a support to decision making and is not intended to replace the rigorous final analysis of each synthesis document, but it allows professionals who are not experts in this type of methodology to prioritise, simply and rapidly, the documents obtained in a first search. We believe that the information contained in the abstract is a good source for this purpose, and this is the original contribution that we make. The importance of our proposed tree models lies in their capacity to assist in abstract filtering using just the predicted methodological quality and bias risk determined through PRISMA-A abstract analysis, which is a more feasible instrument than the AMSTAR or ROBIS tools. Our decision trees have been constructed using a machine learning tool. This type of technology is currently being used to systematise some aspects of SRs, such as article selection or risk of bias assessment [9]. We believe that combining validated tools that measure quality or bias risk with machine learning technology may improve methodological assessment processes. Better meta-epidemiological knowledge, together with the development of text mining strategies, will allow the development of models that help clinicians simplify decision making in clinical settings.
Finally, the classification produced by both decision trees is congruent with the idea that methodological quality explains only part of the risk of bias of SRs, as we found that the degree of compliance with PRISMA-A required to predict SRs with a low risk of bias is greater than that required to predict high-methodological-quality SRs. Therefore, we can conclude that the methodological quality and the risk of bias of SRs may be captured by analysing the quality and completeness of abstract reporting, and that by applying our decision tree models, the review-filtering process may be improved through rapid abstract analysis.