An evaluation of DistillerSR’s machine learning-based prioritization tool for title/abstract screening – impact on reviewer-relevant outcomes

Hamel, C.; Kelly, S. E.; Thavorn, K.; Rice, D. B.; Wells, G. A.; Hutton, B.

doi:10.1186/s12874-020-01129-1

Table 1 Terminology and descriptions

From: An evaluation of DistillerSR’s machine learning-based prioritization tool for title/abstract screening – impact on reviewer-relevant outcomes

Terminology	Description
Estimated recall	The estimated percentage of the records that will be passed through to full-text screening that have been identified so far at the title/abstract level. As this is calculated based on a set of records that have not been completely screened, the estimated recall may differ from the true recall.
Final include	A primary study included in the completed systematic review.
Iteration	A set of records that is used to score the likelihood of inclusion and to prioritize the remaining unscreened records in order from highest relevance to lowest relevance.
Modified screening approach	An approach to modify how screening is being performed. For example, changing from: (i) dual-independent screening to liberal accelerated screening; (ii) dual-independent screening to single-reviewer screening; or (iii) assigning the remaining records to the AI reviewer to exclude, with a human reviewer(s) also screening these records as a second reviewer.
Prioritized screening	Through active machine learning, the presentation of records to reviewers is continually adjusted based on the AI’s estimated likelihood of relevance. The frequency of adjustment may differ by software application.
Screening burden	The total number of records at title/abstract to be screened.
Stop screening approach	An approach to screening whereby the remaining records are not screened once a certain threshold has been achieved (e.g., an estimated recall of 95%). These records are assumed to be excluded.
Record not yet identified [i.e., title/abstract false negative (FN)]	When an estimated recall (at any %) or true recall of less than 100% is used, these are the records that would have been included based on the title/abstract to be further reviewed at full-text screening, but were not yet identified. Had these records been screened at title/abstract and further screened based on the full text, they may have been excluded or included in the final review (i.e., a final include).
Title/abstract include [i.e., title/abstract true positive (TP)]	Records included based on the title/abstract to be further reviewed based on the full text. These records may then be excluded at full-text review or included in the final review.
Training set	One or more iterations which inform the machine learning to score and prioritize the remaining unscreened records.
Title/abstract exclude [i.e., true negative (TN)]	Records considered excluded based on title/abstract screening.
True recall	The percentage of all actual title/abstract includes that were identified. This is only known once all references have been screened. True recall % is calculated as: [title/abstract TP / (title/abstract TP + title/abstract FN)]
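
As an illustration of the true recall formula in Table 1, the following is a minimal Python sketch, not part of DistillerSR or of the original article. The function names `true_recall` and `reached_threshold` and the example counts are hypothetical. The 95% default mirrors the example threshold given under "Stop screening approach". The result is multiplied by 100 on the assumption that recall is reported as a percentage.

```python
def true_recall(tp: int, fn: int) -> float:
    """Return true recall as a percentage: TP / (TP + FN) * 100.

    tp: title/abstract includes that were identified (true positives)
    fn: records not yet identified (title/abstract false negatives)
    """
    if tp < 0 or fn < 0:
        raise ValueError("counts must be non-negative")
    total = tp + fn
    if total == 0:
        raise ValueError("no title/abstract includes, so recall is undefined")
    return 100.0 * tp / total


def reached_threshold(recall_pct: float, threshold_pct: float = 95.0) -> bool:
    """Check whether a recall value meets a chosen threshold (hypothetical helper)."""
    return recall_pct >= threshold_pct


# Hypothetical counts for illustration only.
recall = true_recall(tp=190, fn=10)
print(recall, reached_threshold(recall))
```

In practice only the estimated recall is available while screening is in progress, so a check like `reached_threshold` would be applied to an estimate. The true value can be computed only after all records have been screened.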


ISSN: 1471-2288

BMC Medical Research Methodology
