Table 2 Performance of the first ranking attempt for each review

From: Can electronic search engines optimize screening of search results in systematic reviews: an empirical study

| Review | N of records retrieved from MEDLINE (b) | N of included studies ranked in the top 500 (c) | N of included studies indexed in MEDLINE (d) | Proportion of records selected by the search engine (p = a/b) | Recall (q = c/d) | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Gibbs [11] | 5743 | 11 | 27 | 0.09 | 0.41 | <0.001 |
| Yeung [12] | 4996 | 6 | 11 | 0.10 | 0.55 | <0.001 |
| Smeeth [17] | 3119 | 4 | 5 | 0.16 | 0.80 | 0.003 |
| Towheed [18] | 1556 | 6 | 17 | 0.32 | 0.35 | 0.80 |
| Shelley [19] | 1486 | 5 | 5 | 0.34 | 1.00 | 0.004 |
| Karjalainen [20] | 1244 | 2 | 2 | 0.40 | 1.00 | 0.16 |
| Malthaner [21] | 2321 | 6 | 6 | 0.22 | 1.00 | <0.001 |
| Bowen [22] | 4629 | 12 | 14 | 0.11 | 0.86 | <0.001 |
| Mulrow [23] | 1405 | 36 | 39 | 0.36 | 0.92 | <0.001 |
| Overall | 26499 | 88 | 126 | 0.17 | 0.70 |  |
  1. Suppose there are b records in the initial retrieval. Suppose the top a (here we consider a = 500) of these records are selected by the search engine, i.e. a proportion p = a/b. Suppose further that this subset includes c of the d relevant records from the initial retrieval, i.e. a proportion q = c/d. If the search engine performs no better than would be expected by chance, then we would expect q = p. For each systematic review, we treated c as a binomial random variable with denominator d, and conducted a two-sided exact binomial test of the hypothesis that the expected value of q was equal to p.
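As a concrete illustration of the test described in the footnote, the sketch below reproduces the calculation for a single review using SciPy's `binomtest`. The figures (b = 5743, c = 11, d = 27, a = 500) are taken from the Gibbs row of the table above; the helper function name and structure are illustrative, not the authors' original code.

```python
# Minimal sketch of the per-review exact binomial test described in the table footnote.
# Assumes SciPy >= 1.7 (scipy.stats.binomtest); names are illustrative only.
from scipy.stats import binomtest


def rank_test(b: int, c: int, d: int, a: int = 500):
    """Two-sided exact binomial test that the recall q = c/d equals
    the proportion of records selected, p = a/b (i.e. chance performance)."""
    p = a / b                                    # proportion of the retrieval selected
    result = binomtest(c, n=d, p=p, alternative="two-sided")
    return p, c / d, result.pvalue


# Example: the Gibbs review row (b = 5743, c = 11, d = 27)
p, q, pval = rank_test(b=5743, c=11, d=27)
print(f"p = {p:.2f}, recall q = {q:.2f}, exact binomial p-value = {pval:.3g}")
# Consistent with the table: p ≈ 0.09, q ≈ 0.41, p-value < 0.001
```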