Table 2 Performance of the first ranking attempt for each review

From: Can electronic search engines optimize screening of search results in systematic reviews: an empirical study

| Review | N of records retrieved from MEDLINE (b) | N of included studies ranked in the top 500 (c) | N of included studies indexed in MEDLINE (d) | Proportion of records selected by the search engine (p = a/b) | Recall (q = c/d) | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Gibbs [11] | 5743 | 11 | 27 | 0.09 | 0.41 | <0.001 |
| Yeung [12] | 4996 | 6 | 11 | 0.10 | 0.55 | <0.001 |
| Smeeth [17] | 3119 | 4 | 5 | 0.16 | 0.80 | 0.003 |
| Towheed [18] | 1556 | 6 | 17 | 0.32 | 0.35 | 0.80 |
| Shelley [19] | 1486 | 5 | 5 | 0.34 | 1.00 | 0.004 |
| Karjalainen [20] | 1244 | 2 | 2 | 0.40 | 1.00 | 0.16 |
| Malthaner [21] | 2321 | 6 | 6 | 0.22 | 1.00 | <0.001 |
| Bowen [22] | 4629 | 12 | 14 | 0.11 | 0.86 | <0.001 |
| Mulrow [23] | 1405 | 36 | 39 | 0.36 | 0.92 | <0.001 |
| Overall | 26499 | 88 | 126 | 0.17 | 0.70 | |

 
  1. Suppose there are b records in the initial retrieval. Suppose the top a (here we consider a = 500) of these records are selected by the search engine, i.e. a proportion p = a/b. Suppose further that this subset includes c of the d relevant records from the initial retrieval, i.e. a proportion q = c/d. If the search engine performs no better than would be expected by chance, then we would expect q = p. For each systematic review, we treated c as a binomial random variable with denominator d, and conducted a two-sided exact binomial test of the hypothesis that the expected value of q was equal to p.
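
The test described in the footnote can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and it assumes that "two-sided" means summing the probabilities of every outcome that is no more likely than the observed one, since the footnote does not say which two-sided convention was used. The three example rows are copied from the table.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_two_sided(c: int, d: int, p: float) -> float:
    """Two-sided exact binomial p-value for H0: the expected value of c/d is p.

    Assumption: the p-value is the total probability of all outcomes
    that are no more likely than the observed count c.
    """
    pmfs = [binom_pmf(k, d, p) for k in range(d + 1)]
    # Small relative tolerance so floating-point ties still count as extreme.
    threshold = pmfs[c] * (1 + 1e-7)
    return min(1.0, sum(x for x in pmfs if x <= threshold))


A = 500  # number of top-ranked records kept (a in the footnote)

# (review, b, c, d) copied from three rows of the table
rows = [
    ("Smeeth", 3119, 4, 5),       # table reports p-value 0.003
    ("Shelley", 1486, 5, 5),      # table reports p-value 0.004
    ("Karjalainen", 1244, 2, 2),  # table reports p-value 0.16
]

for review, b, c, d in rows:
    p = A / b  # proportion of records selected by the search engine
    q = c / d  # recall among the relevant records
    p_value = exact_binomial_two_sided(c, d, p)
    print(f"{review}: p = {p:.2f}, q = {q:.2f}, p-value = {p_value:.3f}")
```

With small d, as in the Karjalainen row, even perfect recall cannot reach conventional significance, because only d + 1 outcomes are possible and the most extreme one already has a sizeable probability under chance.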