Table 2 Inter-rater agreements

From: Methodological insights into ChatGPT’s screening performance in systematic reviews

| Rater    | Versus   | Kappa (κ) | 95% CI    |
|----------|----------|-----------|-----------|
| GP 1     | GP 2     | 0.47      | 0.39–0.55 |
| GP 1     | GP 3     | 0.38      | 0.31–0.45 |
| GP 1     | Expert 1 | 0.53      | 0.45–0.60 |
| GP 1     | Expert 2 | 0.48      | 0.40–0.55 |
| GP 1     | ChatGPT  | 0.28      | 0.24–0.33 |
| GP 2     | GP 3     | 0.51      | 0.43–0.59 |
| GP 2     | Expert 1 | 0.60      | 0.52–0.67 |
| GP 2     | Expert 2 | 0.57      | 0.49–0.65 |
| GP 2     | ChatGPT  | 0.20      | 0.16–0.23 |
| GP 3     | Expert 1 | 0.66      | 0.59–0.72 |
| GP 3     | Expert 2 | 0.59      | 0.52–0.65 |
| GP 3     | ChatGPT  | 0.30      | 0.25–0.34 |
| Expert 1 | Expert 2 | 0.79      | 0.73–0.84 |
| Expert 1 | ChatGPT  | 0.29      | 0.25–0.34 |
| Expert 2 | ChatGPT  | 0.28      | 0.24–0.33 |
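The pairwise agreements above are Cohen's kappa statistics with 95% confidence intervals. As a sketch of how one such cell might be computed, the snippet below derives kappa and a CI from a 2×2 agreement table of include/exclude screening decisions; the counts are hypothetical, and the interval uses the standard large-sample normal approximation rather than whatever method the authors actually used:

```python
import math

def cohen_kappa(table):
    """Cohen's kappa with an approximate 95% CI from a square agreement table.

    table[i][j] = number of records rater A put in category i
    and rater B put in category j (e.g. include/exclude).
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of records on the diagonal
    p_o = sum(table[i][i] for i in range(k)) / n
    # Chance agreement from the two raters' marginal proportions
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_e = sum(row[i] * col[i] for i in range(k))
    kappa = (p_o - p_e) / (1 - p_e)
    # Simple large-sample standard error for kappa
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)

# Hypothetical pair: 20 records both include, 65 both exclude, 15 disagreements
kappa, (lo, hi) = cohen_kappa([[20, 5], [10, 65]])
print(f"kappa = {kappa:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

A kappa near 0.2–0.3, as for the ChatGPT pairings in the table, indicates only slight-to-fair agreement, whereas the Expert 1 vs Expert 2 value of 0.79 is substantial agreement on the conventional Landis–Koch scale.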