
Table 1 Inter-rater reliability on the Newcastle-Ottawa Scale (NOS) assessments, by item

From: Newcastle-Ottawa Scale: comparing reviewers’ to authors’ assessments

| Item | Agreement, κ (95% CI) | Interpretation [8] | 0 point difference^c | ±1 point difference^c | ≥ ±2 points difference^c |
|---|---|---|---|---|---|
| Representativeness of the exposed cohort | 0.03 (−0.10, 0.15) | Slight | 43 (66.2%) | 22 (33.8%) | 0 (0%) |
| Selection of the non-exposed cohort | 0.00 (0.00, 0.00) | Slight | 53 (81.5%) | 12 (18.5%) | 0 (0%) |
| Ascertainment of exposure | −0.02 (−0.08, 0.04) | Poor | 12 (18.5%) | 53 (81.5%) | 0 (0%) |
| Demonstration that outcome of interest was not present at start of study | 0.09 (−0.16, 0.35) | Slight | 47 (72.3%) | 18 (27.7%) | 0 (0%) |
| Comparability | 0.00^a (−0.11, 0.12) | Slight | 38 (58.5%) | 18 (27.7%) | 9 (13.8%) |
| Assessment of outcome | −0.04 (−0.09, 0.00) | Poor | 59 (90.8%) | 6 (9.2%) | 0 (0%) |
| Was follow-up long enough for outcomes to occur | −0.06 (−0.22, 0.10) | Poor | 31 (47.7%) | 34 (52.3%) | 0 (0%) |
| Adequacy of follow-up of cohorts | 0.15 (−0.19, 0.48) | Slight | 57 (87.7%) | 8 (12.3%) | 0 (0%) |
| Total NOS score | −0.004^a (−0.11, 0.11) | Poor | 15 (23.1%) | 20 (30.8%) | 30 (46.1%) |
| Total categorized NOS score | 0.14^b (−0.02, 0.29) | Slight | 44 (67.7%) | 21 (32.3%) | 0 (0%) |

Abbreviation: 95% CI = 95% confidence interval.
^a Linear weighted kappa was used for both Comparability and the Total NOS score; all other kappas were unweighted (i.e., Cohen’s kappa was applied).
^b Quadratic weighted kappa was used, assuming the differences between the very high risk, high risk, and low risk categories were comparably unequal.
^c Number of studies with a 0, ±1, or ≥ ±2 point difference between reviewer and author assessments, by item.
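
The footnotes above distinguish unweighted (Cohen’s) kappa from linear and quadratic weighted kappa. As a rough illustration of those agreement statistics only, the sketch below computes all three with scikit-learn; the reviewer and author score arrays are hypothetical and do not reproduce the study data, and scikit-learn is an assumed tool rather than the software used by the authors.

```python
# Minimal sketch of the kappa statistics referenced in the table footnotes.
# All score arrays below are hypothetical and purely illustrative.
from sklearn.metrics import cohen_kappa_score

# Unweighted Cohen's kappa: binary NOS item scores (0/1), as used for most items.
reviewer_item = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
author_item   = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
print("Unweighted kappa:", cohen_kappa_score(reviewer_item, author_item))

# Linear weighted kappa: ordinal scores such as the total NOS score (0-9)
# or the Comparability item (0-2), where a 2-point disagreement counts
# twice as heavily as a 1-point disagreement.
reviewer_total = [6, 7, 8, 5, 9, 6, 7, 8]
author_total   = [7, 7, 6, 6, 8, 7, 9, 6]
print("Linear weighted kappa:",
      cohen_kappa_score(reviewer_total, author_total, weights="linear"))

# Quadratic weighted kappa: categorized NOS score, coded here (hypothetically)
# as 0 = very high risk, 1 = high risk, 2 = low risk, so larger category
# jumps are penalized disproportionately.
reviewer_cat = [0, 1, 2, 1, 0, 2, 2, 1]
author_cat   = [0, 2, 2, 1, 1, 2, 1, 1]
print("Quadratic weighted kappa:",
      cohen_kappa_score(reviewer_cat, author_cat, weights="quadratic"))
```

The choice of weighting matters because weighted kappa credits near-misses (e.g., a ±1 point difference) as partial agreement, whereas unweighted kappa treats any disagreement as total disagreement.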