
Table 3 Intra-rater agreement between two assessments over time of individual raters and of all raters pooled

From: Quality assessment of expert answers to lay questions about cystic fibrosis from various language zones in Europe: the ECORN-CF project

Content quality

Rater             Complete congruence   Discrepancy 1 grade   Discrepancy 2 grades   Weighted kappa ± standard error   p value
Rater 1           11/23 (48%)           11/23 (48%)           1/23 (4%)              0.020 ± 0.172                     0.909
Rater 2           19/25 (76%)           6/25 (24%)            0/25 (0%)              0.559 ± 0.130                     < 0.001
Rater 3           17/25 (68%)           8/25 (32%)            0/25 (0%)              0.460 ± 0.123                     < 0.001
Rater 4           13/22 (59%)           8/22 (36%)            1/22 (5%)              0.236 ± 0.152                     0.120
Mean all raters   21/25 (84%)           4/25 (16%)            0/25 (0%)              0.669 ± 0.149                     < 0.001

Formal quality

Rater             Complete congruence   Discrepancy 1 grade   Discrepancy 2 grades   Weighted kappa ± standard error   p value
Rater 1           10/25 (40%)           12/25 (48%)           3/25 (12%)             0.145 ± 0.150                     0.336
Rater 2           20/25 (80%)           4/25 (16%)            1/25 (4%)              0.650 ± 0.133                     < 0.001
Rater 3           13/25 (52%)           11/25 (44%)           1/25 (4%)              0.147 ± 0.161                     0.360
Rater 4           11/22 (50%)           8/22 (36%)            3/22 (14%)             0.000 ± 0.000                     1
Rater 5           19/24 (79%)           4/24 (17%)            1/24 (4%)              0.410 ± 0.249                     0.100
Mean all raters   13/25 (52%)           11/25 (44%)           1/25 (4%)              0.169 ± 0.150                     0.260

  1. Twenty-five expert answers were scored at two different points in time by the same raters. The numbers give the number of expert answers (percentage in brackets) that received the same grade both times (complete congruence), that were scored one grade lower or higher the second time (discrepancy 1 grade), or that were scored two grades lower or higher the second time (discrepancy 2 grades). If the total number of expert answers is lower than 25, the respective rater regarded some of the expert answers as "unscorable". Rater 5, a representative of the German CF patient organization and not a member of a care team, scored only the formal aspect of the answers. Kappa values for agreement were interpreted according to the scale of Landis and Koch [13] (poor < 0; slight 0.00-0.20; fair 0.21-0.40; moderate 0.41-0.60; substantial 0.61-0.80; almost perfect 0.81-1.00). A p value < 0.05 was regarded as significant.
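
A minimal sketch of how a weighted kappa with the Landis and Koch interpretation used in this table could be computed, assuming linear weighting of grade discrepancies (the weighting scheme is not stated here) and using scikit-learn's cohen_kappa_score; the grade pairs in the example are invented for illustration and are not the study data.

```python
# Sketch: linearly weighted kappa for one rater's two assessments of the same answers.
# The grades below are hypothetical, NOT the ECORN-CF data; linear weights are an assumption.
from sklearn.metrics import cohen_kappa_score

first_assessment  = [1, 2, 2, 3, 1, 2, 4, 3, 2, 1]   # grades given at the first time point
second_assessment = [1, 2, 3, 3, 1, 2, 3, 3, 2, 2]   # grades given at the second time point

kappa = cohen_kappa_score(first_assessment, second_assessment, weights="linear")

def interpret(k):
    # Landis and Koch [13] scale as quoted in the table footnote.
    if k < 0:
        return "poor"
    if k <= 0.20:
        return "slight"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "substantial"
    return "almost perfect"

print(f"weighted kappa = {kappa:.3f} ({interpret(kappa)} agreement)")
```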