Table 2 True positive and false positive rates as a function of generating IRT model, DIF size, number of DIF items, and mean difference between modes

From: Comparison of two Bayesian methods to detect mode effects between paper-based and computerized adaptive assessments: a preliminary Monte Carlo study

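| IRT model | DIF size | DIF % | Diff. mean θ | Robust Z TP% | Robust Z FP% | Bayes 95% CrI TP% | Bayes 95% CrI FP% |
|---|---|---|---|---|---|---|---|
| 1PL | 0.42 | 10 | 0 | 54.90 | 0.91 | 60.35 | 1.36 |
| 1PL | 0.42 | 10 | 1 | 70.00 | 2.36 | 66.35 | 3.88 |
| 1PL | 0.42 | 30 | 0 | 69.66 | 0.29 | 70.67 | 0.44 |
| 1PL | 0.42 | 30 | 1 | 71.09 | 1.05 | 71.91 | 2.58 |
| 1PL | 0.63 | 10 | 0 | 78.00 | 0.56 | 83.00 | 0.89 |
| 1PL | 0.63 | 10 | 1 | 75.00 | 3.56 | 81.00 | 5.22 |
| 1PL | 0.63 | 30 | 0 | 82.67 | 2.57 | 87.00 | 3.00 |
| 1PL | 0.63 | 30 | 1 | 76.33 | 2.14 | 79.33 | 3.29 |
| 2PL | 0.42 | 10 | 0 | 60.82 | 0.06 | 60.42 | 0.06 |
| 2PL | 0.42 | 10 | 1 | 58.24 | 2.09 | 55.10 | 3.89 |
| 2PL | 0.42 | 30 | 0 | 62.71 | 0.14 | 66.09 | 0.28 |
| 2PL | 0.42 | 30 | 1 | 66.00 | 4.86 | 66.33 | 6.86 |
| 2PL | 0.63 | 10 | 0 | 72.00 | 0.33 | 77.00 | 0.55 |
| 2PL | 0.63 | 10 | 1 | 70.00 | 2.39 | 77.00 | 3.56 |
| 2PL | 0.63 | 30 | 0 | 72.33 | 3.14 | 77.67 | 3.14 |
| 2PL | 0.63 | 30 | 1 | 66.33 | 3.33 | 69.00 | 5.00 |
| Average | | | | 69.13 | 1.86 | 71.76 | 2.75 |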

  1. 1PL = one-parameter item response model; 2PL = two-parameter item response model; CAT = computerized adaptive testing; CrI = credible interval; DIF = differential item functioning; DIF% = percentage of items simulated with DIF; FP% = percentage of false positive DIF results; IRT = item response theory; TP% = percentage of true positive DIF results; θ = person measure.
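
As an arithmetic check, the Average row appears to be the unweighted mean of the 16 simulated conditions. The table itself does not state how the averages were computed, so this is an assumption. A minimal Python sketch that recomputes it from the tabled values follows; the names `rows` and `column_means` are illustrative and do not come from the article.

```python
# Rates copied from the 16 condition rows of Table 2, in table order
# (1PL rows first, then 2PL rows).
# Each tuple: (Robust Z TP%, Robust Z FP%, Bayes 95% CrI TP%, Bayes 95% CrI FP%).
rows = [
    (54.90, 0.91, 60.35, 1.36),
    (70.00, 2.36, 66.35, 3.88),
    (69.66, 0.29, 70.67, 0.44),
    (71.09, 1.05, 71.91, 2.58),
    (78.00, 0.56, 83.00, 0.89),
    (75.00, 3.56, 81.00, 5.22),
    (82.67, 2.57, 87.00, 3.00),
    (76.33, 2.14, 79.33, 3.29),
    (60.82, 0.06, 60.42, 0.06),
    (58.24, 2.09, 55.10, 3.89),
    (62.71, 0.14, 66.09, 0.28),
    (66.00, 4.86, 66.33, 6.86),
    (72.00, 0.33, 77.00, 0.55),
    (70.00, 2.39, 77.00, 3.56),
    (72.33, 3.14, 77.67, 3.14),
    (66.33, 3.33, 69.00, 5.00),
]


def column_means(table):
    """Return the unweighted mean of each column, rounded to two decimals."""
    n = len(table)
    return tuple(round(sum(col) / n, 2) for col in zip(*table))


averages = column_means(rows)
print(averages)
```

Under that assumption, the four column means match the Average row of the table (69.13, 1.86, 71.76, 2.75), which suggests each condition was weighted equally regardless of how many DIF items it contained.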