Table 4 Univariate and multivariate multilevel logistic regression to predict incorrect detection of mode effects defined by Robust Z and Bayesian 95% credible interval as a function of study variables

From: Comparison of two Bayesian methods to detect mode effects between paper-based and computerized adaptive assessments: a preliminary Monte Carlo study

| Model/Predictor | Univariate OR | Univariate AUC a | Multivariate OR | Multivariate 95% CI |
|---|---|---|---|---|
| Robust Z (Model AUC = 0.77) | | | | |
| Size of DIF | 1.93** | 0.55 | 2.01** | (1.36, 2.97) |
| Percentage of DIF | 1.44 | 0.55 | 1.48 | (0.99, 2.20) |
| 2PL IRT Model | 1.14 | 0.52 | 0.96 | (0.63, 1.46) |
| Diff. Mean θ = 1.0 | 3.31** | 0.59 | 3.95** | (2.56, 6.08) |
| CAT Item Usage | 1.91** | 0.54 | 4.17** | (3.11, 5.60) |
| Item Difficulty | 0.28** | 0.62 | 0.12** | (0.08, 0.19) |
| Item Discrimination | 1.64** | 0.56 | 1.23 | (0.96, 1.58) |
| Bayesian 95% Credible Interval (Model AUC = 0.74) | | | | |
| Size of DIF | 1.62* | 0.55 | 1.61** | (1.20, 2.15) |
| Percentage of DIF | 1.33 | 0.53 | 1.30 | (0.97, 1.75) |
| 2PL IRT Model | 1.14 | 0.52 | 1.08 | (0.80, 1.47) |
| Diff. Mean θ = 1.0 | 1.28E+08** | 0.62 | 4.01** | (2.90, 5.55) |
| CAT Item Usage | 0.96 | 0.44 | 2.36** | (1.82, 3.06) |
| Item Difficulty | 0.30** | 0.65 | 0.16** | (0.11, 0.22) |
| Item Discrimination | 1.19 | 0.51 | 1.02 | (0.82, 1.26) |

  1. Incorrect detection of mode effects = false-positive identification of DIF due to mode among items not simulated with mode DIF; OR = odds ratio; AUC = area under the ROC curve; CI = 95% confidence interval; IRT = item response theory model used to generate response data and parameters used in CAT; CAT item usage = number of times a given item was administered by CAT divided by 100; * p < .05; ** p < .01.
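
As a reading aid, the following minimal sketch shows how a reported odds ratio and its 95% CI relate on the log-odds scale, and how the per-100 scaling of CAT item usage translates to a single administration. It assumes the intervals are Wald-type (symmetric on the log scale, critical value 1.96), which the table does not state. The function name `wald_summary` and the variable names are hypothetical and not from the article.

```python
import math

def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back out the log-odds coefficient, its approximate standard error,
    and the Wald z statistic from a reported OR and 95% CI.

    Assumption (not stated in the table): the CI is a Wald interval,
    i.e. exp(beta +/- z_crit * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se, beta / se

# Multivariate Robust Z row, Size of DIF: OR = 2.01, 95% CI (1.36, 2.97)
beta, se, z = wald_summary(2.01, 1.36, 2.97)

# CAT item usage is scaled as administrations / 100, so the reported OR
# applies to 100 additional administrations. Under the logistic model the
# implied OR for one additional administration is the 100th root.
or_per_100 = 4.17  # multivariate Robust Z row, CAT Item Usage
or_per_admin = or_per_100 ** (1 / 100)

print(f"beta={beta:.3f}, se={se:.3f}, z={z:.2f}")
print(f"OR per single administration: {or_per_admin:.4f}")
```

Under the Wald assumption, the midpoint of the log-scale interval should roughly equal the log of the reported OR. That gives a quick consistency check on any row of the table.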