Comparison of two Bayesian methods to detect mode effects between paper-based and computerized adaptive assessments: a preliminary Monte Carlo study

Table 1 Summary of CAT simulations by underlying measurement model, DIF size, mean CAT measures and percentage of DIF items

IRT Model	CAT to Full-Scale θ Correlation				Mean Standard Error
IRT Model	1PL		2PL		1PL		2PL
DIF Size	0.42	0.63	0.42	0.63	0.42	0.63	0.42	0.63
Diff. Mean θ = 0
DIF % = 10	0.97	0.97	0.97	0.97	0.26	0.26	0.23	0.24
DIF % = 30	0.97	0.96	0.97	0.97	0.26	0.26	0.24	0.24
Diff. Mean θ = 1
DIF % = 10	0.97	0.97	0.97	0.97	0.26	0.26	0.24	0.24
DIF % = 30	0.97	0.96	0.97	0.97	0.26	0.26	0.24	0.24
Average	0.97	0.97	0.97	0.97	0.26	0.26	0.24	0.24

1PL = one-parameter item response model; 2PL = two-parameter item response model; CAT = computerized adaptive testing; DIF = differential item functioning; IRT = item response theory; θ = person measure.
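
As a reading aid, the Average row can be checked against the four condition rows. The sketch below is a minimal consistency check, not part of the original study: it assumes the Average row is the unweighted mean of the four conditions, and it works from the two-decimal values printed in Table 1, so small gaps from rounding are expected. The names rows, published_average, column_means and max_abs_gap are hypothetical and introduced only for illustration.

```python
# Values transcribed from Table 1. Column order in every list:
# correlation (1PL 0.42, 1PL 0.63, 2PL 0.42, 2PL 0.63),
# mean standard error (1PL 0.42, 1PL 0.63, 2PL 0.42, 2PL 0.63).
rows = {
    ("diff_mean_0", "dif_10"): [0.97, 0.97, 0.97, 0.97, 0.26, 0.26, 0.23, 0.24],
    ("diff_mean_0", "dif_30"): [0.97, 0.96, 0.97, 0.97, 0.26, 0.26, 0.24, 0.24],
    ("diff_mean_1", "dif_10"): [0.97, 0.97, 0.97, 0.97, 0.26, 0.26, 0.24, 0.24],
    ("diff_mean_1", "dif_30"): [0.97, 0.96, 0.97, 0.97, 0.26, 0.26, 0.24, 0.24],
}
published_average = [0.97, 0.97, 0.97, 0.97, 0.26, 0.26, 0.24, 0.24]


def column_means(table_rows):
    """Unweighted mean of each column across the condition rows (assumed rule)."""
    columns = zip(*table_rows.values())
    return [sum(col) / len(col) for col in columns]


def max_abs_gap(computed, published):
    """Largest absolute difference between recomputed and published averages."""
    return max(abs(c - p) for c, p in zip(computed, published))


means = column_means(rows)
gap = max_abs_gap(means, published_average)
print([f"{m:.4f}" for m in means], f"max gap = {gap:.4f}")
```

Because the inputs are already rounded, a gap of up to half a unit in the second decimal place is consistent with the published Average row; a larger gap would suggest a transcription problem or a different averaging rule.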

ISSN: 1471-2288