Table 3 Simulation studies: square root of MSE by different sample sizes

From: Prevalence estimation by joint use of big data and health survey: a demonstration study using electronic health records in New York city

| Size of health survey (n1) | Size of subjects linked between two sources (n12) | Health Survey | Post-stratified EHR | Mosteller estimator | Subject-level imputation model |
|---|---|---|---|---|---|
| 250 | 50 | 0.033 | 0.019 | 0.026 | 0.049 |
| 250 | 125 | 0.031 | 0.019 | 0.024 | 0.046 |
| 250 | 250 | 0.030 | 0.019 | 0.023 | 0.040 |
| 500 | 100 | 0.022 | 0.019 | 0.019 | 0.032 |
| 500 | 250 | 0.023 | 0.019 | 0.019 | 0.031 |
| 500 | 500 | 0.022 | 0.019 | 0.018 | 0.021 |
| 1000 | 200 | 0.016 | 0.019 | 0.014 | 0.027 |
| 1000 | 500 | 0.015 | 0.019 | 0.015 | 0.022 |
| 1000 | 1000 | 0.016 | 0.019 | 0.015 | 0.014 |

  1. The prevalence (p1) measured in the health survey (Y1) is fixed at 0.3, and the prevalence (p2) measured in the EHR (Y2) is fixed at 0.32. The size of the EHR (n2) is fixed at 100,000. The square root of the MSE for estimating p1 is shown. The best-performing method in each row is highlighted in bold.
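
As a rough sanity check on the magnitudes in this table, the root MSE of a plain sample proportion can be computed from the settings in the note (p1 = 0.3, p2 = 0.32, n2 = 100,000). The sketch below is not the simulation used in the study: it assumes simple random sampling of a binary outcome and no post-stratification, and the function name `rmse` is hypothetical. It uses the standard decomposition MSE = bias^2 + variance.

```python
import math

def rmse(p_target, p_source, n):
    """Root MSE of a sample proportion drawn from a source with
    prevalence p_source, used to estimate p_target.
    Assumes simple random sampling (an assumption, not the study design)."""
    bias = p_source - p_target
    variance = p_source * (1 - p_source) / n
    return math.sqrt(bias ** 2 + variance)

P1 = 0.30      # prevalence in the health survey (estimation target), from the table note
P2 = 0.32      # prevalence in the EHR, from the table note
N2 = 100_000   # EHR size, from the table note

# Unbiased survey estimator: the error is pure sampling variance.
for n1 in (250, 500, 1000):
    print(f"survey n1={n1}: {rmse(P1, P1, n1):.3f}")

# Unadjusted EHR estimator: variance is negligible, so the error is almost all bias.
print(f"unadjusted EHR: {rmse(P1, P2, N2):.3f}")
```

Under these assumptions the survey values come out at about 0.029, 0.020, and 0.014 for n1 = 250, 500, and 1000, the same order as the Health Survey column, and the unadjusted EHR value is about 0.020, essentially the 0.02 gap between p2 and p1. The tabulated values differ slightly, which is expected because the study's simulation design and the post-stratification step are not reproduced here.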