Table 1 Simulation studies: prevalence estimate by four methods

From: Prevalence estimation by joint use of big data and health survey: a demonstration study using electronic health records in New York city

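Column groups: True Population Prevalence (p1, p2); Prevalence Estimate (95% CI) (Health Survey, Post-stratified EHR, Mosteller estimator, Subject-level imputation estimator)

| Prevalence (p1) based on outcome in health survey (Y1) | Prevalence (p2) based on outcome in EHR (Y2) | Health Survey (n1 = 500) | Post-stratified EHR | Mosteller estimator | Subject-level imputation estimator |
|---|---|---|---|---|---|
| 0.3 | 0.30 | 0.300 | 0.299 | 0.300 | 0.303 |
| 0.3 | 0.31 | 0.300 | 0.309 | 0.303 | 0.302 |
| 0.3 | 0.32 | 0.299 | 0.319 | 0.305 | 0.302 |
| 0.3 | 0.33 | 0.298 | 0.329 | 0.305 | 0.303 |
| 0.3 | 0.35 | 0.300 | 0.349 | 0.308 | 0.304 |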

  1. The health survey sample size (n1) and the number of subjects linked between the two sources (n12) are both 500.
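
To make the pattern in the table easier to read, the sketch below computes each estimate minus the true survey-based prevalence p1. It assumes p1 is the target quantity, which the table itself does not state; the variable and function names (`rows`, `methods`, `bias_vs_p1`) are illustrative and not from the article. The values are transcribed from Table 1.

```python
# Rows transcribed from Table 1:
# (p1, p2, health survey, post-stratified EHR, Mosteller, subject-level imputation)
rows = [
    (0.3, 0.30, 0.300, 0.299, 0.300, 0.303),
    (0.3, 0.31, 0.300, 0.309, 0.303, 0.302),
    (0.3, 0.32, 0.299, 0.319, 0.305, 0.302),
    (0.3, 0.33, 0.298, 0.329, 0.305, 0.303),
    (0.3, 0.35, 0.300, 0.349, 0.308, 0.304),
]

# Hypothetical short labels for the four estimate columns.
methods = [
    "health_survey",
    "post_stratified_ehr",
    "mosteller",
    "subject_level_imputation",
]


def bias_vs_p1(table_rows):
    """Return one dict per row: estimate minus true p1, per method.

    Assumption: p1 (prevalence based on the survey outcome Y1) is the target.
    """
    result = []
    for p1, _p2, *estimates in table_rows:
        result.append(
            {name: round(est - p1, 3) for name, est in zip(methods, estimates)}
        )
    return result


biases = bias_vs_p1(rows)
for (p1, p2, *_), row_bias in zip(rows, biases):
    print(f"p1={p1:.2f} p2={p2:.2f}", row_bias)
```

Under that assumption, the post-stratified EHR column differs from p1 by roughly the gap between p2 and p1, while the other three columns stay within about 0.01 of p1 in every row.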