Table 3 Simulation studies: square root of MSE by different sample sizes

From: Prevalence estimation by joint use of big data and health survey: a demonstration study using electronic health records in New York city

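| Size of health survey (n1) | Number of subjects linked between the two sources (n12) | Health survey | Post-stratified EHR | Mosteller estimator | Subject-level imputation model |
|---|---|---|---|---|---|
| 250 | 50 | 0.033 | **0.019** | 0.026 | 0.049 |
| 250 | 125 | 0.031 | **0.019** | 0.024 | 0.046 |
| 250 | 250 | 0.030 | **0.019** | 0.023 | 0.040 |
| 500 | 100 | 0.022 | **0.019** | **0.019** | 0.032 |
| 500 | 250 | 0.023 | **0.019** | **0.019** | 0.031 |
| 500 | 500 | 0.022 | 0.019 | **0.018** | 0.021 |
| 1000 | 200 | 0.016 | 0.019 | **0.014** | 0.027 |
| 1000 | 500 | **0.015** | 0.019 | **0.015** | 0.022 |
| 1000 | 1000 | 0.016 | 0.019 | 0.015 | **0.014** |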
Note: The prevalence (p1) measured in the health survey (Y1) is fixed at 0.3, and the prevalence (p2) measured in the EHR (Y2) is fixed at 0.32. The size of the EHR (n2) is fixed at 100,000. The square root of the MSE for estimating p1 is shown. The best-performing method in each row (lowest value) is highlighted in bold.
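
As a rough sanity check on the scale of these values, the square root of the MSE can be written as sqrt(bias^2 + variance). The sketch below is not the simulation code behind the table. It assumes simple random sampling, and its function names are hypothetical. It computes the analytic value for a plain survey proportion, and for an unadjusted EHR proportion, which is not the post-stratified estimator reported in the table.

```python
import math

# Settings taken from the table note.
P1 = 0.30       # prevalence measured in the health survey (target)
P2 = 0.32       # prevalence measured in the EHR
N2 = 100_000    # size of the EHR


def rmse(bias: float, variance: float) -> float:
    """Square root of MSE, using MSE = bias^2 + variance."""
    return math.sqrt(bias ** 2 + variance)


def proportion_variance(p: float, n: int) -> float:
    """Variance of a sample proportion under simple random sampling (assumption)."""
    return p * (1.0 - p) / n


def survey_rmse(n1: int) -> float:
    """Survey proportion: unbiased for p1, so only the variance contributes."""
    return rmse(0.0, proportion_variance(P1, n1))


def unadjusted_ehr_rmse() -> float:
    """Raw EHR proportion (no post-stratification): bias p2 - p1 dominates."""
    return rmse(P2 - P1, proportion_variance(P2, N2))


for n1 in (250, 500, 1000):
    print(f"survey, n1={n1}: {survey_rmse(n1):.4f}")
print(f"unadjusted EHR: {unadjusted_ehr_rmse():.4f}")
```

Under these assumptions the survey values come out near 0.029, 0.020 and 0.014, the same order as the Health survey column. The unadjusted EHR value is about 0.020, almost entirely bias, which suggests why the EHR column does not change with n1 or n12.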