Table 6. 10-fold cross-validation recall results for partial and fully-contained matches, by PHI type, using the VHA evaluation corpus

From: Evaluating current automatic de-identification methods with Veteran’s health administration clinical documents

10-fold cross-validation experiment

| PHI type | #Inst. | Partial matches: MIST | Partial matches: HIDE | Fully-contained matches: MIST | Fully-contained matches: HIDE |
|---|---|---|---|---|---|
| Patient Name | 206 | 0.51 | 0.51 | 0.49 | 0.51 |
| Relative Name | 30 | 0 | 0.13 | 0 | 0.10 |
| Healthcare Provider Name | 492 | 0.58 | 0.61 | 0.54 | 0.59 |
| Other Person Name | 20 | 0 | 0.20 | 0 | 0.20 |
| Street City | 137 | 0.28 | 0.48 | 0.28 | 0.43 |
| State Country | 161 | 0.58 | 0.71 | 0.58 | 0.70 |
| Deployment | 43 | 0.19 | 0.28 | 0.16 | 0.21 |
| ZIP code | 4 | 0 | 0 | 0 | 0 |
| Healthcare Unit Name | 1453 | 0.55 | 0.61 | 0.52 | 0.58 |
| Other Org Name | 86 | 0.10 | 0.29 | 0.09 | 0.25 |
| Date | 2547 | 0.93 | 0.94 | 0.89 | 0.92 |
| Age > 89 | 4 | 0 | 0 | 0 | 0 |
| Phone Number | 90 | 0.27 | 0.88 | 0.23 | 0.78 |
| Electronic Address | 4 | 0.75 | 0.75 | 0 | 0.75 |
| SSN | 16 | 0.37 | 0.62 | 0.37 | 0.56 |
| Other ID Number | 123 | 0.37 | 0.72 | 0.34 | 0.65 |
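
The two match types reported in the table can be illustrated with a small sketch. It assumes, since the table itself does not define them, that a partial match means the system annotation overlaps the reference annotation by at least one character, and that a fully-contained match means the system annotation covers the whole reference annotation. The span representation, the function names, and the toy offsets are hypothetical and are not taken from MIST, HIDE, or the VHA corpus.

```python
from typing import Callable, List, Tuple

# Assumption: an annotation is a (start, end) character-offset pair, end exclusive.
Span = Tuple[int, int]


def overlaps(ref: Span, pred: Span) -> bool:
    """Partial match (assumed definition): the spans share at least one character."""
    return pred[0] < ref[1] and ref[0] < pred[1]


def contains(ref: Span, pred: Span) -> bool:
    """Fully-contained match (assumed definition): pred covers all of ref."""
    return pred[0] <= ref[0] and ref[1] <= pred[1]


def recall(refs: List[Span], preds: List[Span],
           match: Callable[[Span, Span], bool]) -> float:
    """Fraction of reference spans matched by at least one predicted span."""
    if not refs:
        return 0.0
    found = sum(1 for r in refs if any(match(r, p) for p in preds))
    return found / len(refs)


# Toy offsets for illustration only; not data from the evaluation corpus.
refs = [(0, 10), (20, 30), (40, 50), (60, 70)]
preds = [(0, 10), (22, 35), (38, 52)]

partial_recall = recall(refs, preds, overlaps)
contained_recall = recall(refs, preds, contains)
print(partial_recall, contained_recall)
```

Under these assumed definitions every fully-contained match is also a partial match, so fully-contained recall can never exceed partial recall for the same system and PHI type, which is the pattern seen in each row of the table.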