Table 6. 10-fold cross-validation recall results for partial and fully-contained matches, by PHI type, using the VHA evaluation corpus

From: Evaluating current automatic de-identification methods with Veteran’s health administration clinical documents

10-fold cross-validation experiment

| PHI type | #Inst. | Partial matches: MIST | Partial matches: HIDE | Fully-contained matches: MIST | Fully-contained matches: HIDE |
|---|---|---|---|---|---|
| Patient Name | 206 | 0.51 | 0.51 | 0.49 | 0.51 |
| Relative Name | 30 | 0 | 0.13 | 0 | 0.10 |
| Healthcare Provider Name | 492 | 0.58 | 0.61 | 0.54 | 0.59 |
| Other Person Name | 20 | 0 | 0.20 | 0 | 0.20 |
| Street City | 137 | 0.28 | 0.48 | 0.28 | 0.43 |
| State Country | 161 | 0.58 | 0.71 | 0.58 | 0.70 |
| Deployment | 43 | 0.19 | 0.28 | 0.16 | 0.21 |
| ZIP code | 4 | 0 | 0 | 0 | 0 |
| Healthcare Unit Name | 1453 | 0.55 | 0.61 | 0.52 | 0.58 |
| Other Org Name | 86 | 0.10 | 0.29 | 0.09 | 0.25 |
| Date | 2547 | 0.93 | 0.94 | 0.89 | 0.92 |
| Age > 89 | 4 | 0 | 0 | 0 | 0 |
| Phone Number | 90 | 0.27 | 0.88 | 0.23 | 0.78 |
| Electronic Address | 4 | 0.75 | 0.75 | 0 | 0.75 |
| SSN | 16 | 0.37 | 0.62 | 0.37 | 0.56 |
| Other ID Number | 123 | 0.37 | 0.72 | 0.34 | 0.65 |
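
The table reports recall under two matching criteria without restating how they differ. As a minimal sketch, assuming a partial match means any character overlap between a reference PHI annotation and a system annotation, and a fully-contained match means the reference annotation lies entirely inside a system annotation, recall per criterion could be computed as below. The span representation, function names, and toy offsets are hypothetical illustrations, not the evaluation code or data behind this table.

```python
from typing import Callable, List, Tuple

# Assumed representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def overlaps(ref: Span, sys: Span) -> bool:
    """Partial match (assumed definition): the two spans share at least one character."""
    return ref[0] < sys[1] and sys[0] < ref[1]


def contains(ref: Span, sys: Span) -> bool:
    """Fully-contained match (assumed definition): the system span covers the whole reference span."""
    return sys[0] <= ref[0] and ref[1] <= sys[1]


def recall(reference: List[Span], system: List[Span],
           match: Callable[[Span, Span], bool]) -> float:
    """Fraction of reference annotations matched by at least one system annotation."""
    if not reference:
        return 0.0
    hits = sum(1 for r in reference if any(match(r, s) for s in system))
    return hits / len(reference)


# Toy spans for illustration only (not taken from the VHA corpus).
reference = [(0, 10), (20, 30), (40, 50), (60, 70)]
system = [(0, 10), (22, 35), (38, 52)]

partial_recall = recall(reference, system, overlaps)    # 3 of 4 reference spans overlap
contained_recall = recall(reference, system, contains)  # 2 of 4 reference spans are fully covered
print(partial_recall, contained_recall)
```

Under these assumed definitions every fully-contained match is also a partial match, so fully-contained recall can never exceed partial recall for the same system and PHI type, which agrees with every row of the table.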