
Table 2 Types of PHI and other data detected by the de-identification systems

From: Automatic de-identification of textual documents in the electronic health record: a review of recent research

Columns: De-identification system; PHI (Person names, Ages > 89, Geographical locations, Hospitals/healthcare organizations, Dates, Contact information, IDs); Clinical data
Aramaki P+D None
Beckwith P+D None
Berman UMLS
Fielstein P+D - None
Friedlin P+D None
Gardner P - - - None
Guo P+D None
Gupta P+D None
Hara P+D None
Morrison MedLEE
Neamatullah P+D None
Ruch P+D - - MEDTAG
Sweeney P+D None
Szarvas P+D None
Taira P - - - - - - None
Thomas P+D - - - - - - None
Uzuner P+D - None
Wellner P+D None
  1. ✸ Only extracted concepts (i.e., UMLS or other clinical concepts) are retained.
  2. P+D = patient and healthcare provider names; P = patient names