Table 2 Types of PHI and other data detected by the de-identification systems

From: Automatic de-identification of textual documents in the electronic health record: a review of recent research

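| De-identification system | PHI: Person names | PHI: Ages > 89 | PHI: Geographical locations | PHI: Hospitals/HC org. | PHI: Dates | PHI: Contact information | PHI: IDs | Clinical data |
|---|---|---|---|---|---|---|---|---|
| Aramaki | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Beckwith | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Berman | ✸ | ✸ | ✸ | ✸ | ✸ | ✸ | ✸ | UMLS |
| Fielstein | P+D | - | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Friedlin | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Gardner | P | ✔ | - | - | ✔ | - | ✔ | None |
| Guo | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Gupta | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Hara | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Morrison | ✸ | ✸ | ✸ | ✸ | ✸ | ✸ | ✸ | MedLEE |
| Neamatullah | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Ruch | P+D | - | - | ✔ | ✔ | ✔ | ✔ | MEDTAG |
| Sweeney | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Szarvas | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Taira | P | - | - | - | - | - | - | None |
| Thomas | P+D | - | - | - | - | - | - | None |
| Uzuner | P+D | - | ✔ | ✔ | ✔ | ✔ | ✔ | None |
| Wellner | P+D | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | None |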

  1. ✸ Only extracted concepts (i.e. UMLS or other clinical concepts) are retained.
  2. P+D = Patient and healthcare provider names; P = Patient name