
Table 4 Machine learning features

From: Current approaches to identify sections within clinical narratives from electronic health records: a systematic review

| Reference | Lexical | Syntactical | Semantic | Contextual | Method |
| --- | --- | --- | --- | --- | --- |
| Bramsen et al. [3] | U, N | POS | RT, AT, T | LP | ML |
| Haug et al. [17] | N | NS | NS | NS | ML |
| Chen et al. [5], Dai et al. [7] | C, A | Pun | ST | WL | H |
| Deléger and Névéol [8] | NS | NS | NS | NS | ML |
| Ni et al. [40] | NS | NS | NS | NS | H |
| Jancsary et al. [20] | N | POS, Pun | ST | LP | H |
| Cho et al. [6] | NS | NS | ST | SS, OS | H |
| Li et al. [29] | N | NS | NS | SB | ML |
| Lohr et al. [31] | U | NS | NS | NS | ML |
| Ganesan and Subotin [16] | U, N, C | NS | ST | LP, LL, LC, CC | H |
| Tepper et al. [57] | U, C | Nu | NS | LP, WL, SS, SB | ML |
| Sadoughi et al. [46] | NS | NS | NS | SB | H |
| Apostolova et al. [1] | N, C | Pun | ST | LP, WL, SB | H |
| Mowery et al. [39] | U, N | POS, VT | ST, DI, MN | LP, LL, SB | ML |
| Waranusast et al. [62] | NS | NS | NS | SB | ML |

NS=Not Specified
Lexical: U=Unigram, N=N-gram, C=Capitalized, A=Affixes
Syntactical: POS=Word part of speech, VT=Verb tense, Pun=Punctuation, Nu=Contains or begins with a number
Semantic: ST=Semantic type (e.g. UMLS, LOINC), DI=De-identification tag, MN=Meaning of the number (e.g. phone, dose), RT=Relative temporal word (e.g. later, next, until), AT=Absolute temporal word (e.g. am, pm), T=Topic of the section
Contextual: LP=Line position in the document, LL=Length of a line, WL=White lines before and after a line, LC=Length change from one line to another, SS=Section size, SB=Previous and following section boundaries, OS=Order of sections, CC=Capital and colon use
Method: ML=Machine Learning, H=Hybrid
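
To make a few of the abbreviations above concrete, the following is a minimal sketch of how some lexical, syntactical, and contextual features (C, Nu, LP, LL, LC, WL, CC) might be computed per line of a clinical note. The function name `line_features`, the dictionary keys, and the exact definitions (for example, LP as a 0 to 1 relative position, CC as "starts with a capital and ends with a colon") are illustrative assumptions, not the definitions used by any of the reviewed studies.

```python
import re


def line_features(lines):
    """Compute a few Table 4 style features for each line of a note.

    All definitions here are assumptions made for illustration only.
    """
    n = len(lines)
    feats = []
    for i, line in enumerate(lines):
        text = line.strip()
        # Length of the previous line (0 for the first line).
        prev_len = len(lines[i - 1].strip()) if i > 0 else 0
        feats.append({
            # LP: relative line position in the document, from 0.0 to 1.0.
            "LP": i / (n - 1) if n > 1 else 0.0,
            # LL: length of the line in characters.
            "LL": len(text),
            # LC: length change relative to the previous line.
            "LC": len(text) - prev_len,
            # WL: is there a blank line directly before / after this line?
            "WL_before": i > 0 and lines[i - 1].strip() == "",
            "WL_after": i < n - 1 and lines[i + 1].strip() == "",
            # C: the line starts with a capital letter.
            "C": text[:1].isupper(),
            # CC: starts with a capital and ends with a colon (assumed reading).
            "CC": text[:1].isupper() and text.endswith(":"),
            # Nu: the line contains (or begins with) a digit.
            "Nu": bool(re.search(r"\d", text)),
        })
    return feats


# Hypothetical five-line note, invented for this example.
note = [
    "CHIEF COMPLAINT:",
    "Chest pain for 2 days.",
    "",
    "Medications:",
    "aspirin 81 mg daily",
]
feats = line_features(note)
print(feats[3])
```

Feature dictionaries like these could then be passed to a sequence labeller or a line-level classifier; which model to use is a separate choice that the table records only as ML or H.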