
Table 3 Machine learning studies

From: Current approaches to identify sections within clinical narratives from electronic health records: a systematic review

| Reference | ML method | Training and test data set source | Training data set size | Test data set size | Method |
|---|---|---|---|---|---|
| Bramsen et al. [3] | AdaBoost | M | 60 | CV | ML |
| Haug et al. [17] | Bayesian Network | M | 3483 | CV | ML |
| Chen et al. [5], Dai et al. [7] | Conditional Random Fields | M, RB, CO | 790 | 514 | H |
| Deléger and Névéol [8] | Conditional Random Fields | M | 100 | 600 | ML |
| Ni et al. [40] | Conditional Random Fields and Maximum Entropy Classifier | M, AL | NS | NS | H |
| Jancsary et al. [20] | Conditional Random Fields and Viterbi | M, RB | 2340 | 1003 | H |
| Cho et al. [6] | Expectation Maximization Classifier | M, RB | NS | NS | H |
| Li et al. [29] | Hidden Markov Model and Viterbi | M, RB | 7549 | 2130 | ML |
| Lohr et al. [31] | Logistic Regression | M | 1106 | CV | ML |
| Ganesan and Subotin [16] | Logistic Regression and Viterbi | M, RB | 1800 | 12502 | H |
| Tepper et al. [57] | Maximum Entropy Classifier | M, CO | 1365 | 374 | ML |
| Sadoughi et al. [46] | Neural Network | M, RB | 25842 | 2000 | H |
| Apostolova et al. [1] | Support Vector Machine | M, RB | 3000 | 200 | H |
| Mowery et al. [39] | Support Vector Machine | M | 50 | CV | ML |
| Waranusast et al. [62] | Support Vector Machine and KNN | M | 10694 | CV | ML |

  1. NS=Not Specified; Training and Test Data Set Source: M=Manually created, RB=Using a rule-based approach, CO=Using a data set provided by competition organizers, AL=Using an active learning strategy; Test Data Set Size: CV=Cross Validation; Method: ML=Machine Learning, H=Hybrid