Table 3 Machine learning studies

From: Current approaches to identify sections within clinical narratives from electronic health records: a systematic review

| Reference | ML method | Training and test data set source | Training data set size | Test data set size | Method |
| --- | --- | --- | --- | --- | --- |
| Bramsen et al. [3] | AdaBoost | M | 60 | CV | ML |
| Haug et al. [17] | Bayesian Network | M | 3483 | CV | ML |
| Chen et al. [5], Dai et al. [7] | Conditional Random Fields | M, RB, CO | 790 | 514 | H |
| Deléger and Névéol [8] | Conditional Random Fields | M | 100 | 600 | ML |
| Ni et al. [40] | Conditional Random Fields and Maximum Entropy Classifier | M, AL | NS | NS | H |
| Jancsary et al. [20] | Conditional Random Fields and Viterbi | M, RB | 2340 | 1003 | H |
| Cho et al. [6] | Expectation Maximization Classifier | M, RB | NS | NS | H |
| Li et al. [29] | Hidden Markov Model and Viterbi | M, RB | 7549 | 2130 | ML |
| Lohr et al. [31] | Logistic Regression | M | 1106 | CV | ML |
| Ganesan and Subotin [16] | Logistic Regression and Viterbi | M, RB | 1800 | 12502 | H |
| Tepper et al. [57] | Maximum Entropy Classifier | M, CO | 1365 | 374 | ML |
| Sadoughi et al. [46] | Neural Network | M, RB | 25842 | 2000 | H |
| Apostolova et al. [1] | Support Vector Machine | M, RB | 3000 | 200 | H |
| Mowery et al. [39] | Support Vector Machine | M | 50 | CV | ML |
| Waranusast et al. [62] | Support Vector Machine and KNN | M | 10694 | CV | ML |
NS = Not Specified. Training and Test Data Set Source: M = Manually created, RB = Using a rule-based approach, CO = Using a data set provided by competition organizers, AL = Using an active learning strategy. Test Data Set Size: CV = Cross Validation. Method: ML = Machine Learning, H = Hybrid.