Creating efficiencies in the extraction of data from randomized trials: a prospective evaluation of a machine learning and text mining tool

Table 3 Relevance of the highlighted text fragments among relevant sentences^a

Report section	Data element	Relevant sentences, n total^b	Fragments, n total^c	Relevant fragments, n (%)^d	Exact matches, n (%)^d	Partial matches, n (%)^d
Meta information	Funding source	79	157	124 (79.0)	24 (15.3)	100 (63.7)
	Funding number	35	54	44 (81.5)	27 (50.0)	17 (31.5)
	Registration number	63	104	104 (100.0)	103 (99.0)	1 (1.0)
Enrollment	Eligibility criteria	110	0	0 (0.0)	0 (0.0)	0 (0.0)
	Sample size	125	125	110 (88.0)	92 (73.6)	18 (14.4)
	Enrollment start date	55	51	51 (100.0)	43 (84.3)	8 (15.7)
	Enrollment end date	56	50	47 (94.0)	45 (90.0)	2 (4.0)
	Early stopping	7	3	3 (100.0)	1 (33.3)	2 (66.7)
Intervention	Experimental arm(s)	123	133	74 (55.6)	15 (11.3)	59 (44.4)
	Control arm(s)	121	62	55 (88.7)	34 (54.8)	21 (33.9)
	Route of administration	32	34	30 (88.2)	27 (79.4)	3 (8.8)
	Dose	50	77	70 (90.9)	26 (33.8)	44 (57.1)
	Frequency of administration	45	61	55 (90.1)	44 (72.1)	11 (18.0)
	Duration of treatment	57	56	48 (85.7)	35 (62.5)	13 (23.2)
Outcome	Primary outcome(s)	95	78	74 (94.9)	55 (70.5)	19 (24.4)
	Primary outcome time point	76	86	78 (90.7)	28 (32.6)	50 (58.1)
	Secondary outcome(s)	75	53	51 (96.2)	16 (30.2)	35 (66.0)
	Secondary outcome time point	43	53	51 (96.2)	20 (37.7)	31 (58.5)
Summary measure	Median (IQR), n	57 (54)	59 (33)	53 (47 to 74)	27 (15 to 41)	18 (4 to 34)
Summary measure	Median (IQR), %	-	-	90.4 (86.2 to 96.9)	52.4 (32.8 to 73.2)	28.0 (14.7 to 57.9)

^aExaCT does not provide fragments for publication information. Data are shown for the remaining 18 data elements. Values in italic typeface fall at or below the limit of the lowest quartile
^bAcross all 75 trials, the number of relevant sentences among the 5 sentences reported within the solution for each data element
^cContained within sentences considered to be relevant by the human reviewers (column 3)
^dRelevant fragments of those contained within sentences considered to be relevant by the human reviewers (denominator, column 4)
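
The percentages in the last three columns can be reproduced from the counts, using the total number of fragments (column 4) as the denominator, as footnote d describes. The sketch below is illustrative only and is not part of the study's analysis. The helper names `pct` and `summarize` are hypothetical. Two assumptions are made: relevant fragments equal exact plus partial matches, which holds for every row of the table, and a zero denominator yields 0.0, mirroring how the eligibility criteria row is reported.

```python
def pct(count, denominator):
    """Percentage rounded to one decimal place.

    Assumption: a zero denominator yields 0.0, as in the
    'Eligibility criteria' row of Table 3.
    """
    if denominator == 0:
        return 0.0
    return round(100 * count / denominator, 1)


def summarize(total, exact, partial):
    """Return (relevant n, relevant %, exact %, partial %).

    Assumption: relevant fragments = exact + partial matches.
    """
    relevant = exact + partial
    return relevant, pct(relevant, total), pct(exact, total), pct(partial, total)


# (fragments total, exact matches, partial matches) copied from Table 3
rows = {
    "Funding source": (157, 24, 100),
    "Registration number": (104, 103, 1),
    "Eligibility criteria": (0, 0, 0),
    "Experimental arm(s)": (133, 15, 59),
}

for name, (total, exact, partial) in rows.items():
    print(name, summarize(total, exact, partial))
```

For the funding source row, this gives 124 relevant fragments out of 157 (79.0%), split into 15.3% exact and 63.7% partial matches, which agrees with the table.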

ISSN: 1471-2288