Generation and evaluation of synthetic patient data

Table 5 Mean and standard deviation (in parentheses) of data utility metrics computed on the BREAST small-set

Method	PCD ↓	Log-Cluster ↓	CrCl-RS (1)	CrCl-SR (1)	Supp. Coverage ↑
IM	0.73 (0.0)	-5.38 (0.87)	0.96 (0.0)	1.0 (0.0)	0.99 (0.01)
BN	0.28 (0.01)	-8.38 (1.12)	0.99 (0.0)	1.0 (0.0)	0.98 (0.01)
MPoM	0.03 (0.01)	-10.5 (0.46)	1.0 (0.0)	1.0 (0.0)	1.0 (0.01)
CLGP	0.17 (0.01)	-7.8 (0.65)	0.99 (0.01)	1.0 (0.01)	1.0 (0.0)
MC-MedGAN	0.76 (0.01)	-3.17 (0.1)	1.0 (0.01)	0.75 (0.0)	0.95 (0.01)
MICE-LR	0.07 (0.01)	-8.34 (0.29)	0.99 (0.0)	1.0 (0.0)	1.0 (0.0)
MICE-LR-DESC	0.06 (0.01)	-9.36 (0.49)	0.99 (0.0)	1.0 (0.0)	1.0 (0.01)
MICE-DT	0.02 (0.0)	-11.61 (0.3)	1.01 (0.0)	1.0 (0.0)	0.99 (0.01)

CrCl-RS and CrCl-SR are the cross-classification metrics computed on real → synthetic (RS) and synthetic → real (SR), respectively. Metrics were computed from 10 synthetically generated datasets. The symbol to the right of each metric’s name indicates: ↑ the higher the better, ↓ the lower the better, and (1) the closer to one the better.
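Each table entry summarizes a metric over the 10 synthetic datasets as a mean followed by a standard deviation in parentheses. The following is a minimal sketch of that aggregation, not the authors' code. The function name `summarize`, the use of the sample standard deviation, the two-decimal rounding, and the input values are all assumptions made for illustration.

```python
import statistics


def summarize(values, ndigits=2):
    """Format per-dataset metric values as 'mean (std)'.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the source does not state which estimator was applied.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return f"{round(mean, ndigits)} ({round(std, ndigits)})"


# Hypothetical metric values, one per synthetic dataset (not from the paper).
example_runs = [0.02, 0.03, 0.02, 0.02, 0.03, 0.02, 0.02, 0.03, 0.02, 0.02]
print(summarize(example_runs))
```

With rounding to two decimals, a small spread across datasets is reported as a standard deviation of 0.0, which is consistent with how several cells in the table appear.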

ISSN: 1471-2288