Table 3. Synthetic dataset linkage quality: estimated vs. calculated

From: Estimating parameters for probabilistic linkage of privacy-preserved datasets

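| Data Error Rate | Calculated Probabilities, Highest: Threshold | Calculated Probabilities, Highest: FMeasure | Calculated Probabilities, Estimated: Threshold | Calculated Probabilities, Estimated: FMeasure | EM m-probs and Estimated u-probs, Highest: Threshold | EM m-probs and Estimated u-probs, Highest: FMeasure | EM m-probs and Estimated u-probs, Estimated: Threshold | EM m-probs and Estimated u-probs, Estimated: FMeasure |
|---|---|---|---|---|---|---|---|---|
| 0% | 49 | 1.0000 | 8 | 0.9999 | 49 | 1.0000 | 8 | 0.9999 |
| 1% | 9 | 0.9979 | 16 | 0.9978 | 13 | 0.9979 | 11 | 0.9979 |
| 5% | 8 | 0.9549 | 16 | 0.9541 | 12 | 0.9549 | 11 | 0.9549 |
| 10% | 8 | 0.8443 | 16 | 0.8399 | 12 | 0.8439 | 11 | 0.8436 |
| 20% | 8 | 0.5217 | 16 | 0.4938 | 12 | 0.4999 | 11 | 0.4917 |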