Table 2 Synthetic dataset characteristics

From: Estimating parameters for probabilistic linkage of privacy-preserved datasets

| Field | Unique Values (0% Error) | Discriminating Power (0% Error) | Unique Values (1% Error) | Discriminating Power (1% Error) | Unique Values (5% Error) | Discriminating Power (5% Error) | Unique Values (10% Error) | Discriminating Power (10% Error) | Unique Values (20% Error) | Discriminating Power (20% Error) |
|---|---|---|---|---|---|---|---|---|---|---|
| First Name | 31,183 | 8.91 | 34,595 | 8.92 | 45,914 | 8.99 | 58,046 | 9.08 | 78,256 | 9.29 |
| Middle Name | 25,002 | 7.33 | 28,224 | 7.35 | 38,285 | 7.45 | 48,973 | 7.59 | 67,160 | 7.95 |
| Last Name | 56,507 | 10.87 | 61,198 | 10.88 | 77,088 | 10.96 | 94,925 | 11.07 | 125,483 | 11.35 |
| Dob Year | 112 | 6.49 | 114 | 6.49 | 116 | 6.50 | 117 | 6.51 | 119 | 6.53 |
| Dob Month | 12 | 3.58 | 12 | 3.58 | 12 | 3.58 | 12 | 3.58 | 12 | 3.58 |
| Dob Day | 31 | 4.94 | 31 | 4.94 | 31 | 4.94 | 31 | 4.94 | 31 | 4.93 |
| Sex | 2 | 1.00 | 2 | 1.00 | 2 | 1.00 | 2 | 1.00 | 2 | 1.00 |
| Address | 171,088 | 12.89 | 178,583 | 12.92 | 207,909 | 13.04 | 241,966 | 13.21 | 304,353 | 13.66 |
| Suburb | 1962 | 8.33 | 7390 | 8.36 | 19,664 | 8.48 | 31,054 | 8.65 | 49,929 | 9.10 |
| Postcode | 379 | 6.77 | 1755 | 6.80 | 2579 | 6.91 | 2981 | 7.06 | 3395 | 7.45 |
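
The table itself does not define "Discriminating Power". The reported figures are consistent with Shannon entropy measured in bits: Sex, with 2 values, scores 1.00, and Dob Month, with 12 values, scores 3.58, which is approximately log2 12. Assuming that definition (it is an inference from the numbers, not something stated here), the following minimal sketch shows how such a score could be computed for a single field. The function name `entropy_bits` and the toy columns are hypothetical and are not taken from the paper's data or code.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`.

    Assumption: this is what the table calls "Discriminating Power".
    """
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy columns, made up for illustration only (not the paper's data).
sex = ["M", "F"] * 500                 # two equally frequent values
dob_month = list(range(1, 13)) * 100   # twelve equally frequent values

print(round(entropy_bits(sex), 2))        # 1.0
print(round(entropy_bits(dob_month), 2))  # 3.58
```

Under this reading, a field's score cannot exceed log2 of its number of unique values, which would explain why Dob Day (31 values) sits just below log2 31 ≈ 4.95.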