
Table 4 Administrative dataset characteristics

From: Estimating parameters for probabilistic linkage of privacy-preserved datasets

 

Datasets: NSW (13,534,177 records), SA (2,509,914 records), WA (6,772,949 records)

| Field | NSW Unique Values | NSW Missing % | NSW Discriminating Power | SA Unique Values | SA Missing % | SA Discriminating Power | WA Unique Values | WA Missing % | WA Discriminating Power |
|---|---|---|---|---|---|---|---|---|---|
| First Name | 168,766 | 2.9% | 8.61 | 124,849 | 5.5% | 9.18 | 78,992 | 0.3% | 8.54 |
| Middle Name | 114,686 | 54.2% | 6.96 | 22,180 | 75.4% | 7.19 | 61,241 | 40.8% | 7.13 |
| Last Name | 291,595 | 0% | 10.92 | 81,431 | 5.3% | 10.81 | 123,481 | 0% | 10.73 |
| DOB Year | 123 | 0% | 6.47 | 115 | 0% | 6.45 | 118 | 0% | 6.39 |
| DOB Month | 12 | 0% | 3.58 | 12 | 0% | 3.58 | 12 | 0% | 3.58 |
| DOB Day | 31 | 0% | 4.94 | 31 | 0% | 4.94 | 31 | 0% | 4.94 |
| Sex | 2 | 0% | 1.00 | 2 | 0% | 1.00 | 2 | 0% | 0.99 |
| Address | 3,084,889 | 1.5% | 16.96 | 690,615 | 8.1% | 14.92 | 1,350,796 | 0.2% | 16.05 |
| Suburb | 49,843 | 0.5% | 9.30 | 10,729 | 6.9% | 7.85 | 5,542 | 0.1% | 7.73 |
| Postcode | 3,947 | 0.8% | 8.17 | 2,238 | 8.5% | 6.90 | 2,319 | 0.2% | 6.58 |
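
The table does not define Discriminating Power. The values are consistent with Shannon entropy measured in bits: 12 equally likely months give log2(12) ≈ 3.58, and two equally likely sexes give 1.00. Treating that reading as an assumption, a minimal sketch of how such a value could be computed for one field follows. The function name `shannon_entropy_bits` and the toy data are hypothetical and are not taken from the source.

```python
import math
from collections import Counter


def shannon_entropy_bits(values):
    """Shannon entropy, in bits, of the non-missing values of a field.

    Assumption: missing values (None or empty string) are excluded before
    the value frequencies are computed. The source does not say how
    missing values were handled.
    """
    present = [v for v in values if v is not None and v != ""]
    if not present:
        return 0.0
    counts = Counter(present)
    total = len(present)
    # H = -sum(p * log2(p)) over the observed distinct values
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy field: 12 equally frequent months of birth.
months = list(range(1, 13)) * 100
print(round(shannon_entropy_bits(months), 2))  # 3.58
```

Under this reading, a field with many roughly equally frequent values (such as Address) scores high, while a field with few values (such as Sex) scores low regardless of how many records the dataset holds.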