
Table 3 Extract recommended PBS links and steps undertaken to check cross-jurisdictional linkage

From: Data cleaning and management protocols for linked perinatal research data: a good practice example from the Smoking MUMS (Maternal Use of Medications and Safety) Study

Step Data sets

Explanation (a)

Findings

 

Extract recommended PBS links

 

PBS PATID- mumPPN mapping tables

Extract the PBS links from the mapping tables where weight of the match is equal to or greater than the recommended threshold (≥29 for NSW and ≥28 for WA).

• 512,887 (50.9%) PBS links to NSW mumPPNs extracted.

• 113,085 (49.9%) PBS links to WA mumPPNs extracted.
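The threshold rule for this step can be sketched in Python as follows. This is a minimal illustration rather than the study's own code: the record layout (`patid`, `mum_ppn`, `state`, `weight`) and the function name `extract_recommended_links` are assumptions made for the example. Only the thresholds (≥29 for NSW, ≥28 for WA) come from the table.

```python
# Recommended weight thresholds from the table; everything else is hypothetical.
THRESHOLDS = {"NSW": 29, "WA": 28}

def extract_recommended_links(links):
    """Keep links whose match weight meets the State's recommended threshold."""
    return [ln for ln in links if ln["weight"] >= THRESHOLDS[ln["state"]]]

# Toy mapping-table rows (invented values, not study data).
links = [
    {"patid": "P1", "mum_ppn": "N100", "state": "NSW", "weight": 31.5},
    {"patid": "P2", "mum_ppn": "N101", "state": "NSW", "weight": 28.0},  # below 29
    {"patid": "P3", "mum_ppn": "W200", "state": "WA", "weight": 28.0},   # meets 28
]
kept = extract_recommended_links(links)
```

Keeping the thresholds in one dictionary keyed by State makes the jurisdiction-specific cut-offs explicit and easy to audit.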

17

Detection of clusters of mumPPNs

 

PBS mapping tables

17.1 Among the extracted PBS links, identify cases where a PATID matches two or more mumPPNs. Send these cases to the AIHW data linkage unit for review and obtain advice on the reliability of the matches.

17.2 The advice was that in some cases all matches are correct. Treat these cases as clusters of mumPPNs.

17.3 For the remaining cases, advice was given on how to select the reliable matches and reject the others.

• 9190 cases (20,254 matches) detected and reviewed. Of those, there were 3404 mumPPN clusters (including 6819 matches to 6815 mumPPNs).

• 7649 PBS links were disregarded.
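The detection part of Step 17.1 amounts to grouping links by PATID and keeping the PATIDs with two or more distinct mumPPNs. A minimal Python sketch follows, assuming links are available as `(patid, mum_ppn)` pairs; the function name and the toy identifiers are hypothetical. The manual review by the linkage unit (Steps 17.2 and 17.3) is not something code can replace and is not modelled here.

```python
from collections import defaultdict

def find_multi_match_patids(links):
    """Return {patid: sorted mumPPNs} for PATIDs linked to two or more mumPPNs."""
    by_patid = defaultdict(set)
    for patid, mum_ppn in links:
        by_patid[patid].add(mum_ppn)  # a set ignores duplicate link rows
    return {p: sorted(m) for p, m in by_patid.items() if len(m) >= 2}

# Toy links (invented): P1 matches two mumPPNs; P3 has a duplicated row only.
links = [("P1", "M1"), ("P1", "M2"), ("P2", "M3"), ("P3", "M4"), ("P3", "M4")]
cases = find_multi_match_patids(links)
```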

18

Date of pharmaceutical dispensing later than date of death

 

PBS mapping tables,

PBS claims,

Death

18.1 For any women who have a death record, identify the matched PATIDs (remaining at completion of Step 17);

18.2 Extract PBS claim records for those PATIDs and compare the dates of pharmaceutical supply with the date of death. Disregard the PBS link if any date of supply is later than the date of death.

• 79 PBS links were disregarded.
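The date comparison in Step 18 can be sketched as below. The sketch assumes three simple inputs whose shapes are invented for illustration: links as `(patid, mum_ppn)` pairs, a dictionary of death dates keyed by mumPPN, and claims as `(patid, supply_date)` pairs. A link is flagged when the latest supply date for its PATID falls after the woman's date of death.

```python
from datetime import date

def links_to_disregard(links, death_dates, claims):
    """Flag links whose PATID has a supply date later than the woman's date of death."""
    # Latest supply date per PATID is enough to test "any supply after death".
    last_supply = {}
    for patid, supplied in claims:
        if patid not in last_supply or supplied > last_supply[patid]:
            last_supply[patid] = supplied

    flagged = []
    for patid, mum_ppn in links:
        died = death_dates.get(mum_ppn)
        latest = last_supply.get(patid)
        if died is not None and latest is not None and latest > died:
            flagged.append((patid, mum_ppn))
    return flagged

# Toy data (invented): P1 has a supply after M1's death; P2 does not.
links = [("P1", "M1"), ("P2", "M2"), ("P3", "M3")]
death_dates = {"M1": date(2010, 5, 1), "M2": date(2011, 1, 15)}
claims = [("P1", date(2010, 4, 1)), ("P1", date(2010, 6, 1)),
          ("P2", date(2011, 1, 15)), ("P3", date(2012, 3, 3))]
flagged = links_to_disregard(links, death_dates, claims)
```

A supply on the date of death itself is kept, matching the strict "later than" comparison in the step.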

19

Consistency in clusters of mumPPNs

 

PBS mapping tables,

Perinatal,

Hospital,

ED,

Death

For 3404 clusters (identified in Step 17):

 19.1 Create the variable ClusterID and assign it to all mumPPNs in each cluster.

 19.2 Extract perinatal, hospital, ED and death records for these clusters. If one of the mumPPNs in the cluster has a death record or “exclusion” flag (results of Steps 1–16), assign the date of death and “exclusion” flag to the ClusterID.

 19.3 Apply Steps 6, 8, 11 and 15 (outlined in Table 2) to assess consistency within each ClusterID;

 19.4 Reject the cluster if it has an “exclusion” flag or data inconsistencies. Also mark all mumPPNs in the rejected cluster as “exclusion” and disregard the associated PBS links.

• 81 clusters were rejected, due to either

 - “exclusion” flag (n = 13);

 - negative pregnancy interval (n = 44);

 - different years of birth (n = 21);

 - inconsistent parity (n = 48);

 - service use after date of death (n = 0).

• 149 mumPPNs further marked as “exclusion”;

• 230 PBS links were further disregarded.

• 3323 clusters were accepted.

20

Consistency in women who had records from both States

 

PBS mapping tables,

Perinatal,

Hospital,

ED,

Death

20.8 Among the remaining PBS links (at completion of Step 19), identify PATIDs which concurrently match to NSW mumPPNs and WA mumPPNs;

20.9 Create variable CrossID and assign to all mumPPNs in the pairs.

20.10 Extract perinatal, hospital, ED and death records for these CrossIDs. If a mumPPN in the pair has a death record or an “exclusion” flag, assign the date of death and “exclusion” flag to the CrossID;

20.11 Apply Steps 6, 8, 11 and 15 (outlined in Table 2) to assess consistency within each CrossID.

20.12 Reject the CrossID if it has an “exclusion” flag or data inconsistencies. Disregard the match to the mumPPN in one State and accept the match to the mumPPN in the other State. The match to disregard is chosen by the weight of the match (the lower weight is disregarded) or by the State in which the “exclusion” flag arose (e.g. if the exclusion flag arose from NSW data cleaning, disregard the match to the NSW mumPPN).

• 2855 PATIDs concurrently match to mumPPNs in both NSW and WA;

• 2645 CrossIDs were created, taking into account the network among PATIDs and mumPPNs (b);

• 659 CrossIDs were rejected, due to either

 - “exclusion” flag (n = 12);

 - negative pregnancy interval (n = 327);

 - different years of birth (n = 444);

 - inconsistent parity (n = 273);

 - service use after date of death (n = 0);

• 802 PBS links were further disregarded;

• 1986 CrossIDs accepted as women having births in both States.
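The table does not say how the network among PATIDs and mumPPNs was resolved into CrossIDs. One plausible reading, sketched here as an assumption, is to treat PATIDs and mumPPNs as nodes of a graph, treat each PBS link as an edge, and give one CrossID to every connected component that contains mumPPNs from both States. The function name, the input shape `(patid, state, mum_ppn)`, and the integer CrossID values are all hypothetical. The toy data reproduce the example in footnote b.

```python
def assign_cross_ids(links):
    """links: iterable of (patid, state, mum_ppn). Returns {(state, mum_ppn): cross_id}."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    links = list(links)
    for patid, state, mum_ppn in links:
        union(("PATID", patid), (state, mum_ppn))  # each link is an edge

    # Collect the mumPPN nodes belonging to each connected component.
    components = {}
    for _, state, mum_ppn in links:
        node = (state, mum_ppn)
        components.setdefault(find(node), set()).add(node)

    # Only components spanning both States receive a CrossID.
    cross_ids, next_id = {}, 1
    for members in components.values():
        if {"NSW", "WA"} <= {state for state, _ in members}:
            for node in sorted(members):
                cross_ids[node] = next_id
            next_id += 1
    return cross_ids

# Footnote b: one NSW mumPPN, two PATIDs, three WA mumPPNs; N9 is NSW-only.
links = [("P1", "NSW", "N1"), ("P2", "NSW", "N1"), ("P1", "WA", "W1"),
         ("P2", "WA", "W2"), ("P2", "WA", "W3"), ("P9", "NSW", "N9")]
cross = assign_cross_ids(links)
```

Under this reading, grouping by component rather than by PATID would explain why 2855 cross-State PATIDs yield fewer (2645) CrossIDs.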

21

Integration of mothers’ records

 

All datasets relating to mothers

Integration of records is required for mumPPN clusters and women having records in both States.

 21.2 Generate a list of unique mumPPNs (as per State linkages) together with the “exclusion” flag (results of Steps 1–16 and 19). Merge in the accepted ClusterIDs and CrossIDs (Steps 19 and 20).

 21.3 Create variable finalPPNmum by collapsing 3 variables according to the following hierarchy: accepted CrossID, accepted ClusterID and mumPPN;

 21.4 Merge the variables finalPPNmum and “exclusion” flag into perinatal, hospital, ED, death data sets, and the mapping tables (at the completion of Step 20).

• As per State linkages, there were 783,471 mumPPNs.

• Based on the variable finalPPNmum, there were 778,154 women (including those flagged as “exclusion”).
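The hierarchy used to build finalPPNmum in Step 21 is a simple coalesce: take the accepted CrossID if there is one, otherwise the accepted ClusterID, otherwise the mumPPN itself. A minimal Python sketch, with a hypothetical function name and invented identifiers:

```python
def final_ppn_mum(mum_ppn, cluster_id=None, cross_id=None):
    """Collapse three identifiers: accepted CrossID, then accepted ClusterID, then mumPPN."""
    if cross_id is not None:
        return cross_id
    if cluster_id is not None:
        return cluster_id
    return mum_ppn

# Toy rows (invented): (mumPPN, accepted ClusterID, accepted CrossID).
rows = [
    ("N1", None, "X1"),   # births in both States
    ("W1", None, "X1"),   # same woman, WA identifier
    ("N2", "C1", None),   # member of an accepted cluster
    ("N3", "C1", None),   # same cluster
    ("N4", None, None),   # ordinary single mumPPN
]
final_ids = [final_ppn_mum(m, c, x) for m, c, x in rows]
n_women = len(set(final_ids))
```

Counting distinct values of the collapsed identifier is what reduces the number of mumPPNs to the number of women.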

22

Consistency in finalPPNmums with multiple PATIDs and finalise mother cohort

 

PBS mapping table,

PBS claims,

Perinatal

22.1 Combine the NSW and WA mapping tables (at completion of Step 21). Remove PBS links relating to “exclusion” women. Among the remaining links, identify finalPPNmums that match to ≥2 PATIDs. For those women:

 22.1.1 Examine the consistency of month and year of birth recorded in PBS claim records

 22.1.2 Examine the consistency of parity in perinatal data (Step 8).

22.2 Flag the women as “exclusion” if month and/or year of birth is inconsistent, or sequence of parity values is illogical. Further remove related PBS links.

22.3 Extract PBS claims records for the final cohort of mothers.

• 4601 finalPPNmums with multiple PATIDs. Of those, 2763 were further flagged as “exclusion”

• 7378 PBS links related to “exclusion” women were removed.

• The final cohort included 774,449 women (excluding 3705 women flagged as “exclusion”);

• The final mapping table included 609,834 links;

• 14,212,785 claims records were extracted.
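The birth-date check in Step 22.1.1 can be sketched as follows, assuming links are available as `(final_ppn_mum, patid)` pairs and that each PATID's month and year of birth have already been summarised from the claims as a `(year, month)` tuple. Both input shapes and the function name are assumptions for illustration. The parity check in Step 22.1.2 relies on rules from Table 2 that are not reproduced here, so it is left out.

```python
from collections import defaultdict

def flag_inconsistent_birth(links, birth_by_patid):
    """Flag women with >=2 PATIDs whose PATIDs disagree on (year, month) of birth."""
    patids = defaultdict(set)
    for woman, patid in links:
        patids[woman].add(patid)

    flagged = set()
    for woman, ids in patids.items():
        if len(ids) < 2:
            continue  # only women with multiple PATIDs are examined
        births = {birth_by_patid[p] for p in ids if p in birth_by_patid}
        if len(births) > 1:
            flagged.add(woman)
    return flagged

# Toy data (invented): A's PATIDs agree, B's disagree, C has a single PATID.
links = [("A", "P1"), ("A", "P2"), ("B", "P3"), ("B", "P4"), ("C", "P5")]
birth_by_patid = {"P1": (1980, 3), "P2": (1980, 3),
                  "P3": (1975, 7), "P4": (1979, 7), "P5": (1990, 1)}
flagged = flag_inconsistent_birth(links, birth_by_patid)
```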

  a. In this study, adding a variable to a data set is referred to as “merge”, while adding records is referred to as “combine”.
  b. For example, an NSW mumPPN matches to 2 PATIDs, and these 2 PATIDs match to three different WA mumPPNs.