
Table 3 Comparison of survival time and diagnosis date matching when comparing a rare real-world individual to 1 million synthetic individuals with matched covariate profiles

From: Generating high-fidelity synthetic time-to-event datasets to improve data transparency and accessibility

 

| Real-World Individual Covariate Patterns | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| No. of Observations | 1 | 3 | 3 | 2 | 1 | 1 |
| Age at Diagnosis | 68 | 73 | 81 | 70 | 60 | 45 |
| Sex | Female | Male | Female | Female | Male | Male |
| Stage at Diagnosis | Localised | Localised | Distant | Regional | Distant | Localised |
| Year of Diagnosis | 1994 | 1994 | 1992 | 1990 | 1993 | 1989 |
| Anatomical Subsite | Other | Sigmoid | Coecum | Coecum | Transverse | Sigmoid |
| Vital Status | Alive | Alive; Alive; Alive | Dead; Dead; Dead | Dead; Alive | Dead | Alive |
| Synthetic Vital Status (Alive/Dead %) | 94.74/5.25 | 89.36/10.64 | 9.92/90.08 | 44.07/55.93 | 17.22/82.78 | 78.44/21.56 |
| Diagnosis Date | 15/9/1994 | 14/1/1994; 15/10/1994; 15/9/1994 | 21/12/1992; 17/3/1992; 15/4/1992 | 20/11/1990; 16/5/1990 | 7/12/1993 | 14/11/1989 |
| Synthetic Observations with Diagnosis Date ± 15 Days (%) | 8.65 | 7.75; 8.32; 8.57 | 7.01; 8.36; 8.68 | 8.56; 8.37 | 8.42 | 8.56 |
| Survival Time (Rounded to Nearest Day) | 472 | 716; 442; 472 | 16; 46; 868 | 351; 2055 | 77 | 2238 |
| Synthetic Observations with Survival Time ± 15 Days (%) | 8.43 | 6.74; 4.10; 8.10 | 11.61; 15.77; 0.45 | 1.61; 3.77 | 7.26 | 7.01 |
| Survival Time Matches per 1 Million Synthetic Patients | 2719 | 2282; 2573; 2633 | 4102; 2187; 156 | 2740; 2670 | 1011 | 2225 |
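
The "± 15 Days (%)" rows report the share of covariate-matched synthetic observations whose diagnosis date or survival time falls close to the real individual's value. A minimal sketch of how such a percentage could be computed is given below. It is not the authors' code: the helper name `pct_within_window`, the inclusive treatment of the 15-day boundary, and the toy values are all assumptions made for illustration.

```python
from datetime import date, timedelta


def pct_within_window(real_value, synthetic_values, half_width):
    """Percentage of synthetic values within +/- half_width of real_value.

    Assumption: the window is inclusive at both ends. Works for any type
    where abs(a - b) <= half_width is defined (numbers, or dates compared
    against a timedelta).
    """
    if not synthetic_values:
        raise ValueError("synthetic_values must not be empty")
    hits = sum(1 for v in synthetic_values if abs(v - real_value) <= half_width)
    return 100.0 * hits / len(synthetic_values)


# Hypothetical toy data, NOT taken from the study.
real_diagnosis = date(1994, 9, 15)
synthetic_diagnoses = [
    date(1994, 9, 1),   # 14 days away: inside the window
    date(1994, 9, 30),  # 15 days away: inside (inclusive boundary)
    date(1994, 10, 1),  # 16 days away: outside
    date(1994, 3, 2),   # months away: outside
]
date_pct = pct_within_window(real_diagnosis, synthetic_diagnoses, timedelta(days=15))

real_survival_days = 472
synthetic_survival_days = [460, 487, 488, 900]
survival_pct = pct_within_window(real_survival_days, synthetic_survival_days, 15)

print(f"Diagnosis date within 15 days: {date_pct:.2f}%")
print(f"Survival time within 15 days: {survival_pct:.2f}%")
```

In practice the synthetic values would first be restricted to observations sharing the real individual's covariate pattern, as the table caption describes.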