Table 3 Number of levels in each categorical variable in the feature set large-set

From: Generation and evaluation of synthetic patient data

 

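| Variable | BREAST | LYMYLEUK | RESPIR |
|---|---|---|---|
| AGE_DX | 11 | 12 | 12 |
| BEHO3V | 2 | 1 | 2 |
| CS15SITE | 7 | - | - |
| CS1SITE | 7 | 12 | 100 |
| CS2SITE | 7 | 6 | 8 |
| CS3SITE | 56 | 13 | 10 |
| CS4SITE | 5 | - | 10 |
| CS5SITE | 4 | - | 8 |
| CS6SITE | 8 | - | 9 |
| CS7SITE | 12 | - | - |
| CSEXTEN | 37 | 39 | 91 |
| CSLYMPHN | 36 | 21 | 23 |
| CSMETSDX | 9 | 12 | 31 |
| CSMETSDXBR_PUB | 4 | 4 | 4 |
| CSMETSDXB_PUB | 4 | 4 | 4 |
| CSMETSDXLIV_PUB | 4 | 4 | 4 |
| CSMETSDXLUNG_PUB | 4 | 4 | 4 |
| CSMTEVAL | 8 | 4 | 8 |
| CSRGEVAL | 8 | 6 | 8 |
| CSTSEVAL | 8 | 8 | 8 |
| CSVCURRENT | 5 | 5 | 5 |
| CSVFIRST | 10 | 11 | 10 |
| DX_CONF | 9 | 9 | 8 |
| GRADE | 5 | 5 | 5 |
| HISTO3V | 102 | 100 | 169 |
| LATERAL | 5 | 7 | 6 |
| MAR_STAT | 7 | 7 | 7 |
| NHIADE | 9 | 9 | 9 |
| NO_SURG | 8 | 8 | 8 |
| PRIMSITE | 9 | 257 | 29 |
| RACE1V | 29 | 28 | 30 |
| REC_NO | 7 | 11 | 9 |
| REG | 9 | 9 | 9 |
| REPT_SRC | 8 | 8 | 8 |
| SEQ_NUM | 10 | 14 | 11 |
| SEX | 2 | 2 | 2 |
| SURGSITF | 7 | 7 | 7 |
| TYPE_FU | 2 | 2 | 2 |
| YEAR_DX | 6 | 6 | 6 |
| YR_BRTH | 96 | 111 | 112 |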

  1. A dash (-) indicates that the variable (row) was not considered for that dataset (column) because the variable does not exist for that cancer type.
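
The level counts in Table 3 are the number of distinct values each categorical variable takes in a given dataset. A minimal sketch of that computation follows. It assumes records held as a list of dicts and uses made-up toy values, not the actual patient data. The helper name `count_levels` is hypothetical. Whether a missing-value code counts as its own level is not stated in the source, so here every distinct recorded value is simply counted.

```python
def count_levels(records, variables):
    """Return {variable: number of distinct values}.

    A variable that appears in no record is reported as "-",
    mirroring the dash convention used in the table.
    """
    counts = {}
    for var in variables:
        # Collect the distinct values observed for this variable.
        levels = {rec[var] for rec in records if var in rec}
        counts[var] = len(levels) if levels else "-"
    return counts


# Toy records with invented values (illustrative only).
toy_breast = [
    {"SEX": "1", "GRADE": "2", "CS7SITE": "010"},
    {"SEX": "2", "GRADE": "3", "CS7SITE": "020"},
    {"SEX": "2", "GRADE": "2", "CS7SITE": "010"},
]
toy_respir = [
    {"SEX": "1", "GRADE": "4"},  # CS7SITE absent for this toy dataset
    {"SEX": "2", "GRADE": "4"},
]

variables = ["SEX", "GRADE", "CS7SITE"]
breast_counts = count_levels(toy_breast, variables)
respir_counts = count_levels(toy_respir, variables)

for var in variables:
    print(var, breast_counts[var], respir_counts[var])
```

Applied to one dataset per cancer type, a helper like this yields one column of the table per call.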