Table 1 Acute myeloid leukemia: REMARK-like profile of the analysis performed on the dataset

From: Added predictive value of omics data: specific issues related to validation illustrated by two case studies

a) Patients, treatment and variables

| Study and marker | Remarks |
| --- | --- |
| Marker | OS = 86-probe-set gene-expression signature |
| Further variables | v1 = age, v2 = sex, v3 = NPM1, v4 = FLT3-ITD |
| Reference | Metzeler et al. (2008) |
| Source of the data | GEO (reference: GSE12417) |

| Set | Patients | n | Remarks |
| --- | --- | --- | --- |
| Training set | Assessed for eligibility | 163 | Disease: acute myeloid leukemia |
| | | | Patient source: German AML Cooperative Group 1999-2003 |
| | Excluded | 0 | |
| | Included | 163 | Treatment: following AMLCG-1999 trial |
| | | | Gene expression profiling: Affymetrix HG-U133 A&B microarrays |
| | With outcome events | 105 | Overall survival: death from any cause |
| Validation set | Assessed for eligibility | 79 | Disease: acute myeloid leukemia |
| | | | Patient source: German AML Cooperative Group 2004 |
| | Excluded | 0 | |
| | Included | 79 | Treatment: 62 following AMLCG-1999 trial; 17 intensive chemotherapy outside the study |
| | | | Gene expression profiling: Affymetrix HG-U133 plus 2.0 microarrays |
| | With outcome events | 33 | Overall survival: death from any cause |

Relevant differences between training and validation sets

| Aspect | Remarks |
| --- | --- |
| Data source | Same research group, different time (see above) |
| Follow-up time | Much shorter in the validation set (see text) |
| Survival rate | Higher in the validation set (see Figure 2) |

b) Statistical analyses of survival outcomes

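A: preliminary analysis (separately on training and validation sets)

| Analysis | n | e | Variables considered | Results/remarks |
| --- | --- | --- | --- | --- |
| A1: univariate | 163 (training), 79 (validation) | 105 (training), 33 (validation) | v1 to v4 | Kaplan-Meier curves (Figure 1) |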

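B: evaluating clinical model and combined model on validation data (models fitted on training set, evaluated on validation set)

| Analysis | n | e | Variables considered | Results/remarks |
| --- | --- | --- | --- | --- |
| B1: overall prediction | 163 (training), 79 (validation) | 105 (training), 33 (validation) | OS, v1 to v4 | Prediction error curves (Figure 5); Integrated Brier score (text) |
| B2: discriminative ability | 163 (training), 79 (validation) | 105 (training), 33 (validation) | OS, v1 to v4 | Comparison of Kaplan-Meier curves for risk groups, using medians as cutpoints (Figure 6) and K-means clustering (data not shown, see text); C-index (text); K-statistic (text) |
| B3: calibration | 163 (training), 79 (validation) | 105 (training), 33 (validation) | OS, v1 to v4 | Kaplan-Meier curve vs average individual survival curves for risk groups (Figure 7); Calibration slope (text) |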

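C: Multivariate testing of the omics score in the validation data (only validation set involved)

| Analysis | n | e | Variables considered | Results/remarks |
| --- | --- | --- | --- | --- |
| C1: significance | 79 | 33 | OS, v1 to v4 | Multivariate Cox model (Table 3) |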

D: Comparison of the predictive accuracy of clinical and combined models through cross-validation in the validation data (only validation set involved)

| Analysis | n | e | Variables considered | Results/remarks |
| --- | --- | --- | --- | --- |
| D1: overall prediction | 79 | 33 | OS, v1 to v4 | Prediction error curves based on repeated cross-validation (Figure 8); Prediction error curves based on repeated subsampling (data not shown, see text); Prediction error curves based on repeated bootstrap resampling (data not shown, see text); Integrated Brier score based on cross-validation (text) |

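E: Subgroup analysis (E1-E3 based on training and validation sets, E4 and E5 only on validation set; for all, separate analysis for female and male population)

| Subgroup | Set | n | e |
| --- | --- | --- | --- |
| Female | Training (t.) | 88 | 54 |
| Female | Validation (v.) | 46 | 16 |
| Male | Training (t.) | 74 | 51 |
| Male | Validation (v.) | 33 | 17 |

| Analysis | Variables considered | Results/remarks |
| --- | --- | --- |
| E1: overall prediction | OS, v1 to v4 | Prediction error curves (Figure 9) |
| E2: discriminative ability | OS, v1 to v4 | C-index (text); K-statistic (text) |
| E3: calibration | OS, v1 to v4 | Calibration slope (text) |
| E4: significance | OS, v1 to v4 | Multivariate Cox model (text) |
| E5: overall prediction | OS, v1 to v4 | Prediction error curves based on cross-validation (Figure 10) |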
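
The patient and event counts reported in this table can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis. It assumes that n is the number of patients, that e is the number of outcome events, and that "t." and "v." stand for the training and validation sets. The names `full_sets`, `subgroups` and `event_fraction` are hypothetical.

```python
# Counts transcribed from Table 1: n = patients, e = outcome events (deaths).
# Reading "t." as training set and "v." as validation set is an assumption.
full_sets = {"training": (163, 105), "validation": (79, 33)}
subgroups = {
    "training": {"female": (88, 54), "male": (74, 51)},
    "validation": {"female": (46, 16), "male": (33, 17)},
}


def event_fraction(n, e):
    """Crude share of patients with an observed event; ignores censoring."""
    return e / n


fractions = {name: event_fraction(n, e) for name, (n, e) in full_sets.items()}

# Sum the sex-specific counts so they can be compared with the full-set counts.
subgroup_totals = {
    name: (sum(n for n, _ in groups.values()), sum(e for _, e in groups.values()))
    for name, groups in subgroups.items()
}

for name, (n, e) in full_sets.items():
    print(
        f"{name}: n={n}, e={e}, event fraction={fractions[name]:.3f}, "
        f"subgroup totals (n, e)={subgroup_totals[name]}"
    )
```

Under these assumptions, the sex-specific counts reproduce the validation totals (79 patients, 33 events) and the training event total (105), but the training subgroups add up to 162 patients rather than 163, a difference the table does not explain. The crude event fractions are not survival rates: the table itself notes that follow-up is much shorter in the validation set, so the two values are not directly comparable.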