Table 1 Overview of the datasets used in the studies on normalization and PCA. The following information is given: accession number, number of observations, number of variables, proportion of observations in the smaller class, data type

From: A measure of the impact of CV incompleteness on prediction error estimation with application to PCA and normalization

| Study | Label/acc. number | Num. of observ. | Num. of variables | Prop. smaller class | Data type | ID |
|---|---|---|---|---|---|---|
| Normalization | E-GEOD-10320 | 100 | 22283 | 0.42 | transcription | 1 |
| Normalization | E-GEOD-47552 | 74 | 32321 | 0.45 | transcription | 2 |
| Normalization | E-GEOD-25639 | 57 | 54675 | 0.46 | transcription | 3 |
| Normalization | E-GEOD-29044 | 54 | 54675 | 0.41 | transcription | 4 |
| Normalization | E-MTAB-57 | 47 | 22283 | 0.47 | transcription | 5 |
| Normalization | E-GEOD-19722 | 46 | 54675 | 0.39 | transcription | 6 |
| Normalization | E-MEXP-3756 | 40 | 54675 | 0.50 | transcription | 7 |
| Normalization | E-GEOD-34465 | 26 | 32321 | 0.35 | transcription | 8 |
| Normalization | E-GEOD-30174 | 20 | 54675 | 0.50 | transcription | 9 |
| Normalization | E-GEOD-39683 | 20 | 32321 | 0.40 | transcription | 10 |
| Normalization | E-GEOD-40744 | 20 | 20706 | 0.50 | transcription | 11 |
| Normalization | E-GEOD-46053 | 20 | 54675 | 0.40 | transcription | 12 |
| PCA | E-GEOD-37582 | 121 | 48766 | 0.39 | transcription | 13 |
| PCA | ProstatecTranscr | 102 | 12625 | 0.49 | transcription | 14 |
| PCA | GSE20189 | 100 | 22277 | 0.49 | transcription | 15 |
| PCA | E-GEOD-57285 | 77 | 27578 | 0.45 | DNA methyl. | 16 |
| PCA | E-GEOD-48153 | 71 | 23232 | 0.48 | proteomic | 17 |
| PCA | E-GEOD-42826 | 68 | 47323 | 0.24 | transcription | 18 |
| PCA | E-GEOD-31629 | 62 | 13737 | 0.35 | transcription | 19 |
| PCA | E-GEOD-33615 | 60 | 45015 | 0.35 | transcription | 20 |
| PCA | E-GEOD-39046 | 57 | 392 | 0.47 | transcription | 21 |
| PCA | E-GEOD-32393 | 56 | 27578 | 0.41 | DNA methyl. | 22 |
| PCA | E-GEOD-42830 | 55 | 47323 | 0.31 | transcription | 23 |
| PCA | E-GEOD-39345 | 52 | 22184 | 0.38 | transcription | 24 |
| PCA | GSE33205 | 50 | 22011 | 0.50 | transcription | 25 |
| PCA | E-GEOD-36769 | 50 | 54675 | 0.28 | transcription | 26 |
| PCA | E-GEOD-43329 | 48 | 887 | 0.40 | transcription | 27 |
| PCA | E-GEOD-42042 | 47 | 27578 | 0.49 | DNA methyl. | 28 |
| PCA | E-GEOD-25609 | 41 | 1145 | 0.49 | transcription | 29 |
| PCA | GSE37356 | 36 | 47231 | 0.44 | transcription | 30 |
| PCA | E-GEOD-49641 | 36 | 33297 | 0.50 | transcription | 31 |
| PCA | E-GEOD-37965 | 30 | 485563 | 0.50 | DNA methyl. | 32 |
  1. ArrayExpress accession numbers have the prefix E-GEOD-; NCBI GEO accession numbers have the prefix GSE.
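
The prefix rule in the footnote can be applied mechanically to the label column. The following is a minimal sketch that assumes only what the footnote states; the function name `repository_for` and the fallback value "unspecified" are hypothetical choices for illustration, and labels the footnote does not cover (such as E-MTAB-57 or ProstatecTranscr) are deliberately left unclassified rather than guessed.

```python
def repository_for(label: str) -> str:
    """Map a dataset label to a repository using only the footnote's prefix rule."""
    if label.startswith("E-GEOD-"):
        return "ArrayExpress"
    if label.startswith("GSE"):
        return "NCBI GEO"
    # Assumption: any label outside the footnote's two prefixes is not classified.
    return "unspecified"


# A few labels taken from the table above.
labels = ["E-GEOD-10320", "GSE20189", "E-MTAB-57", "ProstatecTranscr"]
for label in labels:
    print(label, "->", repository_for(label))
```

Returning an explicit fallback instead of raising keeps the sketch usable on the full label column, where not every entry follows one of the two prefixes named in the footnote.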