Golub {mpm} | R Documentation
Golub (1999) Data
Description
Golub et al. (1999) data on gene expression profiles of 38 patients suffering from acute leukemia and a validation sample of 34 patients.
Format
The expression data are available in the data frame Golub, with 5327 observations on the following 73 variables.
- Gene
a character vector with gene identifiers
- 1
gene expression data for sample 1
- 2
gene expression data for sample 2
- 3
gene expression data for sample 3
- 4
gene expression data for sample 4
- 5
gene expression data for sample 5
- 6
gene expression data for sample 6
- 7
gene expression data for sample 7
- 8
gene expression data for sample 8
- 9
gene expression data for sample 9
- 10
gene expression data for sample 10
- 11
gene expression data for sample 11
- 12
gene expression data for sample 12
- 13
gene expression data for sample 13
- 14
gene expression data for sample 14
- 15
gene expression data for sample 15
- 16
gene expression data for sample 16
- 17
gene expression data for sample 17
- 18
gene expression data for sample 18
- 19
gene expression data for sample 19
- 20
gene expression data for sample 20
- 21
gene expression data for sample 21
- 22
gene expression data for sample 22
- 23
gene expression data for sample 23
- 24
gene expression data for sample 24
- 25
gene expression data for sample 25
- 26
gene expression data for sample 26
- 27
gene expression data for sample 27
- 34
gene expression data for sample 34
- 35
gene expression data for sample 35
- 36
gene expression data for sample 36
- 37
gene expression data for sample 37
- 38
gene expression data for sample 38
- 28
gene expression data for sample 28
- 29
gene expression data for sample 29
- 30
gene expression data for sample 30
- 31
gene expression data for sample 31
- 32
gene expression data for sample 32
- 33
gene expression data for sample 33
- 39
gene expression data for sample 39
- 40
gene expression data for sample 40
- 42
gene expression data for sample 42
- 47
gene expression data for sample 47
- 48
gene expression data for sample 48
- 49
gene expression data for sample 49
- 41
gene expression data for sample 41
- 43
gene expression data for sample 43
- 44
gene expression data for sample 44
- 45
gene expression data for sample 45
- 46
gene expression data for sample 46
- 70
gene expression data for sample 70
- 71
gene expression data for sample 71
- 72
gene expression data for sample 72
- 68
gene expression data for sample 68
- 69
gene expression data for sample 69
- 67
gene expression data for sample 67
- 55
gene expression data for sample 55
- 56
gene expression data for sample 56
- 59
gene expression data for sample 59
- 52
gene expression data for sample 52
- 53
gene expression data for sample 53
- 51
gene expression data for sample 51
- 50
gene expression data for sample 50
- 54
gene expression data for sample 54
- 57
gene expression data for sample 57
- 58
gene expression data for sample 58
- 60
gene expression data for sample 60
- 61
gene expression data for sample 61
- 65
gene expression data for sample 65
- 66
gene expression data for sample 66
- 63
gene expression data for sample 63
- 64
gene expression data for sample 64
- 62
gene expression data for sample 62
The classes are in a separate numeric vector Golub.grp, with values 1 for the 38 ALL B-Cell samples, 2 for the 9 ALL T-Cell samples and 3 for the 25 AML samples.
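The class coding described above can be turned into a labelled factor. The following is a minimal sketch, not taken from the package: grp is a hypothetical stand-in that reproduces only the coding and the class counts, in an arbitrary order, so it does not reflect the actual sample ordering of Golub.grp.

```r
# Hypothetical stand-in for Golub.grp: same codes (1, 2, 3) and class sizes
# as documented, but the ordering of the samples here is arbitrary.
grp <- rep(c(1, 2, 3), times = c(38, 9, 25))

# Attach readable labels to the numeric codes.
grp_f <- factor(grp, levels = 1:3,
                labels = c("ALL B-Cell", "ALL T-Cell", "AML"))

# Tabulate the number of samples per class.
counts <- table(grp_f)
print(counts)
```

With the mpm package installed, the real objects would presumably be loaded with data(Golub) and data(Golub.grp), after which Golub.grp could be used in place of grp.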
Details
The original data of Golub et al. (1999) were preprocessed as follows: genes that were called 'absent' in all samples were removed from the data sets, since these measurements are considered unreliable by the manufacturer of the technology. Negative measurements in the data were set to 1.
The resulting data frame contains 5327 genes of the 6817 originally reported by Golub et al. (1999).
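The two preprocessing steps can be illustrated on a toy example. This is a sketch under stated assumptions rather than the code actually used to prepare the data: the matrices expr and calls, the gene and sample names, and the "P"/"A" call labels are all hypothetical, and the order of the two steps is an assumption.

```r
# Toy expression values (rows = genes, columns = samples); not the real data.
expr <- matrix(c(120, -15, 300,
                 -20,  -5,  10,
                  55,  80, -40),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3")))

# Hypothetical detection calls: "P" = present, "A" = absent.
calls <- matrix(c("P", "A", "P",
                  "A", "A", "A",
                  "A", "P", "A"),
                nrow = 3, byrow = TRUE,
                dimnames = dimnames(expr))

# Step 1: drop genes that are called absent in every sample (g2 here).
keep <- rowSums(calls != "A") > 0
filtered <- expr[keep, , drop = FALSE]

# Step 2: set negative measurements to 1.
filtered[filtered < 0] <- 1

print(filtered)
```

Only the gene that is absent in all samples is removed. Genes with at least one non-absent call are kept, and their negative values are replaced by 1.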
Note
Luc Wouters et al. (2003), p. 1134 contains a typo concerning the sample sizes of AML- and ALL-type and erroneously reported
Source
Golub, T. R., Slonim, D. K., Tamayo, P., et al. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science, 286, 531-537.
References
Luc Wouters et al. (2003). Graphical Exploration of Gene Expression Data: A Comparative Study of Three Multivariate Methods, Biometrics, 59, 1131-1139.