Annotated copy-number regions from the GEO GSE13372 data set.


The GEO GSE13372 data set was generated on the Affymetrix GenomeWideSNP_6 chip type. We have extracted one tumor/normal pair corresponding to the breast cancer cell line HCC1143. For consistency with the other data sets in the package, the tumor and normal samples are labeled according to their tumor cellularity, that is, 100% and 0%, respectively.


A data frame with 205842 observations of 7 variables:


total copy number (not log-scaled)


allelic ratios in the diluted tumor sample (after TumorBoost)


germline genotypes


allelic ratios in the diluted tumor sample (before TumorBoost)


allelic ratios in the matched normal sample


a character value, annotation label for the region. Should be encoded as "(C1,C2)", where C1 denotes the minor copy number and C2 denotes the major copy number. For example,




(0,1): Hemizygous deletion

(0,0): Homozygous deletion

(1,2): Single copy gain

(0,2): Copy-neutral LOH

(2,2): Balanced two-copy gain

(1,3): Unbalanced two-copy gain

(0,3): Single-copy gain with LOH
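
The "(C1,C2)" encoding described above can be split back into its numeric parts when a region needs to be filtered by minor, major, or total copy number. The following is a minimal sketch in base R; the helper name parse_region is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the package): split "(C1,C2)" region
# labels into minor (C1), major (C2) and total (C1 + C2) copy numbers.
parse_region <- function(labels) {
  # Drop the surrounding parentheses, then split on the comma
  inner <- gsub("[()]", "", labels)
  parts <- strsplit(inner, ",", fixed = TRUE)
  minor <- as.integer(vapply(parts, `[`, character(1), 1))
  major <- as.integer(vapply(parts, `[`, character(1), 2))
  data.frame(
    region = labels,
    minor = minor,
    major = major,
    total = minor + major,
    stringsAsFactors = FALSE
  )
}

# Example labels: hemizygous deletion, copy-neutral LOH,
# single copy gain, balanced two-copy gain
parsed <- parse_region(c("(0,1)", "(0,2)", "(1,2)", "(2,2)"))
print(parsed)
```

Regions with a minor copy number of 0 and a positive major copy number, such as "(0,2)", are the ones that show loss of heterozygosity.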


the (germline) genotype of SNPs. By definition, rows with missing genotypes are interpreted as non-polymorphic loci (a.k.a. copy number probes).
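
Because missing genotypes mark non-polymorphic loci, SNPs and copy number probes can be separated with a simple test for NA. The sketch below uses a small made-up data frame; the column names total_cn and genotype and all values are assumptions for illustration, not taken from the data set.

```r
# Toy data (made-up values); column names are assumed for illustration only.
toy <- data.frame(
  total_cn = c(2.1, 1.9, 2.0, 3.1),
  genotype = c(0.5, NA, 1, NA)
)

# Rows with a genotype are SNPs; rows with NA are copy number probes
is_snp <- !is.na(toy$genotype)
n_snp <- sum(is_snp)
n_cn_probe <- sum(!is_snp)

snps <- toy[is_snp, ]
cn_probes <- toy[!is_snp, ]
print(c(snps = n_snp, cn_probes = n_cn_probe))
```

Total copy number is available for every row, whereas allelic ratios are only meaningful for the rows kept in snps.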


A numeric value between 0 and 1, the fraction of tumor cells in the sample.


These data have been processed from the files available from GEO using scripts that are included in the 'inst/preprocessing/GSE13372' directory of this package. This processing includes normalization of the raw CEL files using the CRMAv2 method implemented in the aroma.affymetrix package.



library("acnr")  # package that provides loadCnRegionData()
dat <- loadCnRegionData("GSE13372_HCC1143")

