convert_biodata {tcgaViz} | R Documentation |
Format biological data
Description
Merges gene and cell datasets that share the same TCGA sample identifiers, splits the samples into two categories according to the expression level of a selected gene (below or above average by default), and formats the result into a 3-column data frame: gene expression levels, cell types, and cell type abundance estimates.
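The three steps described above (merge, split, reshape) can be sketched in base R on toy data. This is only an illustration under stated assumptions, not the package's implementation: the identifier column names (sample, study), the cell type names, and all values are made up, and the flag column is called above_mean here rather than high to keep the sketch separate from the function's actual output.

```r
# Toy inputs: two identifier columns followed by numeric columns.
# All column names and values are hypothetical.
genes <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  study  = "TOY",
  ICOS   = c(1.0, 2.0, 3.0, 6.0)
)
cells <- data.frame(
  sample  = c("S1", "S2", "S3", "S4"),
  study   = "TOY",
  T_cells = c(0.10, 0.20, 0.30, 0.40),
  B_cells = c(0.40, 0.30, 0.20, 0.10)
)

# 1. Merge the selected gene with the cell data on the shared identifiers.
merged <- merge(
  genes[, c("sample", "study", "ICOS")],
  cells,
  by = c("sample", "study")
)

# 2. Split the samples at the mean expression of the selected gene.
above_mean <- merged$ICOS > mean(merged$ICOS)

# 3. Reshape to a long, 3-column data frame: one row per sample and cell type.
cell_types <- setdiff(colnames(cells), c("sample", "study"))
long <- data.frame(
  above_mean = rep(above_mean, times = length(cell_types)),
  cells      = factor(rep(cell_types, each = nrow(merged))),
  value      = unlist(merged[cell_types], use.names = FALSE)
)

long
```

Replacing mean() with median() in step 2 would give the analogous split for a median threshold.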
Usage
convert_biodata(
genes,
cells,
select = colnames(genes)[3],
stat = "mean",
disease = NULL,
tissue = NULL
)
Arguments
genes |
data frame whose first two columns contain identifiers and the others float values. |
cells |
data frame whose first two columns contain identifiers and the others float values. |
select |
character for a column name in genes. |
stat |
character for the statistic to be chosen among "mean", "median" or "quantile". |
disease |
character for the type of TCGA cancer (see the list in extdata/disease_names.csv). |
tissue |
character for the type of TCGA tissue, one of: 'Additional - New Primary', 'Additional Metastatic', 'Metastatic', 'Primary Blood Derived Cancer - Peripheral Blood', 'Primary Tumor', 'Recurrent Tumor', 'Solid Tissue Normal'. |
Details
The disease and tissue arguments should be displayed in the title of plot.biodata() only if the genes argument does not already have them in its attributes.
Value
data frame with the following columns:
- high (logical): the expression levels of a selected gene, TRUE for below or FALSE for above average.
- cells (factor): cell types.
- value (float): the abundance estimation of the cell types.
Examples
data(tcga)
(df_formatted <- convert_biodata(tcga$genes, tcga$cells$Cibersort, "ICOS"))
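The optional arguments can also be passed by name. The following is a hedged variant of the example above that uses only values listed under Arguments (stat = "median", tissue = "Primary Tumor"). The requireNamespace() guard and the df_median name are illustrative additions, and the result is left as NULL when tcgaViz is not installed.

```r
# Variant of the example above with named, documented arguments.
# Runs the call only if the tcgaViz package is available.
df_median <- NULL
if (requireNamespace("tcgaViz", quietly = TRUE)) {
  data(tcga, package = "tcgaViz")
  df_median <- tcgaViz::convert_biodata(
    genes  = tcga$genes,
    cells  = tcga$cells$Cibersort,
    select = "ICOS",
    stat   = "median",          # split at the median instead of the mean
    tissue = "Primary Tumor"    # one of the tissue types listed under Arguments
  )
  head(df_median)
}
```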