deconvDigitalDLSorter {digitalDLSorteR} | R Documentation |
Deconvolute bulk RNA-Seq samples using a pre-trained DigitalDLSorter model
Description
Deconvolute bulk gene expression samples (bulk RNA-Seq) to enumerate and
quantify the proportion of cell types present in a bulk sample using Deep
Neural Network models. This function is intended for users who want to use
pre-trained models integrated in the package. So far, the available models
allow to deconvolute the immune infiltration of breast cancer (using data from
Chung et al., 2017) and the immune infiltration of colorectal cancer (using
data from Li et al., 2017) samples. For the former, two models are available
at two different levels of specificity: specific cell types
(breast.chung.specific
) and generic cell types
(breast.chung.generic
). See breast.chung.generic
,
breast.chung.specific
, and colorectal.li
documentation from the
digitalDLSorteRdata package for more details.
Usage
deconvDigitalDLSorter(
data,
model = NULL,
normalize = TRUE,
scaling = "standardize",
simplify.set = NULL,
simplify.majority = NULL,
use.generator = FALSE,
batch.size = 64,
verbose = TRUE
)
Arguments
data |
Matrix or data frame with bulk RNA-Seq samples with genes as rows in SYMBOL notation and samples as columns. |
model |
Pre-trained DNN model to use to deconvolute |
normalize |
Normalize data before deconvolution ( |
scaling |
How to scale data before training. It may be:
|
simplify.set |
List specifying which cell types should be compressed into a new label whose name will be the list name item. See examples and vignettes for details. |
simplify.majority |
List specifying which cell types should be compressed
into the cell type with the highest proportion in each sample. Unlike
|
use.generator |
Boolean indicating whether to use generators for
prediction ( |
batch.size |
Number of samples per batch. Only when |
verbose |
Show informative messages during execution. |
Details
This function is intended for users who want to use digitalDLSorteR to
deconvolute their bulk RNA-Seq samples using pre-trained models. For users who
want to build their own models from other scRNA-Seq datasets, see the
createDDLSobject
and deconvDDLSObj
functions.
Value
A data frame with samples (i
) as rows and cell types (j
)
as columns. Each entry represents the predicted proportion of cell type
j
in sample i
.
References
Chung, W., Eum, H. H., Lee, H. O., Lee, K. M., Lee, H. B., Kim, K. T., et al. (2017). Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer. Nat. Commun. 8 (1), 15081. doi: doi:10.1038/ncomms15081.
See Also
Examples
## Not run:
set.seed(123)
sce <- SingleCellExperiment::SingleCellExperiment(
assays = list(
counts = matrix(
rpois(30, lambda = 5), nrow = 15, ncol = 20,
dimnames = list(paste0("Gene", seq(15)), paste0("RHC", seq(20)))
)
),
colData = data.frame(
Cell_ID = paste0("RHC", seq(20)),
Cell_Type = sample(x = paste0("CellType", seq(6)), size = 20,
replace = TRUE)
),
rowData = data.frame(
Gene_ID = paste0("Gene", seq(15))
)
)
DDLS <- createDDLSobject(
sc.data = sce,
sc.cell.ID.column = "Cell_ID",
sc.gene.ID.column = "Gene_ID",
sc.filt.genes.cluster = FALSE,
sc.log.FC = FALSE
)
probMatrixValid <- data.frame(
Cell_Type = paste0("CellType", seq(6)),
from = c(1, 1, 1, 15, 15, 30),
to = c(15, 15, 30, 50, 50, 70)
)
DDLS <- generateBulkCellMatrix(
object = DDLS,
cell.ID.column = "Cell_ID",
cell.type.column = "Cell_Type",
prob.design = probMatrixValid,
num.bulk.samples = 50,
verbose = TRUE
)
# training of DDLS model
tensorflow::tf$compat$v1$disable_eager_execution()
DDLS <- trainDDLSModel(
object = DDLS,
on.the.fly = TRUE,
batch.size = 15,
num.epochs = 5
)
# simulating bulk RNA-Seq data
countsBulk <- matrix(
stats::rpois(100, lambda = sample(seq(4, 10), size = 100, replace = TRUE)),
nrow = 40, ncol = 15,
dimnames = list(paste0("Gene", seq(40)), paste0("Bulk", seq(15)))
)
# this is only an example. See vignettes to see how to use pre-trained models
# from the digitalDLSorteRmodels data package
results1 <- deconvDigitalDLSorter(
data = countsBulk,
model = trained.model(DDLS),
normalize = TRUE
)
# simplify arguments
simplify <- list(CellGroup1 = c("CellType1", "CellType2", "CellType4"),
CellGroup2 = c("CellType3", "CellType5"))
# in this case the names of the list will be the new labels
results2 <- deconvDigitalDLSorter(
countsBulk,
model = trained.model(DDLS),
normalize = TRUE,
simplify.set = simplify
)
# in this case the cell type with the highest proportion will be the new label
results3 <- deconvDigitalDLSorter(
countsBulk,
model = trained.model(DDLS),
normalize = TRUE,
simplify.majority = simplify
)
## End(Not run)