trainDeconvModel {SpatialDDLS} | R Documentation |
Train deconvolution model for spatial transcriptomics data
Description
Train a deep neural network model using training data from the SpatialDDLS
object. This model will be used to deconvolute spatial transcriptomics data
from the same biological context as the single-cell RNA-seq data used to
train it. In addition, the trained model is evaluated using test data, and
prediction results are obtained to determine its performance (see
?calculateEvalMetrics).
Usage
trainDeconvModel(
object,
type.data.train = "mixed",
type.data.test = "mixed",
batch.size = 64,
num.epochs = 60,
num.hidden.layers = 2,
num.units = c(200, 200),
activation.fun = "relu",
dropout.rate = 0.25,
loss = "kullback_leibler_divergence",
metrics = c("accuracy", "mean_absolute_error", "categorical_accuracy"),
normalize = TRUE,
scaling = "standardize",
norm.batch.layers = TRUE,
custom.model = NULL,
shuffle = TRUE,
sc.downsampling = NULL,
use.generator = FALSE,
on.the.fly = FALSE,
agg.function = "AddRawCount",
threads = 1,
view.metrics.plot = TRUE,
verbose = TRUE
)
Arguments
object |
|
type.data.train |
Type of profiles to be used for training. It can be
|
type.data.test |
Type of profiles to be used for evaluation. It can be
|
batch.size |
Number of samples per gradient update (64 by default). |
num.epochs |
Number of epochs to train the model (60 by default). |
num.hidden.layers |
Number of hidden layers of the neural network (2 by
default). This number must be equal to the length of num.units. |
num.units |
Vector indicating the number of neurons per hidden layer
(c(200, 200) by default). |
activation.fun |
Activation function ("relu" by default). |
dropout.rate |
Float between 0 and 1 indicating the fraction of input neurons to be dropped in dropout layers (0.25 by default). By default, SpatialDDLS implements 1 dropout layer per hidden layer. |
loss |
Character indicating the loss function selected for model training
("kullback_leibler_divergence" by default). |
metrics |
Vector of metrics used to assess model performance during
training and evaluation (c("accuracy", "mean_absolute_error", "categorical_accuracy") by default). |
normalize |
Whether to normalize data using logCPM (TRUE by default). |
scaling |
How to scale data before training. It can be:
|
norm.batch.layers |
Whether to include batch normalization layers
between hidden dense layers (TRUE by default). |
custom.model |
Allows the use of a custom neural network architecture. It
must be a keras.engine.sequential.Sequential object (see Details). |
shuffle |
Boolean indicating whether data will be shuffled (TRUE by default). |
sc.downsampling |
It is only used if |
use.generator |
Boolean indicating whether to use generators during
training and test. Generators are automatically used when |
on.the.fly |
Boolean indicating whether simulated data will be generated
'on the fly' during training (FALSE by default). |
agg.function |
If
|
threads |
Number of threads used during simulation of mixed
transcriptional profiles if on.the.fly = TRUE (1 by default). |
view.metrics.plot |
Boolean indicating whether to show plots of loss and
evaluation metrics during training (TRUE by default). |
verbose |
Boolean indicating whether to display model progression during
training and model architecture information (TRUE by default). |
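To make the default loss more concrete, the following base R sketch computes a Kullback-Leibler divergence between true and predicted cell type proportions, one value per spot. It is only an illustration: the helper klDivergence, the toy matrices, and the clipping constant eps = 1e-7 are assumptions made here, and the values actually reported during training come from the keras backend, not from this code.

```r
# Hypothetical helper (not part of SpatialDDLS): row-wise KL divergence
# between true and predicted proportions, clipping values to [eps, 1]
# so that log(0) and division by zero are avoided.
klDivergence <- function(y.true, y.pred, eps = 1e-7) {
  y.true <- pmin(pmax(y.true, eps), 1)
  y.pred <- pmin(pmax(y.pred, eps), 1)
  rowSums(y.true * log(y.true / y.pred))
}

# Toy proportions: 2 spots (rows) x 3 cell types (columns)
y.true <- matrix(c(0.5, 0.5, 0.0,
                   0.2, 0.3, 0.5), nrow = 2, byrow = TRUE)
y.pred <- matrix(c(0.4, 0.4, 0.2,
                   0.2, 0.3, 0.5), nrow = 2, byrow = TRUE)

# The first spot is predicted imperfectly, the second one exactly
kl <- klDivergence(y.true, y.pred)
print(kl)
```

A perfect prediction gives a divergence of zero, and the value grows as the predicted proportions move away from the true ones.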
Details
Simulation of mixed transcriptional profiles 'on the fly'
trainDeconvModel can avoid storing simulated mixed spot profiles by using
the on.the.fly argument. This functionality aims to reduce the memory usage
of the simMixedProfiles function: simulated profiles are built for each
batch during training/evaluation.
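The idea of building profiles per batch can be sketched in base R as follows. This is not the package's implementation: makeBatchGenerator, the toy count matrix, and the spot.cells design matrix are hypothetical names introduced for illustration, and the aggregation simply sums raw counts of the pooled cells, which is assumed here to be in the spirit of the "AddRawCount" default.

```r
set.seed(1)

# Toy single-cell counts: 6 genes x 8 cells
sc.counts <- matrix(
  rpois(6 * 8, lambda = 5), nrow = 6,
  dimnames = list(paste0("Gene", 1:6), paste0("Cell", 1:8))
)

# Hypothetical design: each row lists the 3 cells pooled into one mixed spot
spot.cells <- matrix(
  sample(colnames(sc.counts), 10 * 3, replace = TRUE), nrow = 10
)

# Closure returning one batch of mixed profiles per call. Only the design is
# kept in memory; profiles are computed when a batch is requested.
makeBatchGenerator <- function(counts, design, batch.size) {
  next.row <- 1
  function() {
    if (next.row > nrow(design)) return(NULL)  # all spots consumed
    rows <- seq(next.row, min(next.row + batch.size - 1, nrow(design)))
    next.row <<- max(rows) + 1
    # Sum raw counts of the cells assigned to each spot in the batch
    batch <- vapply(
      rows,
      function(i) rowSums(counts[, design[i, ], drop = FALSE]),
      numeric(nrow(counts))
    )
    t(batch)  # spots x genes
  }
}

gen <- makeBatchGenerator(sc.counts, spot.cells, batch.size = 4)
b1 <- gen()  # spots 1 to 4
b2 <- gen()  # spots 5 to 8
b3 <- gen()  # spots 9 to 10
b4 <- gen()  # NULL: no spots left
```

The trade-off is the usual one for generators: memory usage stays bounded by the batch size, while each batch costs extra computation at training time.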
Neural network architecture
It is possible to change the model's architecture: number of hidden layers,
number of neurons for each hidden layer, dropout rate, activation function,
and loss function. For more customized models, it is possible to provide a
pre-built model through the custom.model argument (a
keras.engine.sequential.Sequential object). In that case, the number of
input neurons must be equal to the number of considered features/genes, and
the number of output neurons must be equal to the number of considered cell
types.
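As a rough sketch of what a pre-built model could look like, the function below assembles a sequential network with the keras R package. The helper name buildCustomModel, the layer sizes, and the softmax output are assumptions chosen for illustration, not the architecture SpatialDDLS builds internally. The only constraints taken from this page are the input and output dimensions.

```r
# Hypothetical helper: one hidden block (dense, batch normalization,
# activation, dropout) followed by an output layer with one unit per
# cell type.
buildCustomModel <- function(n.genes, n.cell.types) {
  model <- keras::keras_model_sequential(name = "CustomDeconv")
  # Input size must match the number of features/genes
  model <- keras::layer_dense(model, units = 250, input_shape = c(n.genes))
  model <- keras::layer_batch_normalization(model)
  model <- keras::layer_activation(model, activation = "relu")
  model <- keras::layer_dropout(model, rate = 0.25)
  # Output size must match the number of cell types; softmax (assumed here)
  # makes the outputs sum to 1 so they can be read as proportions
  model <- keras::layer_dense(model, units = n.cell.types)
  model <- keras::layer_activation(model, activation = "softmax")
  model
}

# Only build the model when keras and its Python backend are available
if (requireNamespace("keras", quietly = TRUE) && keras::is_keras_available()) {
  custom <- buildCustomModel(n.genes = 15, n.cell.types = 2)
  # The model could then be passed to the function documented here:
  # SDDLS <- trainDeconvModel(
  #   object = SDDLS, custom.model = custom, batch.size = 12, num.epochs = 5
  # )
}
```

The dimensions used in the guarded call (15 genes, 2 cell types) match the toy data in the Examples section.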
Value
A SpatialDDLS object with trained.model slot containing a DeconvDLModel
object. For more information about the structure of this class, see
?DeconvDLModel.
See Also
plotTrainingHistory
deconvSpatialDDLS
Examples
set.seed(123)
# Simulated single-cell data: 15 genes x 10 cells, 2 cell types
sce <- SingleCellExperiment::SingleCellExperiment(
  assays = list(
    counts = matrix(
      rpois(150, lambda = 5), nrow = 15, ncol = 10,
      dimnames = list(paste0("Gene", seq(15)), paste0("RHC", seq(10)))
    )
  ),
  colData = data.frame(
    Cell_ID = paste0("RHC", seq(10)),
    Cell_Type = sample(x = paste0("CellType", seq(2)), size = 10,
                       replace = TRUE)
  ),
  rowData = data.frame(
    Gene_ID = paste0("Gene", seq(15))
  )
)
# Create the SpatialDDLS object from the single-cell data
SDDLS <- createSpatialDDLSobject(
  sc.data = sce,
  sc.cell.ID.column = "Cell_ID",
  sc.gene.ID.column = "Gene_ID",
  sc.filt.genes.cluster = FALSE
)
# Generate cell type proportions for 50 simulated mixed spots
SDDLS <- genMixedCellProp(
  object = SDDLS,
  cell.ID.column = "Cell_ID",
  cell.type.column = "Cell_Type",
  num.sim.spots = 50,
  train.freq.cells = 2/3,
  train.freq.spots = 2/3,
  verbose = TRUE
)
# Simulate the mixed transcriptional profiles
SDDLS <- simMixedProfiles(SDDLS)
# Train and evaluate the deconvolution model
SDDLS <- trainDeconvModel(
  object = SDDLS,
  batch.size = 12,
  num.epochs = 5
)