deconvSpatialDDLS {SpatialDDLS} | R Documentation |
Deconvolute spatial transcriptomics data using trained model
Description
Deconvolute spatial transcriptomics data using the trained model in the SpatialDDLS object. The trained model is used to predict cell proportions of two mirrored transcriptional profiles:
'Intrinsic' profiles: transcriptional profiles of each spot in the ST dataset.
'Extrinsic' profiles: profiles simulated from the surrounding spots of each spot.
After prediction, cell proportions from the intrinsic profiles (intrinsic cell proportions) are regularized based on the similarity between intrinsic and extrinsic profiles in order to maintain spatial consistency. This approach leverages both transcriptional and spatial information. For more details, see Mañanes et al., 2023 and the Details section.
Usage
deconvSpatialDDLS(
object,
index.st,
normalize = TRUE,
scaling = "standardize",
k.spots = 4,
pca.space = TRUE,
fast.pca = TRUE,
pcs.num = 50,
pca.var = 0.8,
metric = "euclidean",
alpha.cutoff = "mean",
alpha.quantile = 0.5,
simplify.set = NULL,
simplify.majority = NULL,
use.generator = FALSE,
batch.size = 64,
verbose = TRUE
)
Arguments
object |
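SpatialDDLS object with a trained model (trained.model slot) and the spatial transcriptomics datasets to be deconvoluted (spatial.experiments slot). |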
index.st |
Name or index of the dataset/slide stored in the
|
normalize |
Normalize data (logCPM) before deconvolution (TRUE by default). |
scaling |
How to scale data before training. Options include
|
k.spots |
Number of nearest spots considered for each spot during regularization and simulation of extrinsic transcriptional profiles. The greater, the smoother the regularization will be (4 by default). |
pca.space |
Whether to use PCA space to calculate distances between intrinsic and extrinsic transcriptional profiles (TRUE by default). |
fast.pca |
Whether using the irlba implementation. If |
pcs.num |
Number of PCs used to calculate distances if
|
pca.var |
Threshold of explained
variance (between 0.2 and 1) used to choose the number of PCs used if
|
metric |
Metric used to measure distance/similarity between intrinsic
and extrinsic transcriptional profiles. It may be |
alpha.cutoff |
Minimum distance for regularization.
It may be |
alpha.quantile |
Quantile used if |
simplify.set |
List specifying which cell types should be compressed
into a new label with the name of the list item. See examples for details.
If provided, results are stored in a list with |
simplify.majority |
List specifying which cell types should be
compressed into the cell type with the highest proportion in each spot.
Unlike |
use.generator |
Boolean indicating whether to use generators for prediction (FALSE by default). |
batch.size |
Number of samples per batch. Only when |
verbose |
Show informative messages during the execution. |
Details
The deconvolution process involves two main steps: predicting cell proportions from the transcriptome of each spot using the trained neural network model, and regularizing those proportions based on the spatial location of each spot. In the regularization step, a mirrored version of each spot is simulated from its N-nearest spots. We refer to these profiles as 'extrinsic' profiles, whereas the transcriptional profiles of the spots themselves are called 'intrinsic' profiles. Extrinsic profiles are used to regularize the predictions based on intrinsic profiles. The rationale is that spots surrounded by transcriptionally similar spots should have similar cell compositions, and therefore their predicted proportions can be smoothed to preserve spatial consistency. Conversely, spots surrounded by dissimilar spots likely have very specific cell compositions that cannot be inferred from their neighbors, so they are predicted from their own transcriptional profiles only.
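The simulation of extrinsic profiles described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper simulate_extrinsic(), the toy counts and coordinates, the use of Euclidean distances between spot coordinates, and the choice to exclude the spot itself from its own neighborhood are all assumptions made for illustration.

```r
# Toy data: 5 genes x 6 spots laid out on a 3 x 2 grid (hypothetical values)
set.seed(1)
counts <- matrix(
  rpois(5 * 6, lambda = 5), nrow = 5,
  dimnames = list(paste0("Gene", 1:5), paste0("Spot", 1:6))
)
coords <- cbind(x = c(0, 1, 2, 0, 1, 2), y = c(0, 0, 0, 1, 1, 1))
rownames(coords) <- colnames(counts)

# Hypothetical helper: sum the transcriptomes of the k nearest spots of each spot
simulate_extrinsic <- function(counts, coords, k) {
  d <- as.matrix(dist(coords))  # pairwise Euclidean distances between spots
  diag(d) <- Inf                # assumption: a spot is not its own neighbor
  extrinsic <- sapply(seq_len(ncol(counts)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]        # indices of the k nearest spots
    rowSums(counts[, nn, drop = FALSE])    # summed neighbor counts per gene
  })
  colnames(extrinsic) <- colnames(counts)
  extrinsic
}

extrinsic <- simulate_extrinsic(counts, coords, k = 2)
```

The result has the same dimensions as the intrinsic count matrix, so both can be fed to the same model. In this toy grid, the two nearest neighbors of Spot1 are Spot2 and Spot4.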
Regarding how SpatialDDLS works: first, an extrinsic profile is simulated for each spot by summing the transcriptomes of its N-nearest spots. Distances between the extrinsic and intrinsic profiles of each spot are then calculated so that similar and dissimilar spots are identified. Both sets of transcriptional profiles are used as input for the trained neural network model and, according to the calculated distances, a weighted mean of the two predicted proportions is calculated for each spot. Spots whose distance between intrinsic and extrinsic profiles is greater than alpha.cutoff are not regularized, whereas spots with distances less than alpha.cutoff contribute to the weighted mean. Weights are calculated by rescaling distances less than alpha.cutoff between 0 and 0.5, so that the maximum extent to which an extrinsic profile can modify the predictions based on intrinsic profiles is 0.5 (a regular mean). For more details, see Mañanes et al., 2023.
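As a rough illustration of the weighting scheme, the following base R sketch applies a cutoff and a weighted mean to toy predictions. It is a sketch under stated assumptions, not the package's code: the helper regularize_props(), the toy proportions and distances, the numeric cutoff, and in particular the linear rescaling in which a distance of 0 gives a weight of 0.5 and a distance at the cutoff gives a weight of 0 are all hypothetical choices.

```r
# Toy predictions for three spots (hypothetical values, not SpatialDDLS output)
spots <- paste0("Spot", 1:3)
cell.types <- c("CellType1", "CellType2")
intrinsic <- matrix(
  c(0.7, 0.3, 0.2, 0.8, 0.5, 0.5), ncol = 2, byrow = TRUE,
  dimnames = list(spots, cell.types)
)
extrinsic <- matrix(
  c(0.5, 0.5, 0.4, 0.6, 0.9, 0.1), ncol = 2, byrow = TRUE,
  dimnames = list(spots, cell.types)
)
# Distances between intrinsic and extrinsic profiles, and a numeric cutoff
distances <- c(Spot1 = 0, Spot2 = 2, Spot3 = 10)
cutoff <- 4

# Hypothetical helper: weighted mean of intrinsic and extrinsic proportions
regularize_props <- function(intrinsic, extrinsic, distances, cutoff) {
  # Assumed rescaling: weights in [0, 0.5] below the cutoff, 0 otherwise
  weights <- ifelse(distances < cutoff, 0.5 * (1 - distances / cutoff), 0)
  # The weight vector is recycled over rows, so each spot uses its own weight
  regularized <- (1 - weights) * intrinsic + weights * extrinsic
  list(regularized = regularized, weights = weights)
}

res <- regularize_props(intrinsic, extrinsic, distances, cutoff)
```

Under these assumptions, Spot1 (identical profiles) becomes a plain mean of both predictions, Spot2 is partially smoothed, and Spot3 (distance above the cutoff) keeps its intrinsic proportions. Because both inputs sum to 1 per spot, so does the weighted mean.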
This function requires a SpatialDDLS object with a trained deep neural network model (trained.model slot) and the spatial transcriptomics datasets to be deconvoluted in the spatial.experiments slot. See ?createSpatialDDLSobject or ?loadSTProfiles for more details.
Value
SpatialDDLS object with a deconv.spots slot. The output is a list containing 'Regularized', 'Intrinsic' and 'Extrinsic' deconvoluted cell proportions, 'Distances' between intrinsic and extrinsic transcriptional profiles, and 'Weight.factors' with the final weights used to regularize intrinsic cell proportions. If simplify.set and/or simplify.majority are provided, the deconv.spots slot will contain a list with raw and simplified results.
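To give an intuition for the simplified results, the sketch below collapses columns of a toy proportion matrix into a new label, in the spirit of simplify.set. It is only an illustration: the helper collapse_set(), the toy matrix, and the assumption that grouped cell types are combined by summing their proportions are not taken from the package.

```r
# Toy proportions for two spots and four cell types (hypothetical values)
props <- matrix(
  c(0.1, 0.2, 0.3, 0.4,
    0.25, 0.25, 0.25, 0.25),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Spot1", "Spot2"), paste0("CellType", 1:4))
)
# Same shape as the simplify.set argument: new label -> cell types to merge
simplify <- list(CellGroup1 = c("CellType1", "CellType2"))

# Hypothetical helper: sum the grouped columns under the new label
collapse_set <- function(props, simplify.set) {
  grouped <- do.call(
    cbind,
    lapply(simplify.set, function(types) rowSums(props[, types, drop = FALSE]))
  )
  # Cell types not mentioned in any group are kept unchanged
  kept <- props[, setdiff(colnames(props), unlist(simplify.set)), drop = FALSE]
  cbind(kept, grouped)
}

simplified <- collapse_set(props, simplify)
```

Here CellType1 and CellType2 are replaced by a single CellGroup1 column, while CellType3 and CellType4 are left as they are, and each spot still sums to 1.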
References
Mañanes, D., Rivero-García, I., Jimenez-Carretero, D., Torres, M., Sancho, D., Torroja, C., Sánchez-Cabo, F. (2023). SpatialDDLS: An R package to deconvolute spatial transcriptomics data using neural networks. bioRxiv. doi: 10.1101/2023.08.31.555677.
See Also
Examples
set.seed(123)
# simulating single-cell data: 15 genes x 20 cells
sce <- SingleCellExperiment::SingleCellExperiment(
  assays = list(
    counts = matrix(
      rpois(15 * 20, lambda = 5), nrow = 15, ncol = 20,
      dimnames = list(paste0("Gene", seq(15)), paste0("RHC", seq(20)))
    )
  ),
  colData = data.frame(
    Cell_ID = paste0("RHC", seq(20)),
    Cell_Type = sample(
      x = paste0("CellType", seq(6)), size = 20, replace = TRUE
    )
  ),
  rowData = data.frame(
    Gene_ID = paste0("Gene", seq(15))
  )
)
SDDLS <- createSpatialDDLSobject(
  sc.data = sce,
  sc.cell.ID.column = "Cell_ID",
  sc.gene.ID.column = "Gene_ID",
  sc.filt.genes.cluster = FALSE
)
# simulating mixed transcriptional profiles used for training
SDDLS <- genMixedCellProp(
  object = SDDLS,
  cell.ID.column = "Cell_ID",
  cell.type.column = "Cell_Type",
  num.sim.spots = 50,
  train.freq.cells = 2/3,
  train.freq.spots = 2/3,
  verbose = TRUE
)
SDDLS <- simMixedProfiles(SDDLS)
# training of SDDLS model
SDDLS <- trainDeconvModel(
  object = SDDLS,
  batch.size = 15,
  num.epochs = 5
)
# simulating spatial data
ngenes <- sample(3:40, size = 1)
ncells <- sample(10:40, size = 1)
counts <- matrix(
  rpois(ngenes * ncells, lambda = 5), ncol = ncells,
  dimnames = list(paste0("Gene", seq(ngenes)), paste0("Spot", seq(ncells)))
)
coordinates <- matrix(
  rep(c(1, 2), ncells), ncol = 2
)
st <- SpatialExperiment::SpatialExperiment(
  assays = list(counts = as.matrix(counts)),
  rowData = data.frame(Gene_ID = paste0("Gene", seq(ngenes))),
  colData = data.frame(Cell_ID = paste0("Spot", seq(ncells))),
  spatialCoords = coordinates
)
SDDLS <- loadSTProfiles(
  object = SDDLS,
  st.data = st,
  st.spot.ID.column = "Cell_ID",
  st.gene.ID.column = "Gene_ID"
)
# simplify arguments: each list item groups the cell types to be compressed
simplify <- list(
  CellGroup1 = c("CellType1", "CellType2", "CellType4"),
  CellGroup2 = c("CellType3", "CellType5")
)
# deconvolution of the first ST dataset with both simplification modes
SDDLS <- deconvSpatialDDLS(
  object = SDDLS,
  index.st = 1,
  simplify.set = simplify,
  simplify.majority = simplify
)