DR.SC_fit {DR.SC} | R Documentation |
Joint dimension reduction and spatial clustering
Description
Joint dimension reduction and spatial clustering for scRNA-seq and spatial transcriptomics data
Usage
DR.SC_fit(X, K, Adj_sp=NULL, q=15,
error.heter= TRUE, beta_grid=seq(0.5, 5, by=0.5),
maxIter=25, epsLogLik=1e-5, verbose=FALSE, maxIter_ICM=6,
wpca.int=FALSE, int.model="EEE", approxPCA=FALSE, coreNum = 5)
Arguments
X |
a sparse matrix with class |
K |
a positive integer, either a scalar or a vector, specifying the number of clusters used in model fitting. |
Adj_sp |
an optional sparse matrix with class |
q |
a positive integer specifying the number of latent features to be extracted, default as 15. The choice of q is a trade-off between model complexity and fit to the data, and depends on the goals of the analysis and the structure of the data. A higher value gives a more complex model with more parameters, which may lead to overfitting and poor generalization. A lower value gives a simpler model with fewer parameters, which may lead to underfitting and a poorer fit to the data. |
error.heter |
an optional logical value, whether to use heterogeneous errors in the DR-SC model, default as TRUE. |
beta_grid |
an optional vector of positive values, the candidate set for the smoothing parameter, which is chosen by grid search. |
maxIter |
an optional positive value, the maximum number of EM iterations. |
epsLogLik |
an optional positive value, the tolerance for the relative change in the observed pseudo log-likelihood, default as '1e-5'. |
verbose |
an optional logical value, whether to output information about the ICM-EM algorithm. |
maxIter_ICM |
an optional positive value, the maximum number of ICM iterations. |
wpca.int |
an optional logical value, whether to use weighted PCA to obtain the initial values of the loadings and other parameters, default as FALSE. |
int.model |
an optional string specifying which Gaussian mixture model is used to evaluate the initial values for DR-SC, default as "EEE"; see Mclust for the names of other models. |
approxPCA |
an optional logical value, whether to use approximate PCA to speed up the computation of initial values. |
coreNum |
an optional positive integer, the number of threads used in parallel computing, default as 5. If the length of K is one, coreNum is set to 1 automatically. |
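To illustrate two of the arguments above, the following base-R sketch prints the default candidate grid for beta_grid (copied from Usage) and mimics the documented rule that coreNum falls back to 1 when K has length one. The helper effective_cores is hypothetical and written only for this illustration; it is not a DR.SC function and says nothing about how the package implements the rule internally.

```r
# Default candidate set for the smoothing parameter, as given in Usage.
beta_grid <- seq(0.5, 5, by = 0.5)
print(beta_grid)
length(beta_grid)  # number of candidate values in the grid

# Hypothetical helper (NOT part of DR.SC) mirroring the documented rule:
# a single K is fitted with one thread, otherwise coreNum is used.
effective_cores <- function(K, coreNum = 5) {
  if (length(K) == 1) 1 else coreNum
}

effective_cores(K = 4)    # scalar K
effective_cores(K = 3:6)  # vector K
```

A coarser or narrower grid, for example seq(1, 4, by = 1), reduces the number of candidate values searched.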
Details
Nothing
Value
DR.SC_fit returns a list with class "drscObject" with the following three components:
Objdrsc |
a list of model fitting results, with as many elements as the length of K. |
out_param |
a numeric matrix used for model selection in MBIC. |
K_set |
a scalar or vector equal to input argument K. |
In addition, each element of "Objdrsc" is a list with the following components:
cluster |
inferred class labels |
hZ |
extracted latent features. |
beta |
estimated smoothing parameter |
Mu |
mean vectors of the mixture components. |
Sigma |
covariance matrices of the mixture components. |
W |
estimated loading matrix |
Lam_vec |
estimated variance of errors in probabilistic PCA model |
loglik |
pseudo observed log-likelihood. |
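The sketch below shows one way to navigate a list with the structure described above. It builds a small stand-in object in base R rather than calling DR.SC_fit, so the values are random placeholders, and mock_fit is a hypothetical helper. It also assumes that the i-th element of Objdrsc corresponds to the i-th entry of K_set, which the description above suggests but does not state explicitly.

```r
# Stand-in object mimicking the documented structure (NOT real DR.SC output).
set.seed(1)
n <- 6          # number of spots (placeholder)
q <- 2          # number of latent features (placeholder)
K_set <- c(3, 4)

# Hypothetical helper creating one element of Objdrsc with a few of the
# documented components, filled with placeholder values.
mock_fit <- function(K) {
  list(cluster = sample(seq_len(K), n, replace = TRUE),
       hZ      = matrix(rnorm(n * q), nrow = n, ncol = q),
       beta    = 1,
       loglik  = -100)
}

drscList <- structure(
  list(Objdrsc = lapply(K_set, mock_fit), out_param = NULL, K_set = K_set),
  class = "drscObject"
)

# Assumption: element i of Objdrsc was fitted with K_set[i].
idx <- match(4, drscList$K_set)
fit <- drscList$Objdrsc[[idx]]

table(fit$cluster)  # cluster sizes
dim(fit$hZ)         # spots by latent features
fit$beta            # smoothing parameter
```

With a real fit, the same accessors would apply to the object returned by DR.SC_fit.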
Note
nothing
Author(s)
Wei Liu
References
See Also
None
Examples
## Generate spatial transcriptomics data with a lattice neighborhood, i.e. the ST platform.
seu <- gendata_RNAExp(height=10, width=10,p=50, K=4)
library(Seurat)
seu <- NormalizeData(seu, verbose=FALSE)
# choose 40 highly variable features using FindVariableFeatures in Seurat
# seu <- FindVariableFeatures(seu, nfeatures = 40)
# or choose 40 spatially variable features using FindSVGs in DR.SC
seu <- FindSVGs(seu, nfeatures = 40, verbose=FALSE)
# users define the adjacency matrix
Adj_sp <- getAdj(seu, platform = 'ST')
if(inherits(seu@assays$RNA, "Assay5")){
var.features <- seu@assays$RNA@meta.data$var.features
var.features <- var.features[!is.na(var.features )]
dat <- GetAssayData(seu, assay = "RNA", slot='data')
X <- Matrix::t(dat[var.features,])
}else{
var.features <- seu@assays$RNA@var.features
X <- Matrix::t(seu[["RNA"]]@data[var.features,])
}
# maxIter = 2 is used only for illustration; users can keep the default.
drscList <- DR.SC_fit(X,Adj_sp=Adj_sp, K=4, maxIter=2, verbose=TRUE)