| Preprocessing {scapGNN} | R Documentation |
Data preprocessing
Description
This function prepares data for the ConNetGNN function.
Usage
Preprocessing(data, parallel.cores = 1, verbose = TRUE)
Arguments
data |
The input data should be a data frame or a matrix where the rows are genes and the columns are cells. The |
parallel.cores |
Number of processors to use when doing the calculations in parallel (default: |
verbose |
Gives information about each step. Default: |
Details
Preprocessing
The function can interface with the Seurat framework; see Examples for the Seurat data-processing workflow.
The input data should contain highly variable genes and be log-transformed. Left-truncated mixed Gaussian (LTMG) modeling is used to calculate the gene
regulatory signal matrix. Positively correlated gene-gene and cell-cell pairs are used to build the initial gene correlation matrix and cell correlation matrix.
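As a rough illustration of what "positively correlated" matrices could look like, the sketch below computes Pearson correlations on a toy log-transformed matrix and sets negative values to zero. This is an assumption made for illustration only: the correlation measure, the thresholding, and the helper name positive_cor are hypothetical and are not taken from the scapGNN source.

```r
set.seed(1)

# Toy log-transformed expression matrix: 5 genes (rows) x 8 cells (columns).
expr <- matrix(rexp(40), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("cell", 1:8)))
expr <- log1p(expr)

# Hypothetical helper (not part of scapGNN): column-wise Pearson correlation
# with undefined and negative entries set to 0.
positive_cor <- function(m) {
  r <- cor(m)          # correlation between the columns of m
  r[is.na(r)] <- 0     # guard against constant columns
  r[r < 0] <- 0        # keep only positive correlations
  r
}

gene_adj <- positive_cor(t(expr))  # genes x genes (genes must be columns)
cell_adj <- positive_cor(expr)     # cells x cells

dim(gene_adj)
dim(cell_adj)
```

The resulting matrices are square and non-negative, which matches the shape one would expect of an initial gene or cell correlation matrix.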
Value
A list:
- orig_dara
User-submitted raw data; rows are highly variable genes and columns are cells.
- cell_features
Cell feature matrix.
- gene_features
Gene feature matrix.
- ltmg_matrix
Gene regulatory signal matrix for LTMG.
- cell_adj
The adjacency matrix of the cell correlation network.
- gene_adj
The adjacency matrix of the gene correlation network.
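The component names listed above can be checked before passing the result on. The sketch below is a minimal, hypothetical validator (check_prep_output is not part of scapGNN) run on a hand-built mock list rather than on real Preprocessing output; the matrix sizes are arbitrary placeholders.

```r
# Component names as documented in the Value section
# (note the spelling "orig_dara").
expected_names <- c("orig_dara", "cell_features", "gene_features",
                    "ltmg_matrix", "cell_adj", "gene_adj")

# Hypothetical helper: TRUE if every documented component is present.
check_prep_output <- function(x) {
  is.list(x) && length(setdiff(expected_names, names(x))) == 0
}

# Mock object standing in for the result of Preprocessing();
# the dimensions are illustrative only.
mock_prep <- list(
  orig_dara     = matrix(0, 3, 4),
  cell_features = matrix(0, 4, 3),
  gene_features = matrix(0, 3, 4),
  ltmg_matrix   = matrix(0, 3, 4),
  cell_adj      = matrix(0, 4, 4),
  gene_adj      = matrix(0, 3, 3)
)

ok_full    <- check_prep_output(mock_prep)
ok_partial <- check_prep_output(mock_prep[-1])  # drop orig_dara
```

With a real result, the same check would be called as check_prep_output(Prep_data).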
Examples
# Load dependent packages.
# require(coop)
# Seurat data processing.
# require(Seurat)
# Load the PBMC dataset (example data from Seurat).
# pbmc.data <- Read10X(data.dir = "../data/pbmc3k/filtered_gene_bc_matrices/hg19/")
# We recommend keeping only genes with non-zero expression in more than
# 1% of cells, and cells with non-zero expression in more than 1% of genes.
# Users can also filter mitochondrial genes according to their own needs.
# pbmc <- CreateSeuratObject(counts = pbmc.data, project = "case",
# min.cells = 3, min.features = 200)
# pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
# pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)
# Normalizing the data.
# pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize")
# Identification of highly variable features.
# pbmc <- FindVariableFeatures(pbmc, selection.method = 'vst', nfeatures = 2000)
# Run Preprocessing.
# Prep_data <- Preprocessing(pbmc)
# Users can also directly input data
# in data frame or matrix format
# containing highly variable genes.
data("Hv_exp")
Hv_exp <- Hv_exp[,1:20]
Hv_exp <- Hv_exp[which(rowSums(Hv_exp) > 0),]
Prep_data <- Preprocessing(Hv_exp[1:10,])
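The 1% filtering recommendation in the comments above can also be applied directly to a plain count matrix, without Seurat. The following is a minimal sketch on simulated data; the matrix counts and all variable names are hypothetical, and both filters are computed on the unfiltered matrix, which is one possible reading of the recommendation.

```r
set.seed(42)

# Simulated count matrix: 200 genes (rows) x 50 cells (columns).
counts <- matrix(rpois(200 * 50, lambda = 0.3), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("cell", 1:50)))
counts[1:5, ] <- 0  # five genes that are never expressed

# Thresholds: more than 1% of cells (for genes) and 1% of genes (for cells).
min_cells <- 0.01 * ncol(counts)
min_genes <- 0.01 * nrow(counts)

# Count non-zero entries per gene and per cell on the original matrix.
keep_genes <- rowSums(counts > 0) > min_cells
keep_cells <- colSums(counts > 0) > min_genes

filtered <- counts[keep_genes, keep_cells, drop = FALSE]
dim(filtered)
```

The filtered matrix would still need normalization, log transformation, and selection of highly variable genes before being passed to Preprocessing.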