Preprocessing {scapGNN} | R Documentation |
Data preprocessing
Description
This function prepares data for the ConNetGNN
function.
Usage
Preprocessing(data, parallel.cores = 1, verbose = TRUE)
Arguments
data |
The input data should be a data frame or a matrix where the rows are genes and the columns are cells. The |
parallel.cores |
Number of processors to use when running the calculations in parallel (default: 1). |
verbose |
Prints information about each step. Default: TRUE. |
Details
Preprocessing
The function can interface with the Seurat framework. For the Seurat data-processing steps, see Examples.
The input data should contain highly variable genes and be log-transformed. Left-truncated mixed Gaussian (LTMG) modeling is used to calculate the gene
regulatory signal matrix. Positively correlated gene-gene and cell-cell pairs are used to build the initial gene correlation matrix and cell correlation matrix.
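As a rough illustration of the correlation step, the sketch below builds gene-gene and cell-cell matrices that keep only positive correlations. It uses base R's cor() with Pearson correlation on a small simulated matrix. The helper name positive_cor_adj, the choice of Pearson correlation, and the zeroed diagonal are assumptions made for illustration; Preprocessing may compute these matrices differently.

```r
# Hypothetical helper (not part of scapGNN): keep only positive correlations.
positive_cor_adj <- function(mat) {
  # cor() correlates columns, so the items to compare must be in columns.
  adj <- cor(mat, method = "pearson")
  adj[is.na(adj)] <- 0   # constant columns produce NA correlations
  adj[adj < 0] <- 0      # drop negative correlations
  diag(adj) <- 0         # no self-loops (an assumption of this sketch)
  adj
}

set.seed(1)
# Simulated log-transformed expression: 10 genes (rows) x 20 cells (columns).
expr <- matrix(log1p(rpois(200, lambda = 3)), nrow = 10,
               dimnames = list(paste0("gene", 1:10), paste0("cell", 1:20)))

gene_adj <- positive_cor_adj(t(expr))  # genes x genes
cell_adj <- positive_cor_adj(expr)     # cells x cells
```

Both matrices are symmetric and non-negative, so they can be read as weighted adjacency matrices of a gene network and a cell network.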
Value
A list:
- orig_dara
User-submitted raw data; rows are highly variable genes and columns are cells.
- cell_features
Cell feature matrix.
- gene_features
Gene feature matrix.
- ltmg_matrix
Gene regulatory signal matrix from LTMG modeling.
- cell_adj
The adjacency matrix of the cell correlation network.
- gene_adj
The adjacency matrix of the gene correlation network.
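One way to confirm that a result has the elements listed above is to compare its names against the documented ones. In the sketch below, the helper has_prep_fields is hypothetical and the mock list only stands in for real output of Preprocessing; the element names are copied from this page.

```r
# Element names as documented in the Value section.
expected_fields <- c("orig_dara", "cell_features", "gene_features",
                     "ltmg_matrix", "cell_adj", "gene_adj")

# Hypothetical helper (not part of scapGNN):
# TRUE if x is a list containing every documented element.
has_prep_fields <- function(x) {
  is.list(x) && all(expected_fields %in% names(x))
}

# Mock object standing in for the output of Preprocessing().
mock_prep <- setNames(vector("list", length(expected_fields)), expected_fields)
ok <- has_prep_fields(mock_prep)
```

With a real result, the same check would be has_prep_fields(Prep_data).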
Examples
# Load dependent packages.
# require(coop)
# Seurat data processing.
# require(Seurat)
# Load the PBMC dataset (example data for Seurat)
# pbmc.data <- Read10X(data.dir = "../data/pbmc3k/filtered_gene_bc_matrices/hg19/")
# We recommend keeping only genes that are non-zero in more than 1% of cells,
# and only cells that are non-zero in more than 1% of genes.
# Users can also filter mitochondrial genes as needed.
# pbmc <- CreateSeuratObject(counts = pbmc.data, project = "case",
# min.cells = 3, min.features = 200)
# pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
# pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)
# Normalizing the data.
# pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize")
# Identification of highly variable features.
# pbmc <- FindVariableFeatures(pbmc, selection.method = 'vst', nfeatures = 2000)
# Run Preprocessing.
# Prep_data <- Preprocessing(pbmc)
# Users can also directly input data
# in data frame or matrix format
# containing highly variable genes.
data("Hv_exp")
Hv_exp <- Hv_exp[,1:20]
Hv_exp <- Hv_exp[which(rowSums(Hv_exp) > 0),]
Prep_data <- Preprocessing(Hv_exp[1:10,])
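
For users who supply a matrix directly, the 1% filtering recommendation from the commented Seurat workflow can be sketched in base R. The helper name filter_expression and its min_frac argument are hypothetical, and both filters are computed on the original matrix in a single pass, which is an assumption of this sketch rather than documented behavior.

```r
# Hypothetical helper (not part of scapGNN): apply the 1% filtering rule
# to a genes x cells matrix.
filter_expression <- function(mat, min_frac = 0.01) {
  nonzero <- mat != 0
  # Keep genes that are non-zero in more than min_frac of cells.
  keep_genes <- rowMeans(nonzero) > min_frac
  # Keep cells that are non-zero in more than min_frac of genes.
  keep_cells <- colMeans(nonzero) > min_frac
  mat[keep_genes, keep_cells, drop = FALSE]
}

# Toy counts: 5 genes x 4 cells; gene3 and cell4 are all zero.
counts <- matrix(c(1, 0, 2, 0,
                   0, 3, 1, 0,
                   0, 0, 0, 0,
                   4, 1, 0, 0,
                   2, 2, 2, 0),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4)))
filtered <- filter_expression(counts)
dim(filtered)
```

The all-zero gene and the all-zero cell are dropped, leaving the remaining genes and cells in their original order.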