deconvolute_and_contextualize {scMappR} | R Documentation |
Generate cell weighted Fold-Changes (cwFold-changes)
Description
This function takes a count matrix, signature matrix, and differentially expressed genes (DEGs) before generating cwFold-changes for each cell-type.
Usage
deconvolute_and_contextualize(
count_file,
signature_matrix,
DEG_list,
case_grep,
control_grep,
max_proportion_change = -9,
print_plots = TRUE,
plot_names = "scMappR",
theSpecies = "human",
FC_coef = TRUE,
sig_matrix_size = 3000,
drop_unknown_celltype = TRUE,
toSave = FALSE,
path = NULL,
deconMethod = "DeconRNASeq",
rareCT_filter = TRUE
)
Arguments
count_file |
Normalized (e.g. CPM, TPM, RPKM) RNA-seq count matrix where rows are gene symbols and columns are individuals. Either the matrix itself, of class "matrix" or "data.frame", or a path to a tsv file containing these counts. The gene symbols in the count file, signature matrix, and DEG list must match. |
signature_matrix |
Signature matrix (fold-change ratios) of cell-type specificity of genes. Either the object itself or a path to an .RData file containing an object named "wilcoxon_rank_mat_or". We strongly recommend inputting the signature matrix directly. |
DEG_list |
An object whose first column contains gene symbols from the bulk dataset (these do not have to be in the signature matrix), whose second column contains the adjusted P-value, and whose third column contains the log2FC. A path to a tsv file containing this information is also acceptable. |
case_grep |
Tag in the column name for cases (i.e. samples representing upregulated) OR an index of cases. |
control_grep |
Tag in the column name for controls (i.e. samples representing downregulated) OR an index of controls. |
max_proportion_change |
Maximum cell-type proportion change. May be useful if a cell-type does not exist in one condition, thus preventing infinite values. |
print_plots |
Whether boxplots of the estimated CT proportion for the leave-one-out method of CT deconvolution should be printed (T/F). |
plot_names |
If plots are being printed, the prefix of their .pdf files. |
theSpecies |
internal species designation to be passed from 'scMappR_and_pathway_analysis'. It only impacts this function if data are taken directly from the PanglaoDB database (i.e. not reprocessed by scMappR or the user). |
FC_coef |
Whether to compute cwFold-changes from the fold-change (TRUE) or from a rank defined as -log10(Pval) (FALSE). After testing, we strongly recommend keeping this TRUE (T/F). |
sig_matrix_size |
Number of genes in signature matrix for cell-type deconvolution. |
drop_unknown_celltype |
Whether or not to remove "unknown" cell-types from the signature matrix (T/F). |
toSave |
Allow scMappR to write files in the current directory (T/F). |
path |
If toSave == TRUE, path to the directory where files will be saved. |
deconMethod |
Which RNA-seq deconvolution method to use to estimate cell-type proportions. Options are "WGCNA", "DCQ", or "DeconRNASeq". |
rareCT_filter |
Whether to filter out cell-types rarer than 0.1 percent of the population (T/F). Setting to FALSE keeps these rare cell-types and may lead to false-positives. |
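The case_grep and control_grep tags described above select samples by their column names. The following is a minimal sketch in base R, assuming the tags are matched as patterns against the column names (as the argument names suggest, though the internal matching is not shown here). The sample names are made up for illustration.

```r
# Hypothetical column names of a count matrix (illustrative only).
sample_names <- c("s1_female", "s2_female", "s3_male", "s4_male")

# Tags that include the leading underscore separate the two groups cleanly.
case_idx <- grep("_female", sample_names)
control_idx <- grep("_male", sample_names)

# No sample should be selected as both a case and a control.
stopifnot(length(intersect(case_idx, control_idx)) == 0)

# A tag that is a substring of the other group's tag is ambiguous:
# "male" also occurs inside "female", so it matches every sample here.
ambiguous_idx <- grep("male", sample_names)
```

If tags cannot be made unambiguous, passing column indices instead of tags, as the argument descriptions allow, avoids the problem.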
Details
This function completes the pre-processing, normalization, and scaling steps in the scMappR algorithm before calculating cwFold-changes. cwFold-changes scale bulk fold-changes by the cell-type specificity of the gene, cell-type gene-normalized cell-type proportions, and the reciprocal ratio of cell-type proportions between the two conditions. cwFold-changes are generated for genes that are in both the count matrix and the list of DEGs. These genes do not also have to be in the signature matrix.
First, this function estimates cell-type proportions with all genes included, then estimates changes in cell-type proportion between case and control using a t-test. Next, it takes a leave-one-out approach to cell-type deconvolution, such that estimated cell-type proportions are computed with each inputted DEG removed in turn. Optionally, the differences between cell-type proportions before and after a gene is removed are plotted in boxplots.
Finally, for every gene, cwFold-changes are computed with the following formula (shown for upregulated genes): val <- cell-preferences * cell-type_proportion * cell-type_proportion_fold-change * sign * 2^abs(gene_DE$log2fc). A matrix of cwFold-changes for all DEGs is returned.
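The arithmetic of the formula above can be illustrated with a small worked example in base R. This is only a sketch of the stated product for one upregulated gene across two cell-types; all variable names and numbers are illustrative assumptions, and the normalization and scaling steps that the function performs beforehand are not reproduced.

```r
# Toy inputs for a single upregulated gene (illustrative values only).
cell_preferences <- c(Tcell = 4, Bcell = 1)                  # cell-type specificity of the gene
cell_type_proportion <- c(Tcell = 0.5, Bcell = 0.25)         # estimated cell-type proportions
cell_type_proportion_fold_change <- c(Tcell = 1, Bcell = 2)  # proportion ratio between conditions
log2fc <- 2                                                  # bulk log2 fold-change of the gene

# Direction of the bulk change: +1 for upregulated, -1 for downregulated.
direction <- sign(log2fc)

# Product of the terms in the formula, computed element-wise per cell-type.
val <- cell_preferences * cell_type_proportion *
  cell_type_proportion_fold_change * direction * 2^abs(log2fc)

print(val)
# Tcell: 4 * 0.5 * 1 * 1 * 4 = 8; Bcell: 1 * 0.25 * 2 * 1 * 4 = 2
```

In this toy case the same bulk fold-change of 4 is weighted to 8 in the cell-type where the gene is specific and abundant, and to 2 in the other.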
Value
List with the following elements:
cellWeighted_Foldchange |
data frame of cell-weighted fold-changes (cwFold-changes) for each gene. |
cellType_Proportions |
data frame of cell-type proportions from DeconRNASeq. |
leave_one_out_proportions |
data frame of average cell-type proportions for case and control when gene is removed. |
processed_signature_matrix |
signature matrix used in final analysis. |
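A quick check of the returned list can be sketched as follows. The helper summarize_cwfc_result is hypothetical (it is not part of scMappR); it only uses the element names listed above and reports which are present and how many rows each has. The toy list stands in for a real result.

```r
# Hypothetical helper (not part of scMappR): report which documented
# elements are present in a result list and how many rows each has.
summarize_cwfc_result <- function(res) {
  expected <- c("cellWeighted_Foldchange", "cellType_Proportions",
                "leave_one_out_proportions", "processed_signature_matrix")
  n_rows <- vapply(expected, function(nm) {
    if (is.null(res[[nm]])) NA_integer_ else NROW(res[[nm]])
  }, integer(1), USE.NAMES = FALSE)
  data.frame(element = expected,
             present = expected %in% names(res),
             n_rows = n_rows,
             row.names = NULL)
}

# Toy stand-in for a result (illustrative only, with two elements missing).
toy_result <- list(
  cellWeighted_Foldchange = data.frame(Tcell = c(8, -3, 1.5)),
  cellType_Proportions = data.frame(Tcell = c(0.5, 0.6))
)
result_summary <- summarize_cwfc_result(toy_result)
print(result_summary)
```

With a real result, the same call would take the list returned by deconvolute_and_contextualize in place of toy_result.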
Examples
data(PBMC_example)
bulk_DE_cors <- PBMC_example$bulk_DE_cors
bulk_normalized <- PBMC_example$bulk_normalized
odds_ratio_in <- PBMC_example$odds_ratio_in
case_grep <- "_female"
control_grep <- "_male"
max_proportion_change <- 10
print_plots <- FALSE
theSpecies <- "human"
cwFC <- deconvolute_and_contextualize(count_file = bulk_normalized,
signature_matrix = odds_ratio_in, DEG_list = bulk_DE_cors,
case_grep = case_grep, control_grep = control_grep,
max_proportion_change = max_proportion_change,
print_plots = print_plots,
theSpecies = theSpecies, toSave = FALSE)