R: Variance-adjusted Mahalanobis (VAM) algorithm

vam {VAM}

R Documentation

Variance-adjusted Mahalanobis (VAM) algorithm

Description

Implementation of the Variance-adjusted Mahalanobis (VAM) method, which computes distance statistics and one-sided p-values for all cells in the specified single cell gene expression matrix. This matrix should reflect the subset of the full expression profile that corresponds to a single gene set. The p-values will be computed using either a chi-square distribution, a non-central chi-square distribution or gamma distribution as controlled by the center and gamma arguments for the one-sided alternative hypothesis that the expression values in the cell are further from the mean (center=T) or origin (center=F) than expected under the null of uncorrelated technical noise, i.e., gene expression variance is purely technical and all genes are uncorrelated.

Usage

    vam(gene.expr, tech.var.prop, gene.weights, center=FALSE, gamma=TRUE)

Arguments

`gene.expr`	An n x p matrix of gene expression values for n cells and p genes.
`tech.var.prop`	Vector of technical variance proportions for each of the p genes. If specified, the Mahalanobis distance will be computed using a diagonal covariance matrix generated using these proportions. If not specified, the Mahalanobis distances will be computed using a diagonal covariance matrix generated from the sample variances.
`gene.weights`	Optional vector of gene weights. If specified, weights must be > 0. The weights are used to adjust the gene variance values included in the computation of the modified Mahalanobis distances. Specifically, the gene variance is divided by the gene weight. This adjustment means that large weights will increase the influence of a given gene in the computation of the modified Mahalanobis distance.
`center`	If true, will mean center the values in the computation of the Mahalanobis statistic. If false, will compute the Mahalanobis distance from the origin. Default is F.
`gamma`	If true, will fit a gamma distribution to the non-zero squared Mahalanobis distances computed from a row-permuted version of `gene.expr`. The estimated gamma distribution will be used to compute a one-sided p-value for each cell. If false, will compute the p-value using the standard chi-square approximation for the squared Mahalanobis distance (or non-central if `center=F`). Default is T.

Value

A data.frame with the following elements (row names will match row names from gene.expr):

"cdf.value": 1 minus the one-sided p-values computed from the squared adjusted Mahalanobis distances.
"distance.sq": The squared adjusted Mahalanobis distances for the n cells.

Examples

    # Simulate Poisson expression data for 10 genes and 10 cells
    gene.expr=matrix(rpois(100, lambda=2), nrow=10)
    # Simulate technical variance proportions
    tech.var.prop=runif(10)
    # Execute VAM to compute scores for the 10 genes on each cell
    vam(gene.expr=gene.expr, tech.var.prop=tech.var.prop)
    # Create weights that prioritize the first 5 genes
    gene.weights = c(rep(2,5), rep(1,5))
    # Execute VAM using the weights
    vam(gene.expr=gene.expr, tech.var.prop=tech.var.prop, 
    	gene.weights=gene.weights)