estimate_score {tidyestimate}    R Documentation

Infer tumor purity using the ESTIMATE algorithm

Description

Infer tumor purity using single-sample gene set enrichment analysis (ssGSEA) with stromal and immune cell signatures.

Usage

estimate_score(df, is_affymetrix)

Arguments

df

a data.frame of expression data, where columns are tumors and rows are genes. Gene names must be in the first column, and in the form of HGNC symbols.

is_affymetrix

logical. Is the expression data from an Affymetrix array?
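As a rough illustration of the input layout described above, the following sketch builds a small data.frame with gene symbols in the first column and one numeric column per tumor. The column names (hgnc_symbol, tumor_1, tumor_2) and the expression values are made up for illustration and are not taken from the package.

```r
# Toy input: genes in rows, tumors in columns, HGNC symbols in column 1.
# All names and values here are hypothetical placeholders.
expr_df <- data.frame(
  hgnc_symbol = c("TP53", "BRCA1", "EGFR"),
  tumor_1 = c(5.2, 7.1, 3.3),
  tumor_2 = c(6.0, 2.4, 8.8),
  stringsAsFactors = FALSE
)

# Basic shape checks before passing data on
stopifnot(
  is.character(expr_df[[1]]),
  all(vapply(expr_df[-1], is.numeric, logical(1)))
)
```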

Details

ESTIMATE (and this tidy implementation) infers tumor infiltration using two gene sets: a stromal signature, and an immune signature (see tidyestimate::gene_sets).

Enrichment scores for each sample are calculated using an implementation of single-sample Gene Set Enrichment Analysis (ssGSEA). Briefly, expression is ranked on a per-sample basis, and the density and distribution of gene signature 'hits' are determined. An enrichment of hits at the top of the expression ranking confers a positive score, while an enrichment of hits at the bottom of the expression ranking confers a negative score.
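The ranking idea can be sketched in a few lines of R. This is a simplified, illustrative version of an ssGSEA-style score for a single sample, not the code used by estimate_score(). The function name ssgsea_sketch and the default weighting exponent alpha = 0.25 are assumptions made for this sketch, and the package's scaling of the result is omitted.

```r
# Simplified ssGSEA-style score for one sample (illustrative only).
# expr:     named numeric vector of expression values (names are gene symbols)
# gene_set: character vector of signature genes
# alpha:    rank-weighting exponent (assumed default for this sketch)
ssgsea_sketch <- function(expr, gene_set, alpha = 0.25) {
  # Rank genes within the sample, then order from highest to lowest rank
  sorted_ranks <- sort(rank(expr, ties.method = "average"), decreasing = TRUE)
  is_hit <- names(sorted_ranks) %in% gene_set
  stopifnot(any(is_hit), any(!is_hit))

  # Running rank-weighted distribution of hits, unweighted distribution of misses
  hit_weights <- ifelse(is_hit, sorted_ranks^alpha, 0)
  p_hit <- cumsum(hit_weights) / sum(hit_weights)
  p_miss <- cumsum(!is_hit) / sum(!is_hit)

  # Sum the difference between the two running distributions
  sum(p_hit - p_miss)
}

# Toy sample: genes A and B are highly expressed, D and E are lowly expressed
toy_expr <- c(A = 10, B = 9, C = 8, D = 1, E = 2, F = 3)
top_score <- ssgsea_sketch(toy_expr, c("A", "B"))
bottom_score <- ssgsea_sketch(toy_expr, c("D", "E"))
```

With this toy sample, a signature made of the most highly expressed genes gives a positive score and a signature made of the least expressed genes gives a negative one, matching the description above.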

An 'ESTIMATE' score is calculated by adding the stromal and immune scores together.

For Affymetrix arrays, an equation to convert an ESTIMATE score to a prediction of tumor purity has been developed by Yoshihara et al. (see references). It takes the approximate form of:

purity = cos(0.61 + 0.00015 * ESTIMATE)

Values have been rounded to two significant figures for display purposes.
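A minimal sketch of the arithmetic, using the rounded coefficients shown above, is given below. The function name estimate_to_purity is hypothetical. Because the coefficients are rounded, results will differ slightly from the purity values the package reports.

```r
# Combine stromal and immune scores, then convert to an approximate purity.
# Uses the rounded coefficients from the displayed formula (an approximation).
estimate_to_purity <- function(stromal, immune) {
  estimate <- stromal + immune
  cos(0.61 + 0.00015 * estimate)
}

# Higher ESTIMATE scores (more stromal/immune signal) give lower purity
purity_low_infiltration <- estimate_to_purity(stromal = 0, immune = 0)
purity_high_infiltration <- estimate_to_purity(stromal = 500, immune = 500)
```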

Value

A data.frame with sample names and stromal, immune, and ESTIMATE scores for each tumor. If is_affymetrix = TRUE, purity scores are included as well.

Purity scores can be interpreted absolutely: a purity of 0.9 means that the tumor is likely 90% pure. When purity scores are not available (such as in RNAseq), ESTIMATE scores can only be interpreted relatively: within one study, a sample with a lower ESTIMATE score than another can be regarded as more pure, but its absolute purity cannot be inferred, nor can purity be compared across studies.
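For the relative case, one way to use the scores is to order samples within a single study. The sketch below assumes a result with columns named sample and estimate. Both the column names and the values are hypothetical, chosen only to illustrate the ordering.

```r
# Hypothetical scores for three samples from one study (values are made up)
scores <- data.frame(
  sample = c("s1", "s2", "s3"),
  estimate = c(1200, -350, 480),
  stringsAsFactors = FALSE
)

# A lower ESTIMATE score suggests a relatively purer sample within this study
ranked <- scores[order(scores$estimate), ]
ranked$relative_purity_rank <- seq_len(nrow(ranked))
```

The resulting rank only orders samples from the same study; it says nothing about absolute purity.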

References

Barbie et al. (2009) <doi:10.1038/nature08460>

Yoshihara et al. (2013) <doi:10.1038/ncomms3612>

Examples

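library(tidyestimate)
# Filter the ov example data to the common genes, then compute scores and purity
filter_common_genes(ov, id = "hgnc_symbol", tidy = FALSE, tell_missing = TRUE, find_alias = TRUE) |>
  estimate_score(is_affymetrix = TRUE)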

[Package tidyestimate version 1.1.1 Index]