psrn {baldur} R Documentation

Normalize data to a pseudo-reference

Description

[Experimental]

This function generates a pseudo-reference by taking the geometric mean of each peptide across all samples. Each peptide in each sample is then divided by its pseudo-reference value, and the median of these ratios within a sample is used as an estimate of that sample's loading concentration. All features in each sample are then divided by the sample's estimate. The estimates are computed only from features without missing values. For details see Anders and Huber (2010).
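The steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: psrn_sketch is a hypothetical helper, it takes a numeric matrix (rows are features, columns are samples) rather than a data frame with an identifier column, and it leaves out the optional log transformation.

```r
# Hypothetical helper, not part of baldur: median-of-ratios normalization
# on a numeric matrix (rows = features, columns = samples).
psrn_sketch <- function(x) {
  # Estimate factors only from features observed in every sample
  complete <- stats::complete.cases(x)
  x_complete <- x[complete, , drop = FALSE]

  # Pseudo-reference: geometric mean of each feature across samples
  pseudo_ref <- exp(rowMeans(log(x_complete)))

  # Ratio of each value to its feature's pseudo-reference
  # (the vector is recycled down the rows, so row i is divided by pseudo_ref[i])
  ratios <- x_complete / pseudo_ref

  # One loading estimate per sample: the median ratio in that column
  factors <- apply(ratios, 2, stats::median)

  # Divide every feature, including those with missing values,
  # by the estimate of its sample
  normalized <- sweep(x, 2, factors, "/")

  list(normalized = normalized, factors = factors)
}

# Toy data (assumed for illustration): sample "b" is loaded at twice
# the concentration of sample "a"; the last feature is missing in "a"
x <- cbind(a = c(1, 2, 4, NA), b = c(2, 4, 8, 6))
res <- psrn_sketch(x)
res$factors
res$normalized
```

In this toy case the two samples agree on the complete features after normalization, while the feature with a missing value is still scaled but does not contribute to the estimates.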

Usage

psrn(data, id_col, log = TRUE, load_info = FALSE, target = NULL)

Arguments

data

data.frame

id_col

a character string giving the name of the column in data that contains the feature names (e.g., peptides, proteins, etc.)

log

logical; should the data be log transformed after normalization?

load_info

logical; should the load information be output?

target

target columns to normalize; supports tidyselect-package syntax. If not specified, all numerical columns are used in the normalization.

Value

A data frame with normalized values if load_info = FALSE. If load_info = TRUE, a list with two tibbles: one containing the normalized data, and one containing the loading info as well as the estimated normalization factors.

Source

https://www.nature.com/articles/npre.2010.4282.1

References

Simon Anders, Wolfgang Huber (2010). “Differential expression analysis for sequence count data.” Nature Precedings, 1–1.

Examples

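# Normalize all numerical columns (log transformed by default)
yeast_psrn <- psrn(yeast, "identifier")
# Also return the loading info and the estimated normalization factors
yeast_psrn_with_load <- psrn(yeast, "identifier", load_info = TRUE)
# Normalize only the columns whose names match 'ng50'
yeast_ng50_only <- psrn(yeast, "identifier", target = matches('ng50'))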

[Package baldur version 0.0.3 Index]