R: Row Normalize

rowNormalize {wrMisc}

R Documentation

Row Normalize

Description

This function was designed for normalizing data that are expected to be very similar, such as a collection of technical replicates. Initially, an independent normalization factor is calculated for each row; the median or mean across all factors is then applied to the data. The function has a special mode of operation for data with a high content of NA values (which may pose problems with other normalization approaches): if the NA-content is higher than the threshold set in sparseLim, a special procedure for sparse data is applied (iteratively treating subsets of nCombin columns, which are combined in a later step).
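The idea of per-row factors summarized by their median can be illustrated with a minimal base-R sketch. This is one possible reading of the description above, not the code of rowNormalize; the object names (`base`, `colBias`, `rowFact`, `colFact`) and the choice of the row median as target are assumptions made for illustration only.

```r
# Hypothetical illustration, NOT the implementation of rowNormalize()
base <- c(10, 20, 40, 80)             # assumed "true" level of each row
colBias <- c(1, 2, 0.5)               # assumed multiplicative bias of each replicate column
dat <- outer(base, colBias)           # 4 rows x 3 columns, noise-free toy data

# per row: factor that would bring each value to the median of its row
rowFact <- apply(dat, 1, median) / dat

# summarize the per-row factors of each column by their median
colFact <- apply(rowFact, 2, median)

# apply the summarized factor to each column (proportional normalization)
datNorm <- sweep(dat, 2, colFact, "*")
datNorm
```

With noise-free toy data all three columns become identical after this step; with real replicates the median across rows makes the factor robust against a few deviating rows.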

Usage

rowNormalize(
  dat,
  method = "median",
  refLines = NULL,
  refGrp = NULL,
  proportMode = TRUE,
  minQuant = NULL,
  sparseLim = 0.4,
  nCombin = 3,
  omitNonAlignable = FALSE,
  maxFact = 10,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)

Arguments

`dat`	matrix or data.frame of data to get normalized
`method`	(character) may be "mean" or "median" (plus NULL or "none"); when NULL or "none" is chosen, the input is returned as is
`refLines`	(NULL or numeric) allows considering only specific lines of 'dat' when determining normalization factors (all data will be normalized)
`refGrp`	(integer) Only the columns indicated will be used as reference, default all columns (integer or colnames)
`proportMode`	(logical) decide if normalization should be done by a multiplicative (proportional mode, the default) or an additive factor
`minQuant`	(numeric) optional filter to set all values below given value as `NA`
`sparseLim`	(numeric) minimum content of `NA` values at which the function switches to sparse mode
`nCombin`	(NULL or integer) used only in sparse mode (i.e. if the content of `NA`s is higher than `sparseLim`): number of columns in each of the smaller matrices that are inspected initially; low values are preferable, since small groups of columns have higher chances of sharing common elements
`omitNonAlignable`	(logical) allow omitting all columns which can't get aligned due to sparseness
`maxFact`	(numeric, length=2) max normalization factor
`silent`	(logical) suppress messages
`debug`	(logical) additional messages for debugging
`callFrom`	(character) allows easier tracking of messages produced

Details

Arguments were kept as similar as possible to those of the function normalizeThis. In most cases data get normalized by proportional factors. In the case of log2-data (very common with omics data), normalizing by an additive factor is equivalent to normalizing the corresponding linear-scale data by a proportional factor.
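The relation between additive factors on log2-data and proportional factors on linear data can be checked with a few lines of base R. This is a minimal sketch with made-up numbers; `x` and `fact` are hypothetical and not taken from the package.

```r
# Hypothetical values for illustration only
x <- c(100, 200, 400, 800)    # linear-scale measurements
fact <- 1.5                   # proportional normalization factor

propNorm <- x * fact                    # proportional factor on linear data
addNorm <- log2(x) + log2(fact)         # additive factor on log2-data

# back-transforming the additive result gives the proportional result
all.equal(propNorm, 2^addNorm)
```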

This function has a special mode of operation for sparse data (i.e. data containing a high content of NA values). 0-values by themselves are primarily considered as true measurement outcomes and not as missing. However, when the argument minQuant is used, all values below the given threshold are set to NA, and this may trigger the sparse mode of normalizing.
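How a threshold can turn 0-values into NA and thereby raise the NA-content may be sketched in base R as follows. The toy matrix and the names `minQuantHyp` and `naContent` are hypothetical, and expressing the NA-content as the overall fraction of NA values is an assumption; the function may determine it differently.

```r
# Hypothetical illustration, NOT the implementation of rowNormalize()
mat <- matrix(c(0, 3, 5, 0, 0, 2, 4, 0, 1, 6, 0, 7), nrow=3)
minQuantHyp <- 1                        # assumed threshold, plays the role of minQuant

matNA <- mat
matNA[which(matNA < minQuantHyp)] <- NA # values below the threshold become NA

naContent <- mean(is.na(matNA))         # assumed definition: overall fraction of NAs
naContent > 0.4                         # compare with the default of sparseLim
```

Before filtering, the matrix contains no NA at all; after filtering, 5 of its 12 values are NA, which exceeds 0.4.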

Note: Using a small value of nCombin gives the highest chances of finding a sufficient number of complete combinations of columns with sparse data. However, this also (very much) increases the computational effort and the time required to produce an output.

When using the default proportional mode, a division by 0 could occur if the initial normalization factor turns out to be 0. In this case a small value (by default the maximum value of dat / 10) will be added to all data before normalizing. If this also creates 0-values in the data, this factor will be multiplied by 0.03.

Value

This function returns a matrix of normalized data

Examples

## sparse matrix normalization
set.seed(2)
AA <- matrix(rbinom(110, 10, 0.05), nrow=10)
AA[,4:5] <- AA[,4:5] * rep(4:3, each=nrow(AA))
AA[2,c(2,6,7)] <- 1
AA[3,8] <- 1

(AA1 <- rowNormalize(AA))
(AA2 <- rowNormalize(AA, minQuant=1))   # set all values below 1 (here: all 0) to NA
(AA3 <- rowNormalize(AA, refLines=1:6, omitNonAlignable=FALSE, minQuant=1))

[Package wrMisc version 1.15.1 Index]