impute.pa {imp4p}    R Documentation

Imputation of peptides having no value in a biological condition (present in a condition / absent in another).

Description

This function replaces missing values with small values taken from the low end of the observed values.

Usage

impute.pa(tab, conditions, q.min = 0.025, q.norm = 3, eps = 0,
distribution = "unif", param1 = 3, param2 = 1, R.q.min=1)

Arguments

tab

A data matrix containing numeric and missing values. Each column of this matrix is assumed to correspond to an experimental sample, and each row to an identified peptide.

conditions

A factor indicating the biological condition to which each column (experimental sample) belongs.

q.min

A quantile level of the observed values used to define the maximal value that can be generated. This maximal value is the q.min quantile of the distribution of observed values, minus eps. Default is 0.025 (the maximal value is the 2.5th percentile of the observed values, minus eps).

q.norm

A quantile of a normal distribution used to define the minimal value that can be generated. Default is 3 (the minimal value is the maximal value minus q.norm*median(sd(observed values)), where sd is the standard deviation of a row in a condition).

eps

An offset used to define the maximal value that can be generated. This maximal value is the q.min quantile of the distribution of observed values, minus eps. Default is 0.

distribution

Distribution used to generate the imputed values. The choices are "unif" for the uniform distribution, "beta" for the Beta distribution, or "dirac" for the Dirac distribution. Default is "unif".

param1

Parameter shape1 of the Beta distribution.

param2

Parameter shape2 of the Beta distribution.

R.q.min

Parameter used for the Dirac distribution. In this case, all the missing values in column j are imputed by a single value equal to R.q.min*quantile(tab[,j], probs=q.min, na.rm=TRUE). Default is 1: the imputed value is the q.min quantile of the observed values.
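The Dirac case described above can be sketched in base R for a single column. This is only an illustration of the formula given for R.q.min, not the package's own code; the vector x and the names value and x.imp are hypothetical.

```r
# Hypothetical column of values with two missing entries.
x <- c(21.3, NA, 19.8, 22.1, NA, 20.4)

q.min   <- 0.025  # quantile level, same default as impute.pa
R.q.min <- 1      # multiplier applied to the quantile

# Single imputation value: R.q.min times the q.min quantile of observed values.
value <- R.q.min * unname(quantile(x, probs = q.min, na.rm = TRUE))

# Every missing entry receives the same value.
x.imp <- x
x.imp[is.na(x)] <- value
```

With R.q.min = 1 the imputed value sits just above the smallest observed value; a multiplier below 1 would push it lower.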

Details

This function replaces the missing values in a column with draws from the specified distribution. The value of eps can be interpreted as a minimal fold-change value above which the present/absent peptides appear.
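As a rough illustration, the bounds described for q.min, q.norm and eps can be computed by hand in base R for one column. This is a minimal sketch written from the argument descriptions, not the imp4p source. The toy matrix and the names upper, lower, row.sd and tab.imp are assumptions made for the example, and the rescaled rbeta call is only one plausible way to draw Beta values on the same interval.

```r
set.seed(1)

# Hypothetical toy data: 6 peptides x 4 samples, two conditions of 2 samples.
tab <- matrix(rnorm(24, mean = 20, sd = 1), nrow = 6, ncol = 4)
tab[c(2, 5), 1] <- NA                      # missing values in the first sample
conditions <- factor(c("A", "A", "B", "B"))

q.min <- 0.025; q.norm <- 3; eps <- 0      # same defaults as impute.pa
j <- 1                                     # column to impute
cols <- which(conditions == conditions[j]) # samples of the same condition

# Maximal value: q.min quantile of the observed values in column j, minus eps.
upper <- unname(quantile(tab[, j], probs = q.min, na.rm = TRUE)) - eps

# Minimal value: maximal value minus q.norm times the median of the
# row-wise standard deviations within the condition.
row.sd <- apply(tab[, cols, drop = FALSE], 1, sd, na.rm = TRUE)
lower <- upper - q.norm * median(row.sd, na.rm = TRUE)

# "unif": uniform draws between the two bounds replace the missing entries.
miss <- is.na(tab[, j])
tab.imp <- tab
tab.imp[miss, j] <- runif(sum(miss), min = lower, max = upper)

# "beta" (assumed form): Beta(param1, param2) draws rescaled to [lower, upper].
beta.draws <- lower + (upper - lower) * rbeta(sum(miss), shape1 = 3, shape2 = 1)
```

Increasing eps shifts both bounds down by the same amount, which is how it acts as a gap between imputed and observed values.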

Value

A list composed of:

- tab.imp : the input matrix tab with imputed values instead of missing values.

- para : the parameters of the distribution which has been used to impute.

Author(s)

Quentin Giai Gianetto <quentin2g@yahoo.fr>

Examples


#Simulating data
res.sim=sim.data(nb.pept=2000,nb.miss=600);

#Imputation of the simulated data set with small values
data.small.val=impute.pa(res.sim$dat.obs,res.sim$conditions);


[Package imp4p version 1.2 Index]