presenceFilt {wrMisc} | R Documentation |
Filter lines of matrix for max number of NAs
Description
This function produces a logical matrix to be used as a filter for lines of 'dat', based on sufficient presence of non-NA
values (i.e. it limits the number of NAs per line).
Abundance/expression data are filtered for a minimum number and/or ratio of non-NA
values in at least 1 of multiple groups.
This type of procedure is common in proteomics and transcriptomics, where an NA
can often be associated with quantitation below the detection limit.
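The idea can be sketched in base R as follows. This is a simplified illustration and not the package's implementation: the helper name countNaPerGrp and the toy data are hypothetical, and only a maximum number of NAs per group is checked.

```r
# Hypothetical helper (not part of wrMisc): count NAs per line within each group of columns
countNaPerGrp <- function(dat, grp) {
  grp <- as.factor(grp)
  # returns one column of NA counts per group level
  sapply(levels(grp), function(lev) rowSums(is.na(dat[, grp == lev, drop = FALSE])))
}

# toy data: 3 lines, 2 groups (A, B) of 3 columns each; matrix() fills column by column
toy <- matrix(c( 5, NA, NA,    # column A1
                 6,  7, NA,    # column A2
                NA, NA, NA,    # column A3
                 4, NA,  2,    # column B1
                 3,  8, NA,    # column B2
                 9, NA,  1),   # column B3
              nrow = 3,
              dimnames = list(paste0("line", 1:3),
                              paste0(rep(c("A", "B"), each = 3), 1:3)))

nNa <- countNaPerGrp(toy, rep(c("A", "B"), each = 3))
nNa

# keep a line if at least 1 group has no more than 1 NA (assumed threshold)
keep <- rowSums(nNa <= 1) >= 1
keep
```

In this sketch a line is only rejected when every group exceeds the allowed number of NAs, which mirrors the "at least 1 group" wording used above.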
Usage
presenceFilt(
dat,
grp,
maxGrpMiss = 1,
ratMaxNA = 0.8,
minVal = NULL,
silent = FALSE,
debug = FALSE,
callFrom = NULL
)
Arguments
dat |
matrix or data.frame (abundance or expression-values which may contain some NAs) |
grp |
factor of min 2 levels describing which column of 'dat' belongs to which group (levels 1 & 2 will be used) |
maxGrpMiss |
(numeric) at least 1 group has not more than this number of NAs (otherwise mark line as bad) |
ratMaxNA |
(numeric) at least 1 group reaches this content of non-NA values |
minVal |
(default NULL or numeric), any value below will be treated like NA |
silent |
(logical) suppress messages |
debug |
(logical) additional messages for debugging |
callFrom |
(character) allow easier tracking of messages produced |
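As a rough illustration of how 'minVal' and 'ratMaxNA' can be read, the sketch below sets values under a threshold to NA and then compares the fraction of non-NA values in one group with a ratio. This reading is an assumption drawn from the argument descriptions, not from the package source, and the values are toy data.

```r
x <- c(12, 3, NA, 15, 9)       # one line of one group (toy values)
minVal <- 5                    # assumed threshold for this sketch
x[which(x < minVal)] <- NA     # values below minVal are treated like NA
ratNonNA <- mean(!is.na(x))    # fraction of non-NA values in this group
okRatio <- ratNonNA >= 0.8     # compare with the default of ratMaxNA
okRatio
```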
Value
logical matrix (with a separate column for each pairwise combination of 'grp' levels) indicating if a line of 'dat' is acceptable based on NAs (and on values below 'minVal')
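A logical matrix of this shape can be used to subset the lines of the data. The sketch below uses a hand-made filter matrix instead of real output; the column names "pair1" and "pair2" are placeholders, since the naming of the returned columns is not stated here.

```r
# toy data with 4 lines
dat <- matrix(1:12, nrow = 4, dimnames = list(paste0("line", 1:4), NULL))

# hypothetical filter: one row per line of 'dat', one column per pairwise comparison
filt <- matrix(c(TRUE, FALSE, TRUE, FALSE,    # placeholder column "pair1"
                 TRUE, TRUE, FALSE, FALSE),   # placeholder column "pair2"
               nrow = 4,
               dimnames = list(rownames(dat), c("pair1", "pair2")))

# lines acceptable for the first comparison only
datPair1 <- dat[filt[, "pair1"], , drop = FALSE]
# lines acceptable for at least one comparison
datAny <- dat[rowSums(filt) > 0, , drop = FALSE]
```

Using drop = FALSE keeps the result a matrix even when only a single line passes the filter.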
See Also
presenceGrpFilt; there are also other packages totally dedicated to filtering on CRAN and Bioconductor
Examples
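# 10 x 15 matrix with columns D1-D5, C1-C6 and B1-B4
mat <- matrix(rep(8,150), ncol=15, dimnames=list(NULL,
  paste0(rep(LETTERS[4:2],each=6),1:6)[c(1:5,7:16)]))
mat[lower.tri(mat)] <- NA
mat[,15] <- NA
mat[c(2:3,9),14:15] <- NA
mat[c(1,10),13:15] <- NA
mat
# three groups (D, C, B) matching the column names
presenceFilt(mat, rep(LETTERS[4:2], c(5,6,4)))
# two groups: the first 9 and the last 6 columns
presenceFilt(mat, rep(1:2, c(9,6)))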
# one more example
dat1 <- matrix(1:56, ncol=7)
dat1[c(2,3,4,5,6,10,12,18,19,20,22,23,26,27,28,30,31,34,38,39,50,54)] <- NA
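dat1
# argument names written out in full instead of relying on partial matching
presenceFilt(dat1, grp=gl(3,3)[-(3:4)], maxGrpMiss=0)
presenceFilt(dat1, grp=gl(2,4)[-1], maxGrpMiss=1, ratMaxNA=0.1)
presenceFilt(dat1, grp=gl(2,4)[-1], maxGrpMiss=2, ratMaxNA=0.5)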