R: Combine Multiple Proteomics Data-Sets

fuseProteomicsProjects {wrProteo}

R Documentation

Combine Multiple Proteomics Data-Sets

Description

This function allows combining up to 3 separate data-sets previously imported using wrProteo.

Usage

fuseProteomicsProjects(
  x,
  y,
  z = NULL,
  columnNa = "Accession",
  NA.rm = TRUE,
  listNa = c(quant = "quant", annot = "annot"),
  all = FALSE,
  textModif = NULL,
  shortNa = NULL,
  retProtLst = FALSE,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)

Arguments

`x`	(list) First Proteomics data-set
`y`	(list) Second Proteomics data-set
`z`	(list) optional third Proteomics data-set
`columnNa`	(character) column names from annotation
`NA.rm`	(logical) remove `NA`s
`listNa`	(character) names of key list-elements from `x` to be treated; the first one is used as pattern for the format of quantitation data, the last one for the annotation data
`all`	(logical) whether the union or the intersect should be used when merging `x`, `y` and `z`
`textModif`	(character) additional modifications to the identifiers from argument `columnNa`; so far integrated: `rmPrecAA` for removing preceding capital letters (amino-acids, eg [KR].AGVIFPVGR.[ML] => AGVIFPVGR) or `rmTerminalDigit` for removing terminal digits (charge-states)
`shortNa`	(character) for appending to output-colnames
`retProtLst`	(logical) return list-object similar to input, otherwise a matrix of fused/aligned quantitation data
`silent`	(logical) suppress messages
`debug`	(logical) additional messages for debugging
`callFrom`	(character) allow easier tracking of messages produced
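
The identifier clean-up described for `textModif` can be illustrated with base R regular expressions. This is only a sketch of the idea, not the code used inside wrProteo: the helper names `stripFlanking` and `stripTerminalDigit` and the example identifiers are hypothetical, and the exact patterns applied by `rmPrecAA` and `rmTerminalDigit` may differ.

```r
# Hypothetical identifiers as they might appear in a peptide annotation column
ids <- c("[KR].AGVIFPVGR.[ML]", "K.LSDEEILR.A", "AGVIFPVGR2")

# Sketch of an 'rmPrecAA'-like step (assumed pattern): drop a flanking residue,
# written as one capital letter or a bracketed group, plus its separating dot
stripFlanking <- function(x) {
  x <- sub("^(\\[[A-Z]+\\]|[A-Z])\\.", "", x)   # leading flank, eg "[KR]."
  sub("\\.(\\[[A-Z]+\\]|[A-Z])$", "", x)        # trailing flank, eg ".[ML]"
}

# Sketch of an 'rmTerminalDigit'-like step (assumed pattern): drop trailing digits
stripTerminalDigit <- function(x) sub("[0-9]+$", "", x)

stripFlanking(ids)        # "AGVIFPVGR" "LSDEEILR" "AGVIFPVGR2"
stripTerminalDigit(ids)   # only the third entry changes, to "AGVIFPVGR"
```

Applying such a step makes identifiers that previously differed become identical, so they will be summarized and matched as one entry.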

Details

Some quantification software may report some identifiers multiple times, ie as multiple lines (eg for different modifications or charge states, etc). In this case this function first tries to summarize all lines with identical identifiers (using the function combineRedundLinesInList, which by default uses the median value). Thus, it is very important to know your data and to understand when lines that appear with the same identifiers should/may be fused/summarized without damaging the later biological interpretation! The user may specify for each data-set the column of the protein/peptide annotation to use via the argument columnNa. This content will then be matched exactly (identical strings), so special care should be taken when combining data from different software!
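
To make the summarization step concrete, here is a minimal base R sketch that collapses lines sharing an identifier to their per-column median and then shows how exact matching behaves. It does not call combineRedundLinesInList; the helper `summarizeById` and the toy data are hypothetical and only mimic the behaviour described above.

```r
# Hypothetical quantitation matrix in which identifier "P1" occurs twice
quant <- matrix(c(10, 12, 20, 11, 15, 21), nrow = 3,
                dimnames = list(c("P1", "P1", "P2"), c("s1", "s2")))

# Collapse rows with identical identifiers to the per-column median
summarizeById <- function(mat, ids = rownames(mat)) {
  grp <- split(seq_len(nrow(mat)), ids)          # row indexes per identifier
  do.call(rbind, lapply(grp, function(i)
    apply(mat[i, , drop = FALSE], 2, median, na.rm = TRUE)))
}

quantUni <- summarizeById(quant)
quantUni["P1", ]    # s1 = 11, s2 = 13

# Matching is exact: "p2" from another (hypothetical) data-set does not match "P2"
intersect(rownames(quantUni), c("P1", "p2"))     # "P1"
```

The second part shows why identifiers written differently by different software (case, suffixes, flanking residues) silently drop out of the fused result.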

Please note that (at this point) the data from different series/objects will be joined as they are, ie without any additional normalization. It is up to the user to inspect the resulting data and to decide whether, and which type of, normalization may be suitable!
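
One simple way of inspecting fused data is to compare per-column medians across the original series. The sketch below uses a hypothetical fused matrix (assumed to be on log scale) and shows median-centering as one possible follow-up; whether this or any other normalization is appropriate depends on the data and is not prescribed by this function.

```r
# Hypothetical fused matrix: two columns from series A, one from series B
fused <- cbind(A.s1 = c(20, 22, 24), A.s2 = c(21, 23, 25), B.s1 = c(10, 12, 14))

# Inspect: column medians reveal a shift between the series
colMed <- apply(fused, 2, median, na.rm = TRUE)
colMed              # A.s1 = 22, A.s2 = 23, B.s1 = 12

# One possible (assumed) remedy: shift each column to the common mean of medians
centered <- sweep(fused, 2, colMed - mean(colMed), "-")
apply(centered, 2, median, na.rm = TRUE)   # all columns now have median 19
```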

Please do NOT try combining protein and peptide quantification data.

Value

This function returns a list with the same number of list-elements as $x, ie typically containing: $raw (initial/raw abundance values), $quant (final normalized quantitations), $annot, optionally $counts (an array with the number of peptides), and $quantNotes or $notes.

Examples

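# Locate the example files shipped with wrProteo
path1 <- system.file("extdata", package="wrProteo")
# Import MaxQuant results (median-normalized)
dataMQ <- readMaxQuantFile(path1, specPref=NULL, normalizeMeth="median")
# Import MassChroQ results
MCproFi1 <- "tinyMC.RData"
dataMC <- readMassChroQFile(path1, file=MCproFi1, plotGraph=FALSE)
# Fuse both data-sets and compare dimensions before and after
dataFused <- fuseProteomicsProjects(dataMQ, dataMC)
dim(dataMQ$quant)
dim(dataMC$quant)
dim(dataFused$quant)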

[Package wrProteo version 1.12.0 Index]