readDiaNNFile {wrProteo} | R Documentation |
Read Tabulated Files Exported by DIA-NN At Protein Level
Description
This function allows importing protein identification and quantification results from DIA-NN.
Data should be exported from DIA-NN as tab-separated text (tsv) at the protein-group (pg) level so that this function can import them.
Quantification data and other relevant information will be parsed and extracted (similar to the other import-functions from this package).
The final output is a list containing as (main) elements: $annot, $raw and $quant, or a data.frame with the quantification data and a part of the annotation if argument separateAnnot=FALSE.
Usage
readDiaNNFile(
fileName,
path = NULL,
normalizeMeth = "median",
sampleNames = NULL,
read0asNA = TRUE,
quantCol = "\\.raw$",
annotCol = NULL,
refLi = NULL,
separateAnnot = TRUE,
FDRCol = NULL,
groupPref = list(lowNumberOfGroups = TRUE),
plotGraph = TRUE,
titGraph = "DiaNN",
wex = 1.6,
specPref = c(conta = "CON_|LYSC_CHICK", mainSpecies = "OS=Homo sapiens"),
gr = NULL,
sdrf = NULL,
suplAnnotFile = FALSE,
silent = FALSE,
debug = FALSE,
callFrom = NULL
)
Arguments
fileName |
(character) name of file to be read |
path |
(character) path of file to be read |
normalizeMeth |
(character) normalization method, defaults to "median" |
sampleNames |
(character) custom column-names for quantification data; this argument has priority over |
read0asNA |
(logical) decide if initial quantifications at 0 should be transformed to NA (thus avoiding -Inf in log2 results) |
quantCol |
(character or integer) exact col-names, or if length=1 content of |
annotCol |
(character) column names to be read/extracted for the annotation section (default c("Accession","Description","Gene","Contaminant","Sum.PEP.Score","Coverage....","X..Peptides","X..PSMs","X..Unique.Peptides", "X..AAs","MW..kDa.") ) |
refLi |
(character or integer) custom specification of which lines of data belong to the main species; if character (eg 'mainSpe'), the column 'SpecType' in $annot will be searched for an exact match of the (single) term given |
separateAnnot |
(logical) if |
FDRCol |
- not used (the argument was kept to retain the same syntax as the other import functions of this package) |
groupPref |
(list) additional parameters for interpreting meta-data to identify structure of groups (replicates), will be passed to |
plotGraph |
(logical or integer) optional plot of type vioplot of initial and normalized data (using |
titGraph |
(character) custom title to plot of distribution of quantitation values |
wex |
(numeric) relative expansion factor of the violin-plot (will be passed to |
specPref |
(character or list) define characteristic text for recognizing (main) groups of species (1st for contaminants - will be marked as 'conta', 2nd for main species - marked as 'mainSpe',
and optional following ones for supplemental tags/species - marked as 'species2','species3',...);
if list and list-element has multiple values they will be used for exact matching of accessions (ie 2nd of argument |
gr |
(character or factor) custom defined pattern of replicate association, will override final grouping of replicates from |
sdrf |
(character, list or data.frame) optional extraction and adding of experimental meta-data: if character, this may be the ID at ProteomeXchange,
the second element may give further indications for automatic organization of groups of replicates.
Besides, the output from |
suplAnnotFile |
(logical or character) optional reading of supplemental files; however, if |
silent |
(logical) suppress messages |
debug |
(logical) additional messages for debugging |
callFrom |
(character) allow easier tracking of messages produced |
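The default quantCol = "\\.raw$" shown in the Usage section is a regular expression that matches column names ending in '.raw'. The following base-R sketch illustrates how such a pattern can select quantification columns; the column names are hypothetical and the code is an illustration only, not necessarily how readDiaNNFile proceeds internally.

```r
# Hypothetical column names, loosely modelled on a DIA-NN protein-group export
colNa <- c("Protein.Group", "Protein.Ids", "Genes",
           "sampleA_1.raw", "sampleA_2.raw", "sampleB_1.raw")

quantCol <- "\\.raw$"                 # default value from the Usage section
quantInd <- grep(quantCol, colNa)     # indices of columns whose name ends in '.raw'
quantInd

# Remove the extension to obtain shorter sample names (illustration only)
sampNa <- sub("\\.raw$", "", colNa[quantInd])
sampNa
```

If the raw files carry a different extension, a correspondingly adapted pattern would be needed for this kind of selection.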
Details
This function has been developed using DIA-NN version 1.8.x. Note that reading gene-group (gg) files is in principle possible, but the resulting data typically lack protein identifiers, which may be less convenient in later steps of analysis. It is therefore suggested to read protein-group (pg) files instead.
Using the argument suplAnnotFile it is possible to specify a specific file (or to search for a default file) to read for extracting file-names as sample-names and other experiment-related information.
Value
This function returns a list with $raw (initial/raw abundance values), $quant with final normalized quantitations, $annot, $counts an array with number of peptides, $quantNotes and $notes; or, if separateAnnot=FALSE, the function returns a data.frame with annotation and quantitation only.
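The relation between $raw and $quant can be sketched in base R on a small made-up matrix. This is a minimal illustration under assumptions: zeros are set to NA (as with read0asNA=TRUE), values are log2-transformed, and "median" normalization is taken to mean aligning the column medians to their common mean. The actual normalization is delegated to normalizeThis and may differ in detail.

```r
# Made-up abundance values for 4 proteins in 2 samples (not real data)
raw <- matrix(c(1024, 0, 256, 64,  2048, 128, 0, 512), nrow = 4,
              dimnames = list(paste0("prot", 1:4), c("s1", "s2")))

raw[which(raw == 0)] <- NA                         # treat 0 as missing, avoids -Inf after log2
logDat <- log2(raw)

# Assumed median normalization: shift each column so that all column medians coincide
colMed <- apply(logDat, 2, median, na.rm = TRUE)
quant <- sweep(logDat, 2, colMed - mean(colMed), "-")

apply(quant, 2, median, na.rm = TRUE)              # column medians are now equal
```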
See Also
read.table, normalizeThis, readMaxQuantFile, readProtDiscovFile, readProlineFile
Examples
diaNNFi1 <- "tinyDiaNN1.tsv.gz"
## This file contains far fewer identifications than one would usually obtain
path1 <- system.file("extdata", package="wrProteo")
## let's define the main species and allow tagging some contaminants
specPref1 <- c(conta="conta|CON_|LYSC_CHICK", mainSpecies="HUMAN")
dataNN <- readDiaNNFile(fileName=diaNNFi1, path=path1, specPref=specPref1, titGraph="Tiny DIA-NN Data")
summary(dataNN$quant)
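The effect of a specPref setting like the one above can be approximated in base R. The sketch below assumes that each element of specPref is used as a search pattern against protein identifiers; the identifiers are hypothetical and the tagging logic is an illustration, not the package's actual implementation.

```r
specPref1 <- c(conta = "conta|CON_|LYSC_CHICK", mainSpecies = "HUMAN")

# Hypothetical protein identifiers (not taken from the example file)
ids <- c("ALBU_HUMAN", "CON_TRYP_PIG", "LYSC_CHICK", "UBIQ_YEAST")

specType <- rep(NA_character_, length(ids))
specType[grepl(specPref1["mainSpecies"], ids)] <- "mainSpe"
specType[grepl(specPref1["conta"], ids)] <- "conta"    # contaminant tag takes precedence in this sketch

table(specType, useNA = "ifany")
```

Identifiers matching neither pattern remain NA here; further elements of specPref would be needed to tag additional species.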