transcriptome_filter {net4pg} | R Documentation |
Perform transcriptome-informed post-hoc filtering
Description
Implement the transcriptome-informed post-hoc filtering strategy, which aims to reduce the ambiguity of protein identifications by exploiting sample-matched transcriptome information, when available. The function takes as input the set of transcripts expressed in the sample-matched transcriptome (reported as Ensembl transcript identifiers, i.e., ENSTXXXXXXXXXXX for human) and removes from the proteomic identifications one of the following:
i. all proteins with no expressed transcript, together with the peptides mapping exclusively on removed proteins ("all"); or
ii. only those proteins with no expressed transcript that are identified exclusively by shared peptides, together with the peptides mapping exclusively on removed proteins ("sharedOnly"); or
iii. only those proteins with no expressed transcript that are identified exclusively by shared peptides and whose peptides are all shared with at least one protein with an expressed transcript, so that no peptide has to be removed ("sharedNoRemove").
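The three removal modes can be illustrated on a toy example. The following base R sketch is only one reading of the description above, not the package's implementation: the helper filter_sketch, the toy matrix and the logical vector expressed (one entry per protein, TRUE when the protein has at least one expressed transcript) are all hypothetical, and contaminant handling (tagContam) is ignored.

```r
# Hypothetical helper: NOT part of net4pg, just a sketch of the three modes.
filter_sketch <- function(incM, expressed,
                          remove = c("all", "sharedOnly", "sharedNoRemove")) {
  remove <- match.arg(remove)
  stopifnot(is.logical(incM), length(expressed) == ncol(incM))
  # A peptide is "shared" when it maps on two or more proteins.
  shared_pep <- rowSums(incM) > 1
  # Proteins with no specific (non-shared) peptide.
  shared_only <- colSums(incM & !shared_pep) == 0
  # Peptides mapping on at least one protein with an expressed transcript.
  pep_has_expressed <- rowSums(incM[, expressed, drop = FALSE]) > 0
  # Proteins whose peptides all map on at least one expressed protein.
  all_covered <- colSums(incM & !pep_has_expressed) == 0
  drop_prot <- switch(remove,
    all            = !expressed,
    sharedOnly     = !expressed & shared_only,
    sharedNoRemove = !expressed & shared_only & all_covered
  )
  kept <- incM[, !drop_prot, drop = FALSE]
  # Drop peptides that no longer map on any remaining protein.
  kept[rowSums(kept) > 0, , drop = FALSE]
}

# Toy incidence matrix: rows are peptides, columns are proteins.
incM_toy <- matrix(
  c(TRUE,  FALSE, FALSE, FALSE, FALSE,   # pep1: specific to protA
    TRUE,  TRUE,  FALSE, FALSE, FALSE,   # pep2: shared by protA and protB
    FALSE, FALSE, TRUE,  FALSE, FALSE,   # pep3: specific to protC
    FALSE, FALSE, FALSE, TRUE,  TRUE),   # pep4: shared by protD and protE
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("pep", 1:4),
                  c("protA", "protB", "protC", "protD", "protE"))
)
# Only protA has an expressed transcript in this toy example.
expressed <- c(TRUE, FALSE, FALSE, FALSE, FALSE)

res_all      <- filter_sketch(incM_toy, expressed, "all")
res_shared   <- filter_sketch(incM_toy, expressed, "sharedOnly")
res_noremove <- filter_sketch(incM_toy, expressed, "sharedNoRemove")
```

In this toy case, "all" keeps only protA (and drops pep3 and pep4), "sharedOnly" keeps protA and protC (and drops pep4), and "sharedNoRemove" removes only protB, so every peptide is retained.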
Usage
transcriptome_filter(
incM,
exprTranscriptsFile,
proteinToTranscriptFile,
tagContam,
remove
)
Arguments
incM |
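a matrix containing the incidence matrix of peptide-to-protein mappings, with peptide identifiers as row names and protein identifiers as column names (e.g., as returned by read_inc_matrix) |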
exprTranscriptsFile |
the name of the file containing the set of transcripts expressed in the sample-matched transcriptome (one per line). Transcript identifiers must be in the Ensembl format (i.e., ENSTXXXXXXXXXXX for human) |
proteinToTranscriptFile |
the name of a tab-delimited file with protein identifiers in the first column and the corresponding transcript identifiers in the second column. Protein and transcript identifiers must be in the Ensembl format (i.e. ENSPXXXXXXXXXXX and ENSTXXXXXXXXXXX for human) |
tagContam |
a character string giving the tag that identifies contaminant proteins (e.g., "Contam") |
remove |
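a character string selecting which proteins to remove: "all", "sharedOnly" or "sharedNoRemove" (see Description) |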
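As an illustration of the two input file layouts described above, the following sketch writes small temporary files and reads them back with base R. The identifiers are made-up placeholders, and the absence of a header row in the mapping file is an assumption, not something this page states.

```r
# Expressed transcripts: one Ensembl transcript identifier per line
# (placeholder identifiers, for illustration only).
expr_file <- tempfile(fileext = ".txt")
writeLines(c("ENST00000000001", "ENST00000000003"), expr_file)

# Protein-to-transcript map: tab-delimited, protein identifier in the first
# column and transcript identifier in the second (assumed to have no header).
map_file <- tempfile(fileext = ".tsv")
writeLines(c("ENSP00000000001\tENST00000000001",
             "ENSP00000000002\tENST00000000002",
             "ENSP00000000003\tENST00000000003"), map_file)

expressed_tx <- readLines(expr_file)
prot2tx <- read.delim(map_file, header = FALSE,
                      col.names = c("protein", "transcript"),
                      stringsAsFactors = FALSE)

# Proteins with at least one expressed transcript.
expressed_proteins <- unique(
  prot2tx$protein[prot2tx$transcript %in% expressed_tx]
)
```

Proteins absent from expressed_proteins are the ones that the filtering modes would consider for removal.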
Value
a matrix representing the filtered incidence matrix of peptide-to-protein mappings obtained by transcriptome-informed filtering.
Author(s)
Laura Fancello
Examples
# Read the tab-delimited file containing the proteome incidence matrix
incM_filename <- system.file("extdata"
, "incM_example"
, package = "net4pg"
, mustWork = TRUE)
rownames_filename <- system.file("extdata"
, "peptideIDs_incM_example"
, package = "net4pg"
, mustWork = TRUE)
colnames_filename <- system.file("extdata"
, "proteinIDs_incM_example"
, package = "net4pg"
, mustWork = TRUE)
incM <- read_inc_matrix(incM_filename = incM_filename
, colnames_filename = colnames_filename
, rownames_filename = rownames_filename)
# Perform transcriptome-informed post-hoc filtering
exprTranscriptsFile <- system.file("extdata"
, "expressed_transcripts.txt"
, package = "net4pg"
, mustWork = TRUE)
protein2transcriptFile <- system.file("extdata"
, "protein_to_transcript"
, package = "net4pg"
, mustWork = TRUE)
incM_filtered <- transcriptome_filter(incM
, exprTranscriptsFile = exprTranscriptsFile
, proteinToTranscriptFile = protein2transcriptFile
, tagContam = "Contam"
, remove = "all")