transcriptome_filter {net4pg}    R Documentation

Perform transcriptome-informed post-hoc filtering

Description

Implements the transcriptome-informed post-hoc filtering strategy, which aims to reduce the ambiguity of protein identifications by exploiting sample-matched transcriptome information, when available. The function takes as input the set of transcripts expressed in the sample-matched transcriptome (reported as Ensembl transcript identifiers, i.e., ENSTXXXXXXXXXXX for human) and removes from the proteomic identifications one of the following, depending on the remove argument:

i. all proteins with no expressed transcript, together with the peptides mapping exclusively on removed proteins ("all"); or

ii. only those proteins with no expressed transcript that are identified exclusively by shared peptides, together with the peptides mapping exclusively on removed proteins ("sharedOnly"); or

iii. only those proteins with no expressed transcript that are identified exclusively by shared peptides and whose peptides are shared with at least one protein with an expressed transcript, so that no peptide needs to be removed ("sharedNoRemove").
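The "all" strategy can be illustrated on a toy example in base R. This is only a sketch of the idea under stated assumptions, not the package's implementation: the identifiers, the prot2tr table and the expressed vector are hypothetical.

```r
# Toy incidence matrix: rows are peptides, columns are proteins (hypothetical IDs)
incM <- matrix(
  c(TRUE,  FALSE, FALSE,
    TRUE,  TRUE,  FALSE,
    FALSE, TRUE,  FALSE,
    FALSE, FALSE, TRUE),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("pep", 1:4), c("ENSP1", "ENSP2", "ENSP3"))
)

# Hypothetical protein-to-transcript map and set of expressed transcripts
prot2tr <- data.frame(protein = c("ENSP1", "ENSP2", "ENSP3"),
                      transcript = c("ENST1", "ENST2", "ENST3"),
                      stringsAsFactors = FALSE)
expressed <- c("ENST1", "ENST2")

# Proteins with at least one expressed transcript
expressed_proteins <- unique(prot2tr$protein[prot2tr$transcript %in% expressed])
keep_prot <- colnames(incM) %in% expressed_proteins

# "all": drop every protein without an expressed transcript,
# then drop peptides that no longer map on any remaining protein
filtered <- incM[, keep_prot, drop = FALSE]
filtered <- filtered[rowSums(filtered) > 0, , drop = FALSE]

dim(filtered)
```

Here ENSP3 has no expressed transcript, so it is dropped along with pep4, which maps only on ENSP3.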

Usage

transcriptome_filter(
  incM,
  exprTranscriptsFile,
  proteinToTranscriptFile,
  tagContam,
  remove
)

Arguments

incM

a logical matrix containing the incidence matrix, with protein identifiers as column names and peptide identifiers as row names, and 0 or 1 values indicating whether the peptide maps on the corresponding protein.

exprTranscriptsFile

the name of the file containing the set of transcripts expressed in the sample-matched transcriptome (one per line). Transcript identifiers must be in the Ensembl format (i.e., ENSTXXXXXXXXXXX for human).

proteinToTranscriptFile

the name of a tab-delimited file with protein identifiers in the first column and the corresponding transcript identifiers in the second column. Protein and transcript identifiers must be in the Ensembl format (i.e., ENSPXXXXXXXXXXX and ENSTXXXXXXXXXXX for human).
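As an illustration of the two input file layouts described above, the following sketch writes small files to a temporary location and reads them back with base R. The identifiers are made up, and the absence of a header line in the mapping file is an assumption, since the documentation does not state it.

```r
# Expressed transcripts: one Ensembl transcript identifier per line (made-up IDs)
expr_file <- tempfile(fileext = ".txt")
writeLines(c("ENST00000000001", "ENST00000000002"), expr_file)

# Protein-to-transcript map: tab-delimited, protein in column 1, transcript in column 2
# (assumption: no header line)
map_file <- tempfile(fileext = ".tsv")
writeLines(c("ENSP00000000001\tENST00000000001",
             "ENSP00000000002\tENST00000000003"), map_file)

expressed <- readLines(expr_file)
prot2tr <- read.table(map_file, sep = "\t", header = FALSE,
                      col.names = c("protein", "transcript"),
                      stringsAsFactors = FALSE)

# Basic format checks before any filtering
stopifnot(all(grepl("^ENST[0-9]+$", expressed)),
          all(grepl("^ENSP[0-9]+$", prot2tr$protein)),
          all(grepl("^ENST[0-9]+$", prot2tr$transcript)))

# Proteins with at least one expressed transcript
expressed_proteins <- unique(prot2tr$protein[prot2tr$transcript %in% expressed])
```

Checking the identifier format up front makes mismatches (for example, versioned identifiers or gene identifiers) visible before they silently produce an empty set of expressed proteins.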

tagContam

a character vector reporting the tag which identifies contaminant proteins

remove

a character vector indicating which proteins to remove: i. all proteins with no expressed transcript, together with the peptides mapping exclusively on removed proteins ("all"); ii. only those proteins with no expressed transcript that are identified exclusively by shared peptides, together with the peptides mapping exclusively on removed proteins ("sharedOnly"); iii. only those proteins with no expressed transcript that are identified exclusively by shared peptides and whose peptides are shared with at least one protein with an expressed transcript, so that no peptide needs to be removed ("sharedNoRemove").
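The difference between the three values of remove can be made concrete on a toy matrix. The sketch below only lists which proteins each option would target, under one reading of the definitions above (a peptide is "shared" if it maps on more than one protein, and for "sharedNoRemove" every peptide of the protein must also map on a protein with an expressed transcript). It is not the package's code, and the names P1 to P4 and expressed_prot are hypothetical.

```r
# Toy incidence matrix (hypothetical peptide and protein names)
incM <- matrix(FALSE, nrow = 4, ncol = 4,
               dimnames = list(c("pepA", "pepB", "pepC", "pepD"),
                               c("P1", "P2", "P3", "P4")))
incM["pepA", c("P1", "P2")] <- TRUE   # shared between P1 and P2
incM["pepB", c("P3", "P4")] <- TRUE   # shared between P3 and P4
incM["pepC", "P4"] <- TRUE            # specific to P4
incM["pepD", "P1"] <- TRUE            # specific to P1

# Assumption: only P1 has an expressed transcript
expressed_prot <- c(P1 = TRUE, P2 = FALSE, P3 = FALSE, P4 = FALSE)
expressed_prot <- expressed_prot[colnames(incM)]

# A peptide is shared if it maps on more than one protein
shared_pep <- rowSums(incM) > 1

# A protein is identified only by shared peptides if no specific peptide maps on it
only_shared <- colSums(incM[!shared_pep, , drop = FALSE]) == 0

# Peptides that map on at least one protein with an expressed transcript
pep_covered <- rowSums(incM[, expressed_prot, drop = FALSE]) > 0

# Proteins whose peptides are all covered by an expressed protein
all_covered <- colSums(incM[!pep_covered, , drop = FALSE]) == 0

candidates <- list(
  all            = names(which(!expressed_prot)),
  sharedOnly     = names(which(!expressed_prot & only_shared)),
  sharedNoRemove = names(which(!expressed_prot & only_shared & all_covered))
)
candidates
```

In this toy case "all" targets P2, P3 and P4; "sharedOnly" spares P4 because it has a specific peptide; "sharedNoRemove" targets only P2, since removing P3 would leave pepB without any expressed protein.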

Value

a matrix representing a filtered incidence matrix of peptide-to-protein mapping obtained by transcriptome-informed filtering.

Author(s)

Laura Fancello

Examples

# Read the tab-delimited file containing the proteome incidence matrix
incM_filename <- system.file("extdata"
                             , "incM_example"
                             , package = "net4pg"
                             , mustWork = TRUE)
rownames_filename <- system.file("extdata"
                                  , "peptideIDs_incM_example"
                                  , package = "net4pg"
                                  , mustWork = TRUE)
colnames_filename <- system.file("extdata"
                                 , "proteinIDs_incM_example"
                                 , package = "net4pg"
                                 , mustWork = TRUE)
incM <- read_inc_matrix(incM_filename = incM_filename
                 , colnames_filename = colnames_filename
                 , rownames_filename = rownames_filename)
# Perform transcriptome-informed post-hoc filtering
exprTranscriptsFile <- system.file("extdata"
                                   , "expressed_transcripts.txt"
                                   , package = "net4pg"
                                   , mustWork = TRUE)
protein2transcriptFile <- system.file("extdata"
                                        , "protein_to_transcript"
                                        , package = "net4pg"
                                        , mustWork = TRUE)
incM_filtered <- transcriptome_filter(incM
                         , exprTranscriptsFile = exprTranscriptsFile
                         , proteinToTranscriptFile = protein2transcriptFile
                         , tagContam = "Contam"
                         , remove = "all")
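To see what a filtering step removed, the input and output matrices can be compared by their dimnames. The helper below, summarise_filtering, is hypothetical and not part of net4pg. It assumes the filtered matrix keeps its row and column names, which the Value section does not state explicitly. It is shown here on toy matrices so that it runs without the package.

```r
# Hypothetical helper: report proteins (columns) and peptides (rows) lost by filtering
summarise_filtering <- function(before, after) {
  list(
    removed_proteins = setdiff(colnames(before), colnames(after)),
    removed_peptides = setdiff(rownames(before), rownames(after)),
    dim_before = dim(before),
    dim_after  = dim(after)
  )
}

# Toy matrices standing in for incM and incM_filtered
before <- matrix(c(TRUE, TRUE, FALSE,
                   FALSE, FALSE, TRUE),
                 nrow = 3,
                 dimnames = list(c("pep1", "pep2", "pep3"), c("P1", "P2")))
after <- before[c("pep1", "pep2"), "P1", drop = FALSE]

res <- summarise_filtering(before, after)
res$removed_proteins
res$removed_peptides
```

With the package installed, the same helper could be called as summarise_filtering(incM, incM_filtered) after the example above.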


[Package net4pg version 0.1.1 Index]