R: Fetch protein data from UniProt

fetch_uniprot {protti}

R Documentation

Fetch protein data from UniProt

Description

Fetches protein metadata from UniProt.

Usage

fetch_uniprot(
  uniprot_ids,
  columns = c("protein_name", "length", "sequence", "gene_names", "xref_geneid",
    "xref_string", "go_f", "go_p", "go_c", "cc_interaction", "ft_act_site", "ft_binding",
    "cc_cofactor", "cc_catalytic_activity", "xref_pdb"),
  batchsize = 200,
  max_tries = 10,
  timeout = 20,
  show_progress = TRUE
)

Arguments

`uniprot_ids`	a character vector of UniProt accession numbers.
`columns`	a character vector of metadata columns that should be imported from UniProt (all possible columns are listed in the UniProt documentation on return fields). For cross-referenced databases, provide the database name with the prefix "xref_", e.g. `"xref_pdb"`.
`batchsize`	a numeric value that specifies the number of proteins processed in a single query. Default and maximum value is 200.
`max_tries`	a numeric value that specifies the number of times the function tries to download the data in case an error occurs. Default is 10.
`timeout`	a numeric value that specifies the maximum request time per try. Default is 20 seconds.
`show_progress`	a logical value that determines if a progress bar will be shown. Default is TRUE.
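
The interplay of `batchsize` and `max_tries` can be illustrated with a minimal base R sketch. It does not reproduce protti's internals: `split_into_batches()` and `with_retries()` are hypothetical helpers written for this illustration, the IDs are placeholders, and the failing download is simulated.

```r
# Hypothetical helper (not part of protti): split a vector of IDs into
# consecutive batches of at most `batchsize` elements.
split_into_batches <- function(ids, batchsize = 200) {
  stopifnot(batchsize >= 1)
  unname(split(ids, ceiling(seq_along(ids) / batchsize)))
}

# Hypothetical helper (not part of protti): call `fetch_fun` until it
# succeeds or `max_tries` attempts have failed.
with_retries <- function(fetch_fun, max_tries = 10) {
  stopifnot(max_tries >= 1)
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch_fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
  }
  stop("all ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Placeholder IDs, not real UniProt accessions.
ids <- sprintf("ID%04d", 1:450)
batches <- split_into_batches(ids, batchsize = 200)
print(lengths(batches))

# Simulated download that fails twice before succeeding.
attempts <- 0
flaky_download <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("simulated network error")
  "ok"
}
outcome <- with_retries(flaky_download, max_tries = 10)
print(outcome)
```

With 450 IDs and a batch size of 200, the sketch produces three batches, the last one smaller than the others. The retry loop gives up only after `max_tries` consecutive failures.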

Value

A data frame that contains all protein metadata specified in `columns` for the proteins provided. The `input_id` column contains the provided UniProt IDs. If a provided ID is invalid but contains a valid UniProt ID, the valid portion is still fetched and reported in the `accession` column, while the `input_id` column keeps the original, partially valid ID.
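
The relationship between `input_id` and `accession` can be sketched in base R. This is an assumption about how a valid accession might be recovered from a partially valid input, not a description of protti's actual implementation. `extract_accession()` is a hypothetical helper, and the pattern is the accession format published in the UniProt help pages.

```r
# Accession format as published in the UniProt help pages.
accession_pattern <-
  "[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"

# Hypothetical helper (not part of protti): pair each input with the first
# substring that looks like a UniProt accession, or NA if there is none.
extract_accession <- function(input_id) {
  hit <- regexpr(accession_pattern, input_id)
  found <- !is.na(hit) & hit > 0
  accession <- rep(NA_character_, length(input_id))
  # regmatches() returns one match per element that matched, in order.
  accession[found] <- regmatches(input_id, hit)
  data.frame(input_id = input_id, accession = accession,
             stringsAsFactors = FALSE)
}

result <- extract_accession(c("P02545", "P02545;P20700", "not_an_id"))
print(result)

# Rows where the input was only partially valid.
print(result[!is.na(result$accession) & result$input_id != result$accession, ])
```

In this sketch only the first accession-like substring is kept, so `"P02545;P20700"` is paired with `P02545`. Comparing `input_id` with `accession` in the same way on the returned data frame is one way to spot inputs that were not fully valid.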

Examples


fetch_uniprot(c("P36578", "O43324", "Q00796"))

# Partially valid ID: the second element contains two accessions joined by a semicolon
fetch_uniprot(c("P02545", "P02545;P20700"))

[Package protti version 0.9.0 Index]