fetch_uniprot {protti} | R Documentation |
Fetch protein data from UniProt
Description
Fetches protein metadata from UniProt.
Usage
fetch_uniprot(
uniprot_ids,
columns = c("protein_name", "length", "sequence", "gene_names", "xref_geneid",
"xref_string", "go_f", "go_p", "go_c", "cc_interaction", "ft_act_site", "ft_binding",
"cc_cofactor", "cc_catalytic_activity", "xref_pdb"),
batchsize = 200,
max_tries = 10,
timeout = 20,
show_progress = TRUE
)
Arguments
uniprot_ids |
a character vector of UniProt accession numbers. |
columns |
a character vector of metadata columns that should be imported from UniProt (all
possible columns are listed in the UniProt documentation on return fields). For a
cross-referenced database, provide the database name with the prefix "xref_", as in the default "xref_pdb". |
batchsize |
a numeric value that specifies the number of proteins processed in a single query. Default and maximum value is 200. |
max_tries |
a numeric value that specifies the number of times the function tries to download the data in case an error occurs. Default is 10. |
timeout |
a numeric value that specifies the maximum request time per try. Default is 20 seconds. |
show_progress |
a logical value that determines if a progress bar will be shown. Default is TRUE. |
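To illustrate what batchsize means in practice, the following minimal sketch splits a vector of IDs into chunks of at most 200 elements, so that each chunk would correspond to one query. It is an illustration only and not necessarily how fetch_uniprot batches requests internally; the IDs are made-up placeholders, not real accessions.

```r
# Placeholder IDs for illustration only (not real UniProt accessions).
ids <- sprintf("ID%04d", 1:450)
batchsize <- 200

# Assign each ID to a batch holding at most `batchsize` elements.
batches <- split(ids, ceiling(seq_along(ids) / batchsize))

# Number of IDs per batch.
lengths(batches)
```

With 450 IDs and a batch size of 200, this yields three batches, the last one smaller than the others.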
Value
A data frame that contains all protein metadata specified in the columns argument for the
proteins provided. The input_id column contains the provided UniProt IDs. If a provided ID is
invalid but contains a valid UniProt ID, the valid portion is still fetched and reported in the
accession column, while the input_id column keeps the original, partially valid ID.
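Because partially valid IDs differ between the input_id and accession columns, they can be found by comparing the two. The sketch below assumes a result with both columns present; flag_partially_valid is a hypothetical helper that is not part of protti, and the data frame is a hand-written mock rather than real fetch_uniprot output.

```r
# Hypothetical helper (not part of protti): keep the rows in which the
# supplied ID differs from the accession that was actually fetched.
flag_partially_valid <- function(result) {
  stopifnot(all(c("input_id", "accession") %in% names(result)))
  # which() drops rows where the comparison is NA
  result[which(result$input_id != result$accession), , drop = FALSE]
}

# Mock result with made-up values, standing in for fetch_uniprot() output.
mock_result <- data.frame(
  input_id = c("P02545", "P02545;P20700"),
  accession = c("P02545", "P02545"),
  stringsAsFactors = FALSE
)

flagged <- flag_partially_valid(mock_result)
flagged$input_id
```

In real use, the data frame returned by fetch_uniprot would take the place of mock_result.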
Examples
fetch_uniprot(c("P36578", "O43324", "Q00796"))
# Partially valid ID (the second element contains a valid accession)
fetch_uniprot(c("P02545", "P02545;P20700"))