calculate_features {ampir} | R Documentation |
Calculate a set of numerical features from protein sequences
Description
This function calculates a set of physicochemical and compositional features from protein sequences in preparation for supervised model learning.
Usage
calculate_features(df, min_len = 10)
Arguments
df |
A data frame containing protein sequence names in the first column and amino acid sequences in the second column |
min_len |
Minimum sequence length for which features can be calculated. It is an error to provide sequences shorter than this |
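Because sequences shorter than min_len cause an error, it can help to drop them before calling the function. The following is a minimal sketch, not part of the package examples: the data frame, its column names (seq_name, seq_aa) and the two peptide sequences are hypothetical, and the filter assumes that a sequence of exactly min_len residues is accepted, as the argument description suggests.

```r
# Hypothetical input: names in the first column, sequences in the second
my_df <- data.frame(
  seq_name = c("pep_short", "pep_long"),
  seq_aa   = c("MKV", "MKVLAAGIVGLLLAQPSFAADNTHKRWECY"),
  stringsAsFactors = FALSE
)

min_len <- 10

# Keep only rows whose sequence has at least min_len residues
# (assumption: a length equal to min_len is allowed)
long_enough <- nchar(my_df[[2]]) >= min_len
filtered_df <- my_df[long_enough, , drop = FALSE]

# Calculate features only if ampir is installed
if (requireNamespace("ampir", quietly = TRUE)) {
  features <- ampir::calculate_features(filtered_df, min_len = min_len)
}
```

Filtering on the second column by position rather than by name mirrors the layout that df is expected to have, so the same lines should work regardless of how the columns are named.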
Value
A data frame containing the numerical feature values calculated for each input protein
Note
This function depends on the Peptides package
References
Osorio, D., Rondon-Villarreal, P. & Torres, R. Peptides: A package for data mining of antimicrobial peptides. The R Journal. 7(1), 4–14 (2015).
Examples
my_protein_df <- read_faa(system.file("extdata/bat_protein.fasta", package = "ampir"))
calculate_features(my_protein_df)
## Output (showing the first six output columns)
# seq_name Amphiphilicity Hydrophobicity pI Mw Charge ....
# [1] G1P6H5_MYOLU 0.4145847 0.4373494 8.501312 9013.757 4.53015 ....
[Package ampir version 1.1.0 Index]