PredCRG_Enc {PredCRG} | R Documentation |
Encoding of protein sequence data into numeric feature vectors based on PredCRG features.
Description
Before protein sequences can be used for prediction with the PredCRG model, they must be transformed into numeric feature vectors. The function PredCRG_Enc
transforms each protein sequence into a numeric vector of 62 values, based on the compositional, physico-chemical and transitional features used in the PredCRG
model.
Usage
PredCRG_Enc(prot_seq)
Arguments
prot_seq |
Sequence dataset to be supplied as input; must be an object of class AAStringSet. |
Details
The dataset must contain protein sequences composed of standard amino acid residues only. An object of class AAStringSet can be obtained by reading a FASTA file with readAAStringSet, available in the Bioconductor package Biostrings.
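The requirement above can be checked before encoding. The following is a minimal sketch, not part of PredCRG: the helper standard_only and the toy sequences are hypothetical, and the Biostrings part runs only if that package is installed. It flags sequences containing anything other than the 20 standard one-letter amino acid codes, then applies the same filter to an AAStringSet read from a FASTA file.

```r
# Hypothetical helper (not part of PredCRG): TRUE for sequences made up
# solely of the 20 standard amino acid one-letter codes.
standard_only <- function(seqs) {
  grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", as.character(seqs))
}

# Toy sequences for illustration; seq2 contains the ambiguity code X.
toy <- c(seq1 = "MKTAYIAKQR", seq2 = "MKXAYIAKQR", seq3 = "GAVLIPFWMC")
keep <- standard_only(toy)
print(keep)

# With Biostrings installed, the same filter applies to an AAStringSet.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  fasta <- tempfile(fileext = ".fasta")
  writeLines(paste0(">", names(toy), "\n", toy), fasta)  # write a small FASTA file
  prot_seq <- Biostrings::readAAStringSet(fasta)         # read it as an AAStringSet
  prot_seq <- prot_seq[standard_only(prot_seq)]          # drop non-standard sequences
  print(names(prot_seq))
}
```

The filtered prot_seq object could then be passed to PredCRG_Enc. Whether to drop or repair sequences with non-standard residues is a choice left to the user.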
Value
A numeric matrix of dimension n x 62, where n is the number of input sequences.
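A quick sanity check on the returned object might look like the sketch below. The helper check_enc_shape is hypothetical (it is not provided by PredCRG) and is demonstrated on a dummy matrix so that it runs without the package.

```r
# Hypothetical helper (not part of PredCRG): checks that an encoded object
# is a numeric matrix with one row per sequence and 62 feature columns.
check_enc_shape <- function(enc, n_seq, n_feat = 62) {
  is.matrix(enc) && is.numeric(enc) &&
    nrow(enc) == n_seq && ncol(enc) == n_feat
}

# Dummy stand-in for the output of PredCRG_Enc on three sequences.
dummy <- matrix(0, nrow = 3, ncol = 62)
ok <- check_enc_shape(dummy, 3)
print(ok)

# With the package loaded, the intended use would be:
# check_enc_shape(PredCRG_Enc(test), length(test))
```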
Author(s)
Prabina Kumar Meher, ICAR-Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA
See Also
PredCRG, PredCRG_training, model1, model2, model3, model4
Examples
data(test)
enc <- PredCRG_Enc(test)  # encode the test sequence data
enc[1:5,1:5]