PredCRG_Enc {PredCRG} R Documentation

Encoding of protein sequence data into numeric feature vectors based on PredCRG features.

Description

Before protein sequences can be used for prediction with the proposed model, they must be transformed into numeric feature vectors. The function PredCRG_Enc transforms each protein sequence into a numeric vector of 62 values, based on the compositional, physico-chemical and transitional features used in the PredCRG model.

Usage

PredCRG_Enc(prot_seq)

Arguments

prot_seq

Sequence dataset to be supplied as input; must be an object of class AAStringSet.

Details

The dataset must contain protein sequences made up of standard amino acid residues only. An object of class AAStringSet can be obtained by reading a FASTA file with readAAStringSet, available in the Bioconductor package Biostrings.
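
The requirement above can be checked before encoding. The following is a minimal sketch, not part of PredCRG: the helper is_standard_protein and the file name proteins.fasta are hypothetical, and the check assumes "standard" means the 20 upper-case one-letter amino acid codes.

```r
# Hypothetical helper (not part of PredCRG): returns TRUE for sequences made up
# only of the 20 standard amino acid one-letter codes (upper case).
is_standard_protein <- function(x) {
  grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", as.character(x))
}

# Works on a plain character vector ...
is_standard_protein(c(ok = "MKVLAAGIW", bad = "MKXLAB"))

# ... and on an AAStringSet read from a FASTA file, if Biostrings is installed.
fasta_path <- "proteins.fasta"  # hypothetical file name
if (requireNamespace("Biostrings", quietly = TRUE) && file.exists(fasta_path)) {
  seqs <- Biostrings::readAAStringSet(fasta_path)
  seqs_std <- seqs[is_standard_protein(seqs)]  # keep standard-residue sequences only
}
```

The filtered object seqs_std is still an AAStringSet, so it could then be passed to PredCRG_Enc as prot_seq.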

Value

A numeric matrix of dimension n x 62, where n is the number of sequences.

Author(s)

Prabina Kumar Meher, ICAR-Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA

See Also

PredCRG, PredCRG_training, model1, model2, model3, model4

Examples

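data(test)                 # load the example sequence data shipped with the package
enc <- PredCRG_Enc(test)   # encoding of test sequence data
enc[1:5, 1:5]              # first five rows and columns of the feature matrix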
  

[Package PredCRG version 1.0.2 Index]