Trint.Dist.Feature {EncDNA} | R Documentation |
Tri-nucleotide distribution-based encoding of nucleotide sequences.
Description
This encoding scheme was first adopted by Wei et al. (2013) for the prediction of splice sites, together with MM1 features. In this encoding technique, the distribution of trinucleotides is taken into consideration independently for the exon and intron regions of splice site motifs.
Usage
Trint.Dist.Feature(test_seq)
Arguments
test_seq |
Sequence dataset to be transformed into numeric feature vectors. It should contain at least two sequences and must be an object of class |
Details
This encoding scheme is independent of positive and negative datasets. In other words, each sequence can be encoded independently. Further, a nucleotide sequence of any length is transformed into a numeric vector of 64 values, one for each of the 64 possible trinucleotides.
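The idea of mapping a sequence of any length to 64 trinucleotide values can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name trint_freq is hypothetical, and the use of overlapping windows and relative frequencies is an assumption made for illustration. It also does not model the separate treatment of exon and intron regions, and it assumes sequences contain only A, C, G and T.

```r
# Hypothetical helper (not part of EncDNA): relative frequencies of the
# 64 trinucleotides in a single sequence, counted over overlapping windows.
trint_freq <- function(s) {
  bases <- c("A", "C", "G", "T")
  # Enumerate all 64 trinucleotides in a fixed order: AAA, AAC, ..., TTT
  grid <- expand.grid(b3 = bases, b2 = bases, b1 = bases,
                      stringsAsFactors = FALSE)
  trints <- paste0(grid$b1, grid$b2, grid$b3)

  s <- toupper(s)
  n <- nchar(s)
  if (n < 3) stop("sequence must contain at least three nucleotides")

  # Slide a window of width 3 along the sequence, one position at a time
  starts <- seq_len(n - 2)
  windows <- substring(s, starts, starts + 2)

  # Count each trinucleotide, keeping zero counts for absent ones
  counts <- table(factor(windows, levels = trints))
  freq <- as.numeric(counts) / length(windows)
  names(freq) <- trints
  freq
}

# Each sequence is encoded on its own; stacking the vectors gives one row
# per sequence and one column per trinucleotide.
seqs <- c("ATGGTAAGTCC", "CCAGGTGAGTA", "ACGTACGTACG")
enc_sketch <- t(vapply(seqs, trint_freq, numeric(64)))
dim(enc_sketch)
```

Because every sequence is processed without reference to any other, the sketch mirrors the property stated above that the encoding does not depend on positive or negative training sets.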
Value
A numeric matrix of order m*64, where m is the number of sequences in test_seq.
Author(s)
Prabina Kumar Meher, Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA
References
Wei, D., Zhang, H., Wei, Y. and Jiang, Q. (2013). A novel splice site prediction method using support vector machine. J Comput Inform Syst., 9(20): 8053-8060.
Examples
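library(EncDNA)
# Load the example dataset and take its test sequences
data(droso)
test <- droso$test
tst <- test
# Encode each sequence as a vector of 64 trinucleotide features
enc <- Trint.Dist.Feature(test_seq=tst)
enc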