PSTNPss_RNA {ftrCOOL} | R Documentation |
Position-Specific Tri-ribonucleotide Propensity based on single-strand RNA (PSTNPss_RNA)
Description
The inputs to this function are positive and negative data sets and a set of sequences. The output of the function is a matrix of feature vectors. The number of rows of the output matrix is equal to the number of sequences. The feature vector for an input sequence with length L is [u(1),u(2),...u(L-2)]. For each input sequence, u(1) is calculated by subtracting the frequency of sequences (which start with the same tri-ribonucleotides as the input sequence) in the positive set with those starting with the same tri-ribonucleotide in the negative set. We compute u(i) like u(1) with the exception that instead of the first tri-ribonucleotide, the ith tri-ribonucletide is considered.
Usage
PSTNPss_RNA(seqs, pos, neg, label = c())
Arguments
seqs |
is a FASTA file containing ribonucleotide sequences. The sequences start with '>'. Also, seqs could be a string vector. Each element of the vector is a ribonucleotide sequence. |
pos |
is a fasta file containing ribonucleotide sequences. Each sequence starts with '>'. Also, the value of this parameter can be a string vector. The sequences are positive sequences in the training model |
neg |
is a fasta file containing ribonucleotide sequences. Each sequence starts with '>'. Also, the value of this parameter can be a string vector. |
label |
is an optional parameter. It is a vector whose length is equal to the number of sequences. It shows the class of each entry (i.e., sequence). |
Value
It returns a feature matrix. The number of columns is equal to the length of sequences minus two and the number of rows is equal to the number of sequences.
Note
The length of the sequences in positive and negative data sets and the input sets should be equal.
Examples
ptmSeqsADR<-system.file("extdata/",package="ftrCOOL")
posSeqs<-fa.read(file=paste0(ptmSeqsADR,"/pos2RNA.txt"),alphabet="rna")
negSeqs<-fa.read(file=paste0(ptmSeqsADR,"/neg2RNA.txt"),alphabet="rna")
seqs<-fa.read(file=paste0(ptmSeqsADR,"/testSeq2RNA.txt"),alphabet="rna")
PSTNPss_RNA(seqs=seqs,pos=posSeqs,neg=negSeqs)