lnc_finder {LncFinder}    R Documentation
Long Non-coding RNA Identification
Description
This function predicts whether sequences are non-coding transcripts or protein-coding transcripts.
Usage
lnc_finder(
Sequences,
SS.features = FALSE,
format = "DNA",
frequencies.file = "human",
svm.model = "human",
parallel.cores = 2
)
Arguments
Sequences |
Unevaluated sequences. Can be a FASTA file loaded by
|
SS.features |
Logical. If |
format |
String. Define the format of the |
frequencies.file |
String or a list obtained from function
|
svm.model |
String or a svm model obtained from function |
parallel.cores |
Integer. The number of cores for parallel computation.
By default the number of cores is |
Details
Because obtaining secondary structure sequences is time-consuming, users can
input nucleotide sequences and predict them without secondary structure
features (set SS.features to FALSE).
Please note that SS.features can improve performance when the unevaluated
sequences come from the same species as the sequences used to build the model.
However, if users predict sequences with a model trained on another species,
SS.features may lead to low accuracy.
For details of frequencies.file, please refer to function make_frequencies.
For details of the features, please refer to function extract_features.
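The species guidance above can be sketched as a small helper that turns on SS.features only when the model species matches the species of the input. The names pick_ss_features and make_lnc_args are hypothetical and not part of LncFinder. Pairing SS.features = TRUE with format = "SS" simply mirrors the Examples and is an assumption here.

```r
# Hypothetical helper (not part of LncFinder): use secondary structure
# features only when the model species matches the species of the input.
pick_ss_features <- function(model.species, query.species) {
  identical(tolower(model.species), tolower(query.species))
}

# Build the non-sequence arguments for lnc_finder() from that decision.
# Assumption: format "SS" accompanies SS.features = TRUE, as in the Examples.
make_lnc_args <- function(model.species, query.species, parallel.cores = 2) {
  use.ss <- pick_ss_features(model.species, query.species)
  list(
    SS.features      = use.ss,
    format           = if (use.ss) "SS" else "DNA",
    frequencies.file = model.species,
    svm.model        = model.species,
    parallel.cores   = parallel.cores
  )
}

args.same  <- make_lnc_args("mouse", "mouse")  # same species: SS features on
args.cross <- make_lnc_args("human", "mouse")  # cross-species: SS features off
```

With sequences in the matching format stored in a variable such as Seqs, the list could be passed on with do.call(lnc_finder, c(list(Sequences = Seqs), args.same)).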
Value
Returns a data.frame containing the prediction results (Pred), the coding
potential (Coding.Potential), and the features. For details of the features,
please refer to function extract_features.
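As a rough illustration of working with the returned data.frame, the sketch below tallies predictions and ranks sequences by coding potential. The data are a made-up mock, not real output. The class labels "NonCoding" and "Coding", the 0 to 1 range of Coding.Potential, and the sequence names are assumptions for illustration only. The feature columns are omitted.

```r
# Mock result with made-up values; a real one would come from lnc_finder().
# Assumed for illustration: labels "NonCoding"/"Coding", potentials in [0, 1].
result <- data.frame(
  Pred             = c("NonCoding", "Coding", "NonCoding"),
  Coding.Potential = c(0.08, 0.93, 0.41),
  row.names        = c("seq_1", "seq_2", "seq_3")
)

# Count how many sequences fall into each predicted class.
pred.counts <- table(result$Pred)

# Rank sequences from highest to lowest coding potential.
ranked <- result[order(result$Coding.Potential, decreasing = TRUE), ]

print(pred.counts)
print(ranked)
```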
References
Siyu Han, Yanchun Liang, Qin Ma, Yangyi Xu, Yu Zhang, Wei Du, Cankun Wang & Ying Li. LncFinder: an integrated platform for long non-coding RNA identification utilizing sequence intrinsic composition, structural information, and physicochemical property. Briefings in Bioinformatics, 2019, 20(6):2009-2027.
Author(s)
HAN Siyu
See Also
build_model, make_frequencies, extract_features, run_RNAfold, read_SS.
Examples
## Not run:
data(demo_DNA.seq)
Seqs <- demo_DNA.seq
### Input one sequence:
OneSeq <- Seqs[1]
result_1 <- lnc_finder(OneSeq, SS.features = FALSE, format = "DNA",
                       frequencies.file = "human", svm.model = "human",
                       parallel.cores = 2)
### Or several sequences:
data(demo_SS.seq)
Seqs <- demo_SS.seq
result_2 <- lnc_finder(Seqs, SS.features = TRUE, format = "SS",
                       frequencies.file = "mouse", svm.model = "mouse",
                       parallel.cores = 2)
### A complete workflow:
### Calculate secondary structure on Windows OS,
RNAfold.path <- '"E:/Program Files/ViennaRNA/RNAfold.exe"'
SS.seq <- run_RNAfold(Seqs, RNAfold.path = RNAfold.path, parallel.cores = 2)
### then predict the sequences with secondary structure features:
result_2 <- lnc_finder(SS.seq, SS.features = TRUE, format = "SS",
                       frequencies.file = "mouse", svm.model = "mouse",
                       parallel.cores = 2)
### Predict sequences with your own model by assigning a new svm.model and
### frequencies.file to parameters "svm.model" and "frequencies.file"
## End(Not run)