compare_mir_terms_log2 {miRetrieve} | R Documentation
Compare log2-frequency count of terms associated with a miRNA name
Description
Compare log2-frequency count of terms associated with a miRNA name over two topics.
Usage
compare_mir_terms_log2(
  df,
  mir,
  top = 20,
  token = "words",
  ...,
  topic = NULL,
  shared = TRUE,
  normalize = TRUE,
  stopwords = stopwords_miretrieve,
  stopwords_ngram = TRUE,
  col.mir = miRNA,
  col.abstract = Abstract,
  col.topic = Topic,
  col.pmid = PMID,
  title = NULL
)
Arguments
df |
Data frame containing miRNA names, abstracts, topics, and PubMed-IDs. |
mir |
String. miRNA name of interest. |
top |
Integer. Number of top terms to plot. |
token |
String. Specifies how abstracts shall be split up. Taken from
|
... |
Additional arguments for tokenization, if necessary. |
topic |
Character vector. Optional. Specifies which topics to plot.
Must have length two.
If |
shared |
Boolean. If |
normalize |
Boolean. If |
stopwords |
Data frame containing stop words. |
stopwords_ngram |
Boolean. Specifies if stop words shall be removed
from abstracts when using ngrams. Only applied when |
col.mir |
Symbol. Column containing miRNA names. |
col.abstract |
Symbol. Column containing abstracts. |
col.topic |
Symbol. Column containing topic names. |
col.pmid |
Symbol. Column containing PubMed-IDs. |
title |
String. Plot title. |
Details
Compare log2-frequency count of terms associated with a miRNA name over two topics by
plotting the log2-ratio of the term count associated with a miRNA name
over two topics.
miRNA names and topics must be in a data frame df, while terms are taken
from abstracts contained in df.
The number of top terms to plot is set by top. Terms can be evaluated
either by their raw count, i.e. in how many abstracts they are mentioned
in conjunction with the miRNA name, or by their relative count, i.e.
in how many abstracts containing the miRNA they are mentioned compared to all
abstracts containing the miRNA.
compare_mir_terms_log2() is based on the tools available in the
tidytext package.
The log2-plot is greatly inspired by the book
“tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” by
Silge and Robinson.
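The comparison described above can be illustrated with a minimal base R sketch. It is not the code used by compare_mir_terms_log2(); the toy data and the helper names term_counts() and log2_ratio() are hypothetical and exist only for this illustration. The sketch assumes that df holds abstracts of a single miRNA, that the relative count is taken per topic, and that only terms found in both topics are compared, because the ratio is undefined otherwise.

```r
# Toy data. Column names follow the documented defaults,
# but the rows are invented for illustration only.
df <- data.frame(
  PMID = 1:5,
  miRNA = "miR-21",
  Topic = c("CRC", "CRC", "CRC", "Pancreas", "Pancreas"),
  Abstract = c(
    "miR-21 promotes invasion and metastasis",
    "miR-21 promotes proliferation",
    "miR-21 regulates invasion",
    "miR-21 promotes fibrosis",
    "miR-21 regulates fibrosis and invasion"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one topic, count in how many abstracts
# each term occurs. With normalize = TRUE the count is divided by
# the number of abstracts in that topic (relative count).
term_counts <- function(data, topic, normalize = TRUE) {
  abstracts <- data$Abstract[data$Topic == topic]
  # Lower-case, split on whitespace, keep each term once per abstract
  tokens <- lapply(strsplit(tolower(abstracts), "\\s+"), unique)
  counts <- table(unlist(tokens))
  if (normalize) counts <- counts / length(abstracts)
  counts
}

# Hypothetical helper: log2-ratio of term counts between two topics,
# restricted to terms that occur in both topics.
log2_ratio <- function(data, topics, normalize = TRUE) {
  a <- term_counts(data, topics[1], normalize)
  b <- term_counts(data, topics[2], normalize)
  common <- intersect(names(a), names(b))
  ratio <- log2(as.numeric(a[common]) / as.numeric(b[common]))
  names(ratio) <- common
  sort(ratio, decreasing = TRUE)
}

ratios <- log2_ratio(df, c("CRC", "Pancreas"))
ratios_raw <- log2_ratio(df, c("CRC", "Pancreas"), normalize = FALSE)
```

Positive values mark terms that are relatively more frequent in the first topic, negative values terms that are more frequent in the second. A call such as barplot(ratios, horiz = TRUE, las = 1) would draw a simple bar plot of the result. Stop word removal and the tidytext tokenization options are left out of this sketch.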
Value
List containing a bar plot that compares the log2-frequency of terms associated with a miRNA over two topics, and the corresponding data frame.
References
Silge, Julia, and David Robinson. 2016. “tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” JOSS 1 (3). The Open Journal. https://doi.org/10.21105/joss.00037.
See Also
compare_mir_terms(), compare_mir_terms_scatter()
Other compare functions:
compare_mir_count_log2(),
compare_mir_count_unique(),
compare_mir_count(),
compare_mir_terms_scatter(),
compare_mir_terms_unique(),
compare_mir_terms()