cleanNLP-package {cleanNLP} | R Documentation
cleanNLP: A Tidy Data Model for Natural Language Processing
Description
Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Multiple NLP backends can be used, and their output is standardized into a common normalized format. Options include stringi (very fast, but only provides tokenization), udpipe (fast, many languages, includes part-of-speech tags and dependencies), and spacy (Python backend; includes named entity recognition).
Details
Once the package is set up, run one of cnlp_init_stringi, cnlp_init_spacy, or cnlp_init_udpipe to load the desired NLP backend. After this function is done running, use cnlp_annotate to run the annotation engine over a corpus of text. The package vignettes provide more detailed set-up information.
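The two-step workflow described above can be sketched with the udpipe backend. This is only an illustration under stated assumptions: that both cleanNLP and the udpipe package are installed, that cnlp_init_udpipe() called without arguments falls back to an English model it can download on first use, and that the result of cnlp_annotate is a named list containing a token table. The corpus and document ids are made up for the example.

```r
# Sketch only: assumes cleanNLP and udpipe are installed and that the
# default English udpipe model can be downloaded on first use.
if (requireNamespace("cleanNLP", quietly = TRUE) &&
    requireNamespace("udpipe", quietly = TRUE)) {
  library(cleanNLP)

  # Step 1: load the backend. cnlp_init_stringi() or cnlp_init_spacy()
  # could be used here instead.
  cnlp_init_udpipe()

  # Step 2: annotate a corpus given as a data frame with a `text` column.
  # The ids and sentences are illustrative.
  corpus <- data.frame(
    doc_id = c("doc1", "doc2"),
    text = c("This is a sentence.", "Here is something else to parse!"),
    stringsAsFactors = FALSE
  )
  anno <- cnlp_annotate(corpus)

  # Inspect the returned tables (assumed to include a token table)
  print(names(anno))
  print(head(anno$token))
}
```

Switching backends should only require changing the initialization call; the call to cnlp_annotate stays the same.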
Examples
## Not run:
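library(cleanNLP)

# load the annotation engine
cnlp_init_stringi()

# build a small corpus: one document per row, text in the `text` column
input <- data.frame(
  text = c(
    "This is a sentence.",
    "Here is something else to parse!"
  ),
  stringsAsFactors = FALSE
)

# annotate your text
anno <- cnlp_annotate(input)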
## End(Not run)
[Package cleanNLP version 3.1.0 Index]