tm.plugin.factiva-package {tm.plugin.factiva} | R Documentation |
A plug-in for the tm text mining framework to import articles from Factiva
Description
This package provides a tm Source to create corpora from articles exported from Dow Jones's Factiva content provider as XML or HTML files.
Details
Typical usage is to create a corpus from an XML or HTML file exported from Factiva (here called myFactivaArticles.xml). Setting language=NA allows the language to be set automatically from the information provided by Factiva:
# Import corpus
source <- FactivaSource("myFactivaArticles.xml")
corpus <- Corpus(source, list(language=NA))

# See how many articles were imported
corpus

# See the contents of the first article and its meta-data
inspect(corpus[1])
meta(corpus[[1]])
Currently, only HTML files saved in French are supported. Please send the maintainer examples of Factiva HTML files in your language if you want that language to be supported.
See FactivaSource for more details and real examples.
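As a follow-up to the import shown above, the sketch below wraps it in a small helper that fails early when the export file is missing, then tabulates the language recorded for each article. The helper name import_factiva is hypothetical and not part of tm.plugin.factiva, and the file name is the same placeholder used above. The sketch assumes that each imported document carries a "language" meta-data tag, as tm documents normally do.

```r
# Hypothetical helper (not part of tm.plugin.factiva): import a Factiva
# export, stopping with a clear message if the file does not exist.
import_factiva <- function(path) {
  if (!file.exists(path)) {
    stop("Factiva export not found: ", path)
  }
  source <- tm.plugin.factiva::FactivaSource(path)
  # language = NA lets the language be taken from Factiva's own information
  tm::Corpus(source, readerControl = list(language = NA))
}

path <- "myFactivaArticles.xml"  # placeholder file name
if (file.exists(path)) {
  library(tm)
  corpus <- import_factiva(path)
  # Count the imported articles per detected language
  print(table(sapply(corpus, function(doc) meta(doc, "language"))))
}
```

Checking for the file first gives a more readable error than the one raised by the underlying parser, and the guarded usage block does nothing when no export is present.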
Author(s)
Milan Bouchet-Valat <nalimilan@club.fr>
[Package tm.plugin.factiva version 1.8 Index]