| tm.plugin.europresse-package {tm.plugin.europresse} | R Documentation |
A plug-in for the tm text mining framework to import articles from Europresse
Description
This package provides a tm Source to create corpora from articles exported from the Europresse content provider as HTML files.
Details
Typical usage is to create a corpus from an HTML file
exported from Europresse (here called myEuropresseArticles.html).
It is often necessary to specify the encoding of the texts
via the encoding argument of EuropresseSource.
# Load tm and the plug-in
library(tm)
library(tm.plugin.europresse)
# Import corpus
source <- EuropresseSource("myEuropresseArticles.html")
corpus <- Corpus(source)
# See how many articles were imported
corpus
# See the contents of the first article and its meta-data
inspect(corpus[1])
meta(corpus[[1]])
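When the imported texts show garbled accented characters, the encoding can be passed explicitly. The following is a minimal sketch, not a tested recipe: the file name is the placeholder used above, and the "ISO-8859-1" value is only an assumption about how a given export might be encoded; use whatever encoding your file actually has.

```r
library(tm)
library(tm.plugin.europresse)

# Placeholder file name (same as above) and an ASSUMED encoding;
# replace both with the values that match your own export.
path <- "myEuropresseArticles.html"
enc <- "ISO-8859-1"

# Only attempt the import if the file is actually present
if (file.exists(path)) {
  # Pass the encoding explicitly via the encoding argument
  source <- EuropresseSource(path, encoding = enc)
  corpus <- Corpus(source)

  # Check that accented characters in the first article look correct
  inspect(corpus[1])
}
```

If the text still looks wrong, trying another encoding and inspecting the first article again is usually the quickest check.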
See EuropresseSource for more details and real examples.
Author(s)
Milan Bouchet-Valat <nalimilan@club.fr>
[Package tm.plugin.europresse version 1.4 Index]