Revisions {tm.plugin.dc} | R Documentation
Revisions of a Distributed Corpus
Description
Each modification of the documents in the corpus results in a new
stage, i.e., a new revision of the corpus. To allow fast switching
between revisions, all modifications may be kept on the file system.
The function setRevision() switches the corpus back to any stage in
its history. The function keepRevisions() reports whether revisions
are turned on or off; the corresponding replacement function sets
the desired behavior.
Usage
getRevisions( corpus )
removeRevision( corpus, revision )
setRevision( corpus, revision )
keepRevisions( corpus )
`keepRevisions<-`( corpus, value )
Arguments
corpus | A distributed corpus (for example, one created with as.DCorpus()).
revision | The revision to set as active or to remove.
value | A logical indicating whether revisions should be kept.
Value
getRevisions() returns a list of character strings naming all
available revisions. setRevision() returns the distributed corpus
with the given revision marked as active. keepRevisions() returns a
logical indicating whether revisions are kept.
Examples
## provide data on storage
data("crude")
dc <- as.DCorpus(crude)
## do some preprocessing
dc <- tm_map(dc, content_transformer(tolower))
## retrieve available revisions
revs <- getRevisions(dc)
revs
## go back to original revision
setRevision(dc, revs[2])
## check whether revisions are currently kept
keepRevisions(dc)
## turn off keeping revisions
keepRevisions(dc) <- FALSE
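
The following variant of the example is a minimal sketch that turns revisions on explicitly before modifying the corpus and stores the result of setRevision(), since the Value section states that it returns the corpus with the chosen revision marked as active. It uses only the functions listed under Usage. It assumes that the packages tm and tm.plugin.dc are installed, and that the second entry of the revision list is the original stage, as in the example above.

```r
## Assumption: both packages are installed; 'crude' ships with tm.
library("tm")
library("tm.plugin.dc")

data("crude")
dc <- as.DCorpus(crude)

## explicitly request that revisions are kept before modifying the corpus
keepRevisions(dc) <- TRUE

## each modification creates a new revision
dc <- tm_map(dc, content_transformer(tolower))

## list the available revisions
revs <- getRevisions(dc)

## store the returned corpus so that 'dc' refers to the corpus
## with the selected revision marked as active
dc <- setRevision(dc, revs[2])

## confirm that revisions are still being kept
keepRevisions(dc)
```

Assigning the result back to dc follows the usual R convention for functions that return a modified object; whether this is strictly required here depends on how the package stores the corpus.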