threads {tm.plugin.mail} | R Documentation
E-Mail Threads
Description
Extract threads (i.e., chains of messages on a single subject) from e-mail documents.
Usage
threads(x)
Arguments
x | A corpus consisting of e-mails.
Details
This function uses a one-pass algorithm for extracting the thread information by inspecting the “References” header. Some mails (e.g., reply mails appearing before their corresponding base mails) might not be tagged correctly.
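The single pass described above can be sketched on plain vectors. This is a minimal illustration, not the package's implementation: the helper name sketch_threads, its arguments ids and refs, the choice of the last "References" entry as the direct parent, and the use of NA for mails whose parent has not been seen yet are all assumptions made for the example.

```r
# Hypothetical helper, not part of tm.plugin.mail.
# ids:  character vector of message IDs, in corpus order
# refs: list of character vectors; refs[[i]] holds the IDs listed in the
#       "References" header of mail i (length 0 if the header is absent)
sketch_threads <- function(ids, refs) {
  seen <- new.env(hash = TRUE)  # message ID -> c(thread ID, depth)
  thread_id <- rep(NA_integer_, length(ids))
  thread_depth <- rep(NA_integer_, length(ids))
  next_thread <- 1L
  for (i in seq_along(ids)) {
    # Assumption: the last referenced ID is the direct parent
    parent <- tail(refs[[i]], 1L)
    if (length(parent) == 0L) {
      # No references: this mail starts a new thread at depth 1
      thread_id[i] <- next_thread
      thread_depth[i] <- 1L
      next_thread <- next_thread + 1L
    } else if (exists(parent, envir = seen, inherits = FALSE)) {
      # Parent already seen: same thread, one level deeper
      p <- get(parent, envir = seen, inherits = FALSE)
      thread_id[i] <- p[1L]
      thread_depth[i] <- p[2L] + 1L
    }
    # Otherwise the parent has not been seen yet; the mail stays NA,
    # which is the one-pass limitation in this sketch.
    if (!is.na(thread_id[i])) {
      assign(ids[i], c(thread_id[i], thread_depth[i]), envir = seen)
    }
  }
  list(ThreadID = thread_id, ThreadDepth = thread_depth)
}

# Made-up message IDs: "<d>" replies to "<e>", which appears after it
ids <- c("<a>", "<b>", "<c>", "<d>", "<e>")
refs <- list(character(0), "<a>", c("<a>", "<b>"), "<e>", character(0))
res <- sketch_threads(ids, refs)
```

In this sketch the reply "<d>" cannot be assigned to a thread because its base mail "<e>" comes later in the input, which illustrates why ordering matters for a one-pass approach.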
Value
A list with the two named components ThreadID and ThreadDepth, listing a thread and the level of replies for each mail in the corpus x.
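A list of this shape can be summarised with base R. The sketch below uses a hand-made stand-in named info rather than real output; the values, and the assumption that a mail starting a thread has depth 1, are illustrative only.

```r
# Hand-made stand-in for a result with the two documented components;
# the values are illustrative, not output from real data.
info <- list(ThreadID    = c(1L, 1L, 2L, 1L, 2L, 3L),
             ThreadDepth = c(1L, 2L, 1L, 3L, 2L, 1L))

# Number of mails per thread
thread_sizes <- table(info$ThreadID)

# Deepest reply level reached in each thread
max_depth <- tapply(info$ThreadDepth, info$ThreadID, max)
```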
Examples
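library("tm")
library("tm.plugin.mail")  # provides readMail() and threads()
# Read the sample mails shipped with the package into a corpus
newsgroup <- system.file("mails", package = "tm.plugin.mail")
news <- VCorpus(DirSource(newsgroup),
                readerControl = list(reader = readMail))
# Message IDs and "References" headers inspected by threads()
vapply(news, meta, "id", FUN.VALUE = "")
lapply(news, function(x) meta(x, "header")$References)
# Thread ID and reply depth for each mail
(info <- threads(news))
# Number of mails in each thread
lengths(split(news, info$ThreadID))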
[Package tm.plugin.mail version 0.2-2 Index]