R: Mining Frequent Contiguous Sequential Patterns in a Text...

CSeqpat {CSeqpat}

R Documentation

Mining Frequent Contiguous Sequential Patterns in a Text Corpus

Description

Takes the path to a text corpus and a minimum support threshold, and mines the frequent contiguous sequential patterns (phrases) in that corpus.

Usage

CSeqpat(filepath, phraselenmin = 1, phraselenmax = 99999, minsupport = 1,
  docdelim, stopword = FALSE, stemword = FALSE, lower = FALSE,
  removepunc = FALSE)

Arguments

`filepath`	Path to the text file/text corpus
`phraselenmin`	Minimum number of words in a phrase
`phraselenmax`	Maximum number of words in a phrase
`minsupport`	Minimum absolute support for mining the patterns
`docdelim`	Document delimiter in the corpus (required; no default)
`stopword`	Logical; if TRUE, remove stopwords from the document corpus
`stemword`	Logical; if TRUE, perform stemming on the document corpus
`lower`	Logical; if TRUE, convert all words in the document corpus to lower case
`removepunc`	Logical; if TRUE, remove punctuation from the document corpus
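
As a rough illustration of what three of these options mean, the base-R sketch below splits a toy corpus on a delimiter, lower-cases it, and strips punctuation. It only approximates the behaviour described above and is an assumption about the intent of `docdelim`, `lower` and `removepunc`, not the package's internal code; the string `raw` is made up for the example.

```r
# Illustrative only: base-R approximations of three options,
# not the CSeqpat implementation.
raw <- "Fresh food, great place!\tThe food was FRESH."

# docdelim: split the corpus into documents on the delimiter (here a tab)
docs <- strsplit(raw, "\t", fixed = TRUE)[[1]]

# lower = TRUE: convert every word to lower case
docs <- tolower(docs)

# removepunc = TRUE: strip punctuation characters
docs <- gsub("[[:punct:]]", "", docs)

print(docs)
```

Stopword removal and stemming are omitted here because they typically rely on additional word lists or stemming libraries.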

Value

A data frame containing the frequent phrase patterns and their absolute support.
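
To make "contiguous phrase" and "absolute support" concrete, here is a minimal base-R sketch. The helper `count_phrases` and its column names `phrase` and `support` are hypothetical and not part of CSeqpat. It assumes support means the number of documents that contain a phrase; the package may count differently.

```r
# Hypothetical helper, not part of CSeqpat: for every contiguous phrase of
# nmin..nmax words, count the number of documents that contain it.
count_phrases <- function(docs, nmin = 1, nmax = 2, minsupport = 1) {
  per_doc <- lapply(docs, function(d) {
    w <- strsplit(trimws(d), "\\s+")[[1]]
    w <- w[nzchar(w)]
    phrases <- character(0)
    for (n in seq(nmin, nmax)) {
      if (length(w) < n) next
      starts <- seq_len(length(w) - n + 1)
      # Build each run of n adjacent words as one phrase
      phrases <- c(phrases, vapply(starts, function(i) {
        paste(w[i:(i + n - 1)], collapse = " ")
      }, character(1)))
    }
    unique(phrases)  # count each phrase at most once per document
  })
  support <- table(unlist(per_doc))
  support <- support[support >= minsupport]
  data.frame(phrase = names(support), support = as.integer(support),
             stringsAsFactors = FALSE)
}

docs <- c("hoagie institution food year road",
          "place little dated opened weekend fresh food")
res <- count_phrases(docs, nmin = 1, nmax = 2, minsupport = 2)
print(res)
```

With these two documents and a minimum support of 2, only the single word "food" qualifies, since no other word or two-word phrase occurs in both documents.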

Examples

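# Write two short lines of text to a temporary file
test1 <- c("hoagie institution food year road ",
           "place little dated opened weekend fresh food")
tf <- tempfile()
writeLines(test1, tf)
# Mine phrases of 1 to 2 words with an absolute support of at least 2
CSeqpat(tf, phraselenmin = 1, phraselenmax = 2, minsupport = 2,
        docdelim = "\t", stopword = TRUE, stemword = FALSE,
        lower = TRUE, removepunc = FALSE)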

[Package CSeqpat version 0.1.2 Index]