read_baskets {arulesSequences} | R Documentation |
Read Transaction Data
Description
Read transaction data in basket format (with additional temporal
or other information) and create an object of class transactions.
Usage
read_baskets(con, sep = "[ \t]+", info = NULL, iteminfo = NULL,
encoding = "unknown")
Arguments
con |
an object of class connection or a file name. |
sep |
a regular expression specifying how fields are separated in the data file. |
info |
a character vector specifying the header for columns with additional transaction information. |
iteminfo |
a data frame specifying (additional) item information. |
encoding |
a character string indicating the encoding, which is passed to readLines. |
Details
Each line of text represents a transaction where items are
separated by a pattern matching the regular expression specified
by sep.
Columns with additional information, such as customer or time (event)
identifiers, must come before any item identifiers, must be separated
by sep, and must be named in info.
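To make the parsing rule concrete, the following base R sketch splits one basket line with the default sep pattern and separates the leading info columns from the items. It does not call read_baskets and only approximates the behaviour described above; the sample line and all variable names are made up for illustration.

```r
# Illustrative only: mimic the field splitting described above with base R.
line <- "1 10 2 C D"                       # hypothetical basket line
sep  <- "[ \t]+"                           # default separator pattern
info <- c("sequenceID", "eventID", "SIZE") # headers for the leading columns

fields <- strsplit(line, sep)[[1]]

# The first length(info) fields carry transaction information ...
info_fields <- setNames(fields[seq_along(info)], info)
# ... and everything after them is treated as an item identifier.
items <- fields[-seq_along(info)]

info_fields
items
```

Because the info columns are taken by position, the order of the names in info has to follow the order of the columns in the file.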
Sequential data are identified by the presence of the column identifiers "sequenceID" (sequence or customer identifier) and "eventID" (time or event identifier) of transactionInfo.
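As a rough sketch of how sequential data might be prepared, the snippet below writes a small basket file to a temporary location and reads it back with the two identifier columns named in info. The file contents are hypothetical, and the snippet assumes that arulesSequences is installed and that loading it also attaches arules (which provides transactionInfo).

```r
library(arulesSequences)  # assumed installed; attaches arules as a dependency

# Hypothetical data: sequenceID, eventID, then the items of each event.
lines <- c("1 10 C D",
           "1 15 A B C",
           "2 15 A B F")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

x <- read_baskets(con = tmp, info = c("sequenceID", "eventID"))

# The info columns should now be available as transaction information.
transactionInfo(x)

unlink(tmp)
```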
The row names of iteminfo must match the item identifiers
present in the data. However, iteminfo need not contain a
labels column.
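One way to check the row-name requirement before reading is sketched below in base R. It is a hypothetical pre-check and not part of the package: it extracts the item identifiers from a few made-up basket lines and compares them with the row names of an example iteminfo data frame that has no labels column.

```r
# Illustrative pre-check, base R only; data and names are hypothetical.
lines <- c("1 10 C D", "1 15 A B C")
info  <- c("sequenceID", "eventID")

# Item identifiers are the fields that follow the info columns.
fields <- strsplit(lines, "[ \t]+")
items  <- unique(unlist(lapply(fields, function(f) f[-seq_along(info)])))

# An iteminfo data frame without a labels column; row names are the item ids.
iteminfo <- data.frame(category = c("x", "x", "y", "y"),
                       row.names = c("A", "B", "C", "D"))

# Items in the data that have no matching row in iteminfo (ideally none).
missing_items <- setdiff(items, rownames(iteminfo))
missing_items
```

An empty result suggests that iteminfo covers every item in the data; any identifiers listed would need a row added before the data frame is passed as iteminfo.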
Value
An object of class transactions.
Note
The item labels are ordered by their first appearance in the data.
Author(s)
Christian Buchta
See Also
Class timedsequences, transactions, function cspade.
Examples
## read example data
x <- read_baskets(con = system.file("misc", "zaki.txt",
                                    package = "arulesSequences"),
                  info = c("sequenceID", "eventID", "SIZE"))
as(x, "data.frame")
## Not run:
## calendar dates
transactionInfo(x)$Date <-
    as.Date(transactionInfo(x)$eventID, origin = "2015-04-01")
transactionInfo(x)
all.equal(transactionInfo(x)$eventID,
          as.integer(transactionInfo(x)$Date - as.Date("2015-04-01")))
## End(Not run)