itemMatrix-class {arules} | R Documentation |
Class itemMatrix — Sparse Binary Incidence Matrix to Represent Sets of Items
Description
The itemMatrix class is the basic building block for transactions
and associations. The class contains a sparse Matrix representation
of a set of itemsets and the corresponding item labels.
Usage
## S4 method for signature 'itemMatrix'
summary(object, maxsum = 6, ...)
## S4 method for signature 'itemMatrix'
dim(x)
nitems(x, ...)
## S4 method for signature 'itemMatrix'
nitems(x)
## S4 method for signature 'itemMatrix'
length(x)
toLongFormat(from, ...)
## S4 method for signature 'itemMatrix'
toLongFormat(from, cols = c("ID", "item"), decode = TRUE)
## S4 method for signature 'itemMatrix'
labels(object, itemSep = ",", setStart = "{", setEnd = "}")
itemLabels(object, ...)
itemLabels(object) <- value
## S4 method for signature 'itemMatrix'
itemLabels(object)
## S4 replacement method for signature 'itemMatrix'
itemLabels(object) <- value
itemInfo(object)
itemInfo(object) <- value
## S4 method for signature 'itemMatrix'
itemInfo(object)
## S4 replacement method for signature 'itemMatrix'
itemInfo(object) <- value
itemsetInfo(object)
itemsetInfo(object) <- value
## S4 method for signature 'itemMatrix'
itemsetInfo(object)
## S4 replacement method for signature 'itemMatrix'
itemsetInfo(object) <- value
## S4 method for signature 'itemMatrix'
dimnames(x)
## S4 replacement method for signature 'itemMatrix,list'
dimnames(x) <- value
Arguments
object, x, from
the object.
maxsum
integer, how many items should be shown for the summary?
...
further parameters
cols
columns for the long format.
decode
decode item IDs to item labels.
itemSep
item separator symbol.
setStart
set start symbol.
setEnd
set end symbol.
value
replacement value
Details
Representation
Sets of itemsets are represented as a compressed sparse binary matrix. Conceptually, columns represent items and rows are the sets/transactions. In the compressed form, each itemset is a vector of column indices (called item IDs) representing the items.
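As a minimal sketch of the compressed form, assuming the arules package is installed and using a hypothetical toy list of three itemsets, the following compares the conceptual dense matrix with the item IDs kept for each itemset. It relies on LIST() with decode = FALSE returning column indices instead of labels.

```r
library(arules)

# Hypothetical toy data: three itemsets over the items a, b and c
sets <- list(c("a", "b"), c("b", "c"), "a")
x <- as(sets, "itemMatrix")

# Conceptual view: rows are itemsets, columns are items
as(x, "matrix")

# Compressed view: each itemset as a vector of item IDs (column indices)
ids <- LIST(x, decode = FALSE)
ids

# The item IDs index into the item labels
itemLabels(x)[ids[[2]]]
```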
Warning: Ideally, we would store the matrix as a row-oriented sparse
matrix (ngRMatrix), but the Matrix package provides better support for
column-oriented sparse classes (ngCMatrix). The matrix is therefore
internally stored in transposed form.
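The effect of the transposed storage can be checked by comparing dimensions. This is a small sketch on hypothetical toy data (four itemsets over three items), assuming arules is installed; the Matrix package is loaded explicitly so that t() works on the sparse matrix.

```r
library(arules)
library(Matrix)

# Hypothetical toy data: 4 itemsets over 3 items
x <- as(list(c("a", "b"), c("b", "c"), "a", "c"), "itemMatrix")

# itemMatrix view: itemsets x items
dim(x)

# Underlying sparse matrix: items x itemsets (transposed)
m <- as(x, "ngCMatrix")
dim(m)

# Transposing restores the itemsets x items orientation
dim(t(m))
```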
Working with several itemMatrix objects
If you work with several itemMatrix objects at the same time (e.g.,
several transaction sets, lhs and rhs of a rule, etc.), then the encoding
(itemLabels and order of the items in the binary matrix) in the different
itemMatrices is important and needs to conform. See itemCoding
to learn how to encode and recode itemMatrix objects.
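To illustrate what a conforming encoding means, here is a hedged sketch with two hypothetical itemMatrix objects built independently from lists. It assumes the compatible() and recode() functions described in itemCoding, and passes the target labels as a plain character vector.

```r
library(arules)

# Two hypothetical itemMatrix objects created independently
a <- as(list(c("milk", "bread"), "butter"), "itemMatrix")
b <- as(list(c("bread", "milk")), "itemMatrix")

# The encodings differ: a has three items, b only two
itemLabels(a)
itemLabels(b)
compatible(a, b)

# Recode b so that it uses the item labels (and item order) of a.
# All items in b must already occur in a for this to work.
b2 <- recode(b, itemLabels = itemLabels(a))
compatible(a, b2)
```

After recoding, b2 has the same columns as a, so the two objects can be used together, for example as lhs and rhs of rules.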
Functions
- summary(itemMatrix): show a summary.
- dim(itemMatrix): returns the number of rows (itemsets) and columns (items in the encoding).
- nitems(itemMatrix): returns the number of items in the encoding.
- length(itemMatrix): returns the number of itemsets (rows) in the matrix.
- toLongFormat(itemMatrix): convert the sets to long format (a data.frame with two columns, ID and item). Column names can be specified as a character vector of length 2 called cols.
- labels(itemMatrix): returns labels for the itemsets. The following arguments can be used to customize the representation of the labels: itemSep, setStart and setEnd.
- itemLabels(itemMatrix): returns the item labels used for encoding as a character vector.
- itemLabels(itemMatrix) <- value: replaces the item labels used for encoding.
- itemInfo(itemMatrix): returns the whole item/column information data.frame including labels.
- itemInfo(itemMatrix) <- value: replaces the item/column info by a data.frame.
- itemsetInfo(itemMatrix): returns the item set/row information data.frame.
- itemsetInfo(itemMatrix) <- value: replaces the item set/row info by a data.frame.
- dimnames(itemMatrix): returns a list with the dimname vectors.
- dimnames(x = itemMatrix) <- value: replace the dimnames.
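A few of the accessors listed above can be tried on a small object. The sketch below uses hypothetical toy data and assumes arules is installed; the column names passed to cols are arbitrary illustrative choices.

```r
library(arules)

# Hypothetical toy data: three itemsets over the items a, b and c
x <- as(list(c("a", "b"), c("b", "c"), "a"), "itemMatrix")

# Number of items in the encoding
nitems(x)

# Long format: one row per (itemset, item) pair, with custom column names
long <- toLongFormat(x, cols = c("set", "item"))
long

# Item information, including the labels column
itemInfo(x)

# Replace the item labels, then show the itemset labels
itemLabels(x) <- toupper(itemLabels(x))
labels(x)
```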
Slots
data
a sparse matrix of class ngCMatrix representing the itemsets. Warning: the matrix is stored in transposed form for efficiency reasons!
itemInfo
a data.frame
itemsetInfo
a data.frame
Objects from the Class
Objects can be created by calls of the form new("itemMatrix", ...).
However, most of the time objects will be created by coercion from a
matrix, list or data.frame.
Coercions
- as("matrix", "itemMatrix")
- as("itemMatrix", "matrix")
- as("list", "itemMatrix")
- as("itemMatrix", "list")
- as("itemMatrix", "ngCMatrix")
- as("ngCMatrix", "itemMatrix")
Warning: the ngCMatrix representation is transposed!
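The entries above name the source and target classes. In code, the coercion is written as as(object, "targetClass"). A minimal round-trip sketch on a hypothetical 2 x 3 logical matrix, assuming arules is installed:

```r
library(arules)

# Hypothetical logical matrix: 2 itemsets (rows) over 3 items (columns)
m <- matrix(c(TRUE, FALSE, TRUE,
              FALSE, TRUE, TRUE),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))

# matrix -> itemMatrix
x <- as(m, "itemMatrix")

# itemMatrix -> list of item labels per itemset
l <- as(x, "list")
l

# itemMatrix -> dense logical matrix again
back <- as(x, "matrix")
back
```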
Author(s)
Michael Hahsler
See Also
Other itemMatrix and transactions functions:
abbreviate(), crossTable(), c(), duplicated(), extract, hierarchy, image(), inspect(), is.superset(), itemFrequencyPlot(), itemFrequency(), match(), merge(), random.transactions(), sample(), sets, size(), supportingTransactions(), tidLists-class, transactions-class, unique()
Examples
set.seed(1234)
## Generate a logical matrix with 5000 random itemsets for 20 items
m <- matrix(runif(5000 * 20) > 0.8, ncol = 20,
  dimnames = list(NULL, paste("item", c(1:20), sep = "")))
head(m)
## Coerce the logical matrix into an itemMatrix object
imatrix <- as(m, "itemMatrix")
imatrix
## An itemMatrix contains a set of itemsets (each row is an itemset).
## The length of the set is the number of rows.
length(imatrix)
## The sparse matrix also has regular matrix dimensions.
dim(imatrix)
nrow(imatrix)
ncol(imatrix)
## Subsetting: Get first 5 elements (rows) of the itemMatrix. This can be done in
## several ways.
imatrix[1:5] ### get elements 1:5
imatrix[1:5, ] ### Matrix subsetting for rows 1:5
head(imatrix, n = 5) ### head()
## Get first 5 elements (rows) of the itemMatrix as list.
as(imatrix[1:5], "list")
## Get first 5 elements (rows) of the itemMatrix as matrix.
as(imatrix[1:5], "matrix")
## Get first 5 elements (rows) of the itemMatrix as sparse ngCMatrix.
## **Warning:** For efficiency reasons, the ngCMatrix is transposed! You
## can transpose it again to get the expected format.
as(imatrix[1:5], "ngCMatrix")
t(as(imatrix[1:5], "ngCMatrix"))
## Get labels for the first 5 itemsets (first default and then with
## custom formatting)
labels(imatrix[1:5])
labels(imatrix[1:5], itemSep = " + ", setStart = "", setEnd = "")
## Create itemsets manually from an itemMatrix. Itemsets contain items in the form of
## an itemMatrix and additional quality measures (not supplied in the example).
is <- new("itemsets", items = imatrix)
is
inspect(head(is, n = 3))
## Create rules manually. I use imatrix[4:6] for the lhs of the rules and
## imatrix[1:3] for the rhs. Rhs and lhs cannot share items so I use
## itemSetdiff here. I also assign missing values for the quality measures support
## and confidence.
rules <- new("rules",
  lhs = itemSetdiff(imatrix[4:6], imatrix[1:3]),
  rhs = imatrix[1:3],
  quality = data.frame(support = c(NA, NA, NA),
    confidence = c(NA, NA, NA)
  ))
rules
inspect(rules)
## Manually create an itemMatrix with an item encoding that matches imatrix (20 items in order
## item1, item2, ..., item20)
itemset_list <- list(c("item1","item2"),
  c("item3"))
imatrix_new <- encode(itemset_list, itemLabels = imatrix)
imatrix_new
compatible(imatrix_new, imatrix)