create.MCCCAdata {mccca}    R Documentation

Create a list (class: mcccadata) to be passed to MCCCA

Description

Creates a list (named mcccadata.list) to be passed to the MCCCA function.

Usage

create.MCCCAdata(dat, ext.mat = ext.mat, clstr0.vec = NULL)

Arguments

dat

An (N x J) matrix of categorical data (N: the number of observations, J: the number of variables). If rownames(dat) is NULL, the row names are set to c(obj1, ..., objN).

ext.mat

An (N x H) matrix of external variables (H: the number of external variables).

clstr0.vec

An integer vector of length N giving each observation's true cluster (optional; the default is NULL).
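The expected input shapes can be sketched in base R as follows. The toy values are made up for illustration, and the row-name defaulting shown here is only a reading of the description of dat above, not the package's actual code.

```r
# Toy inputs with the shapes described above (all values are illustrative).
N <- 6; J <- 3; H <- 2
set.seed(1)
dat <- matrix(sample(1:3, N * J, replace = TRUE), nrow = N, ncol = J)
ext.mat <- cbind(sex = rep(c("F", "M"), each = 3),
                 age = rep(c("young", "old", "mid"), times = 2))
clstr0.vec <- rep(1:2, times = 3)

# Assumed default naming when rownames(dat) is NULL: obj1, ..., objN.
if (is.null(rownames(dat))) {
  rownames(dat) <- paste0("obj", seq_len(N))
}

# All three inputs must describe the same N observations.
stopifnot(nrow(dat) == nrow(ext.mat), length(clstr0.vec) == nrow(dat))
```

Checking the dimensions up front in this way makes a mismatch between dat, ext.mat and clstr0.vec visible before the list is built.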

Value

Returns a list with the following elements.

data.mat

The data matrix (same as dat).

data.list

A list of C categorical data matrices, one per class (C: the number of classes). The matrix for class c is (N_c x J), where N_c is the number of observations in that class (see N.vec).

clstr0.list

A list of C vectors, one per class, giving the true cluster (taken from clstr0.vec) of each observation in that class (NULL if clstr0.vec is NULL).

N.vec

A vector of length C giving the number of observations in each class.

Ktrue.vec

A vector of length C giving the true number of clusters in each class (NULL if clstr0.vec is NULL).

q.vec

A vector of length J giving the number of categories in each of J categorical variables.

class.n.vec

An integer vector of length N (values in 1:C) giving the class index of each observation, with names(class.n.vec) = rownames(dat).

classname.n.vec

A character vector of length N giving the class label of each observation, with names(classname.n.vec) = rownames(dat).

classlabel

A character vector of length C giving the class label of each class.

classlab.mat

A (C x (H+1)) table showing which combination of external variable categories each class index and class label corresponds to. The first H columns give the category of each of the H external variables, and the last, (H+1)th, column gives the corresponding class label (same as classlabel).

oriindex.list

A list of length C. Each element is a vector with one entry per row (observation) of the corresponding matrix in data.list, giving the row of data.mat that the observation comes from.
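To make the relationship between these elements concrete, the following base R sketch derives analogous objects from toy data. It assumes that each distinct combination of external variable categories defines one class, as the description of classlab.mat suggests. The label format, the ordering of classes, and all toy values are assumptions; this is not the package's implementation.

```r
# Toy data: N = 6 observations, J = 2 categorical variables, H = 2 external variables.
dat <- matrix(c(1, 2, 1, 3, 2, 1,
                2, 2, 1, 1, 3, 3), ncol = 2,
              dimnames = list(paste0("obj", 1:6), c("v1", "v2")))
ext.mat <- data.frame(sex  = c("F", "F", "M", "M", "F", "M"),
                      area = c("a", "b", "a", "a", "a", "b"))
clstr0.vec <- c(1, 2, 1, 2, 2, 2)

# One class per distinct combination of external categories
# (the "sex_area" label format is an assumption made for this sketch).
classname.n.vec <- paste(ext.mat$sex, ext.mat$area, sep = "_")
names(classname.n.vec) <- rownames(dat)
classlabel <- unique(unname(classname.n.vec))

# Integer class index in 1:C for each observation.
class.n.vec <- match(classname.n.vec, classlabel)
names(class.n.vec) <- rownames(dat)

# Analogue of classlab.mat: external categories per class plus the class label.
classlab.mat <- cbind(unique(ext.mat), classlabel = classlabel)

# Analogue of oriindex.list: original row numbers of the observations in each class.
oriindex.list <- split(seq_len(nrow(dat)),
                       factor(classname.n.vec, levels = classlabel))

# Analogue of data.list: one data matrix per class.
data.list <- lapply(oriindex.list, function(idx) dat[idx, , drop = FALSE])

# Number of observations and number of true clusters in each class.
N.vec <- lengths(oriindex.list)
Ktrue.vec <- sapply(oriindex.list,
                    function(idx) length(unique(clstr0.vec[idx])))

# Number of observed categories in each of the J variables.
q.vec <- apply(dat, 2, function(x) length(unique(x)))
```

In this toy example, data.list[["M_a"]] holds rows 3 and 4 of dat, and oriindex.list[["M_a"]] records exactly those row numbers, which is the mapping back to data.mat described for oriindex.list.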

References

Takagishi, M. & van de Velden, M. (2022). Visualizing Class Specific Heterogeneous Tendencies in Categorical Data. Journal of Computational and Graphical Statistics. DOI: 10.1080/10618600.2022.2035737

Examples

library(mccca)

# settings
N <- 100; J <- 5; Ktrue <- 2; q.vec <- rep(5, J); noise.prop <- 0.2
extcate.vec <- c(2, 3)  # the number of categories for each external variable

# generate categorical variable data
catedata.list <- generate.onedata(N = N, J = J, Ktrue = Ktrue, q.vec = q.vec, noise.prop = noise.prop)
data.cate <- catedata.list$data.mat
clstr0.vec <- catedata.list$clstr0.vec

# generate external variable data
data.ext <- generate.ext(N, extcate.vec = extcate.vec)

# create the mcccadata list to be passed to the MCCCA function
mccca.data <- create.MCCCAdata(data.cate, ext.mat = data.ext, clstr0.vec = clstr0.vec)

# check which class each observation belongs to (given by class name)
mccca.data$classname.n.vec

# table showing which combination of external variable categories
# each class index and class name corresponds to
mccca.data$classlab.mat

[Package mccca version 1.1.0.1 Index]