convert {cabootcrs}    R Documentation

Converting a data matrix from one format into another

Description

convert recodes a data matrix from one of the formats used by versions of correspondence analysis into another. The supported formats are: n objects by p variables, counts for distinct combinations of the p variables, indicator matrix, contingency table (input only) and doubled matrix (output only).

Usage

convert(
  Xinput,
  input = "nbyp",
  output = "indicator",
  Jk = NULL,
  maxcat = NULL,
  varandcat = TRUE
)

Arguments

Xinput

A data matrix, in the form of a data frame or similar

input

The format of the input matrix:

"nbyp"

An n by p matrix of n individuals (objects, data points) by p categorical variables. Each row is a different data point and each column holds the category of that data point on one variable; the categories can be numbers, strings or factors

"nbypcounts"

Similar to the above, but each row represents all of the data points taking the same combination of categories, and the first column contains the count for that combination. The name is therefore a bit of a misnomer, but it emphasises the similarity to an n by p matrix

"indicator"

An indicator matrix, similar to the n by p matrix except that a variable with J_k categories is represented by J_k columns, and a data point taking the i-th category has a 1 in the i-th of these columns and a 0 in the others

"CT"

A contingency table of counts
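
As a rough illustration of how the "nbyp", "nbypcounts" and "CT" layouts described above relate to one another, the following base R sketch builds each from a small toy data set. It does not call convert and is not the package's implementation; the data frame, its column names and the choice to drop zero-count combinations are assumptions made purely for illustration.

```r
# Hypothetical toy data: 5 data points on p = 2 categorical variables ("nbyp" layout)
nbyp <- data.frame(
  colour = c("red", "blue", "red", "red", "blue"),
  size   = c("S", "L", "S", "L", "L"),
  stringsAsFactors = TRUE
)

# Contingency table of counts ("CT" layout; possible here because p = 2)
ct <- table(nbyp)

# One row per distinct combination, with the count in the first column
# ("nbypcounts" layout). Dropping zero-count rows is an assumption of this sketch.
counts <- as.data.frame(ct, responseName = "count")
counts <- counts[counts$count > 0, c("count", "colour", "size")]

# Expanding each row by its count recovers one row per data point
expanded <- counts[rep(seq_len(nrow(counts)), counts$count), c("colour", "size")]
```

The counts always sum to n, and tabulating the expanded rows gives back the same contingency table.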

output

The format of the output matrix:

"nbyp"

As above

"nbypcounts"

As above

"indicator"

As above

"doubled"

Similar to indicator but each variable is now represented by 2 columns, and a data point taking the i-th category for a variable with J_k categories is given the value J_k - i in the first (low) column and i - 1 in the second (high) column
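
The indicator and doubled codings described above can be sketched directly in base R. This is a minimal illustration under stated assumptions, not the code used by convert: the toy data, the variable names q1 and q2, and the "low"/"high" column labels are all hypothetical.

```r
# Hypothetical Likert-style data: 4 data points, p = 2 variables, categories 1..3
nbyp <- data.frame(q1 = c(1, 3, 2, 3), q2 = c(2, 2, 1, 3))
Jk <- c(3, 3)  # number of categories per variable (assumed known)

# Indicator coding: variable k becomes Jk[k] columns with a single 1 per row
indicator <- do.call(cbind, lapply(seq_along(nbyp), function(k) {
  block <- outer(nbyp[[k]], seq_len(Jk[k]), FUN = "==") * 1
  colnames(block) <- paste(names(nbyp)[k], seq_len(Jk[k]), sep = ":")
  block
}))

# Doubled coding: category i out of Jk[k] becomes (Jk[k] - i, i - 1)
# in the low and high columns respectively
doubled <- do.call(cbind, lapply(seq_along(nbyp), function(k) {
  block <- cbind(Jk[k] - nbyp[[k]], nbyp[[k]] - 1)
  colnames(block) <- paste(names(nbyp)[k], c("low", "high"), sep = ":")
  block
}))
```

In this sketch every row of the indicator matrix sums to p, and every row of the doubled matrix sums to the total of J_k - 1 over the variables, which gives a quick sanity check on either coding.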

Jk

A list containing the number of distinct categories for each variable.
Either Jk or maxcat must be specified if input is "indicator"

maxcat

The maximum category value, for use when all variables are Likert on a scale of 1 to maxcat.
Either Jk or maxcat must be specified if input is "indicator"

varandcat

Flag for how to construct column names in an indicator matrix:

TRUE

for use when many variables share the same categories (e.g. Likert items); column names will be varname:catname

FALSE

for use when the variables have distinct categories; column names will just be the category names
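
To see why the varandcat flag matters, the following base R sketch builds both styles of column name for two variables that share the same categories. The variable and category names are hypothetical, and the code only mimics the naming pattern described above rather than calling convert.

```r
# Hypothetical variable and category names for p = 2 variables
varnames <- c("q1", "q2")
catnames <- list(c("low", "mid", "high"), c("low", "mid", "high"))

# varandcat = TRUE style: prefix each category name with its variable name
names_true <- unlist(lapply(seq_along(varnames), function(k) {
  paste(varnames[k], catnames[[k]], sep = ":")
}))

# varandcat = FALSE style: category names only
names_false <- unlist(catnames)

# Shared categories give duplicated names unless the variable name is included
has_clash_true  <- anyDuplicated(names_true) > 0
has_clash_false <- anyDuplicated(names_false) > 0
```

Here has_clash_true is FALSE and has_clash_false is TRUE, which is the situation where varname:catname names are the safer choice.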

Value

A list containing:

result

the output data matrix formatted according to the output argument

varnames

a list of length p containing the names of each variable

catnames

a list/array of length p whose i-th element is the list (of length Jk[i]) of category names for variable i

Jk

a list of length p containing the number of distinct categories for each variable

p

the number of variables

See Also

getBurt to obtain a Burt matrix or a subset of an existing one
getCT to obtain a contingency table (only if p=2)
getindicator to obtain an indicator matrix
getdoubled to obtain a doubled matrix if all variables are ordered categorical with numbered categories

Other conversion functions: getBurt(), getCT(), getdoubled(), getindicator()

Examples

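# Convert the DreamData contingency table to n by p format
dreamdataCT <- DreamData
dreamdatanbyplist <- convert(dreamdataCT,input="CT",output="nbyp")
# The converted matrix is the result element of the returned list
dreamdatanbyp <- dreamdatanbyplist$result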

## Not run: 

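# Rebuild the contingency table from the n by p data
dreamdataCTb <- table(dreamdatanbyp)
# n by p -> counts for distinct combinations -> indicator matrix
dreamdatanbypcounts <- convert(dreamdatanbyp,input="nbyp",output="nbypcounts")$result
dreamdataindicatorlist <- convert(dreamdatanbypcounts,input="nbypcounts",output="indicator")
# indicator -> n by p; Jk (or maxcat) must be given when input is "indicator"
dreamdatanbypb <- convert(dreamdataindicatorlist$result,input="indicator",
                          output="nbyp",Jk=dreamdataindicatorlist$Jk)$result

# Default conversion (n by p to indicator), then form the Burt matrix
nishdatanbyp <- NishData
nishdataindicator <- convert(nishdatanbyp)$result
nishdataBurt <- t(nishdataindicator)%*%nishdataindicator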

## End(Not run)


[Package cabootcrs version 2.1.0 Index]