data.simulation {varclust} | R Documentation |
Simulates subspace clustering data
Description
Generates simulation data with a low-rank subspace structure: variables are clustered and each cluster has a low-rank representation. Factors that span the subspaces are not shared between clusters.
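The structure described above can be illustrated with a minimal base-R sketch. This is not the varclust implementation: the Gaussian factors and loadings, the variable names, and the fixed `dims` vector are assumptions made for illustration only. Each cluster gets its own factors, so nothing is shared between clusters, and each block of variables has rank at most the dimension of its subspace.

```r
# Illustrative sketch only; NOT the code used by data.simulation().
set.seed(1)
n <- 50          # number of individuals
dims <- c(2, 1)  # assumed subspace dimensions, one per cluster
numb.vars <- 4   # number of variables in each cluster

blocks <- lapply(dims, function(d) {
  # Factors private to this cluster (n x d); Gaussian draws are an assumption
  factors <- matrix(rnorm(n * d), nrow = n, ncol = d)
  # Random loadings (d x numb.vars); also an assumption
  loadings <- matrix(rnorm(d * numb.vars), nrow = d, ncol = numb.vars)
  # Noise-free variables of this cluster: n x numb.vars, rank at most d
  factors %*% loadings
})

# Noise-free data for all clusters, side by side
signals <- do.call(cbind, blocks)

# Numerical rank of each cluster's block
block.ranks <- sapply(blocks, function(b) qr(b)$rank)
```

Here `block.ranks` can be compared with `dims` to confirm that each cluster of variables lies in a subspace of the intended dimension.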
Usage
data.simulation(n = 100, SNR = 1, K = 10, numb.vars = 30,
max.dim = 2, min.dim = 1, equal.dims = TRUE)
Arguments
n |
An integer, number of individuals. |
SNR |
A numeric, signal-to-noise ratio, measured as the ratio of the variance of a variable (an element of a subspace) to the variance of the noise. |
K |
An integer, number of subspaces. |
numb.vars |
An integer, number of variables in each subspace. |
max.dim |
An integer. If equal.dims is TRUE, max.dim is the dimension of each subspace. If equal.dims is FALSE, subspace dimensions are drawn from a uniform distribution on [min.dim, max.dim]. |
min.dim |
An integer, minimal dimension of a subspace. |
equal.dims |
A boolean, if TRUE (the default), all clusters have the same dimension. |
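How equal.dims, min.dim, max.dim and SNR interact can be sketched in base R as follows. This is a hedged illustration, not the package's code: it assumes that the uniform distribution is over the integers min.dim, ..., max.dim, and that the noise-free variables are standardized to unit variance, so that the noise variance equals 1 / SNR. The objects `signal` and `noise.sd` are hypothetical names introduced here.

```r
# Illustrative sketch only; NOT the code used by data.simulation().
set.seed(2)
K <- 5
min.dim <- 1
max.dim <- 3
equal.dims <- FALSE

dims <- if (equal.dims) {
  rep(max.dim, K)  # every subspace has dimension max.dim
} else {
  # Integer draws from min.dim..max.dim (assumed discrete uniform);
  # sample.int() avoids the sample(x) pitfall when min.dim == max.dim
  min.dim + sample.int(max.dim - min.dim + 1, K, replace = TRUE) - 1
}

SNR <- 2
# Stand-in for noise-free variables, standardized to unit variance (assumption)
signal <- scale(matrix(rnorm(50 * 4), nrow = 50, ncol = 4))
# With unit-variance signal, SNR = 1 / noise variance
noise.sd <- sqrt(1 / SNR)
X <- signal + matrix(rnorm(length(signal), sd = noise.sd), nrow = nrow(signal))
```

Under these assumptions a larger SNR gives a smaller `noise.sd`, so the columns of `X` stay closer to their subspaces.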
Value
A list consisting of:
X |
matrix, generated data |
signals |
matrix, data without noise |
dims |
vector, dimensions of subspaces |
factors |
matrix, columns of which span subspaces |
s |
vector, true partition of variables |
Examples
sim.data <- data.simulation()
sim.data2 <- data.simulation(n = 30, SNR = 2, K = 5, numb.vars = 20,
max.dim = 3, equal.dims = FALSE)