data.simulation.factors {varclust} | R Documentation |
Simulates subspace clustering data with shared factors
Description
Generates simulated data with a low-rank subspace structure: variables are grouped into clusters, and each cluster has a low-rank representation. The factors that span the subspaces are shared between clusters.
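The structure described above can be illustrated with a few lines of base R. This is only a minimal sketch under stated assumptions, not the varclust implementation: the variable names (n_factors, vars_per_cluster, dim_per_cluster) and the standard normal coefficients are chosen here for illustration.

```r
# Illustrative sketch only: NOT the varclust implementation.
set.seed(1)
n <- 50                 # individuals (rows)
n_factors <- 4          # size of the shared factor pool (assumed name)
K <- 3                  # number of clusters (subspaces)
vars_per_cluster <- 5   # variables per cluster (assumed name)
dim_per_cluster <- 2    # dimension of every subspace (assumed name)

# Pool of factors shared by all clusters
factors <- matrix(rnorm(n * n_factors), nrow = n)

# Each cluster picks a subset of the shared factors as its basis
indices <- lapply(seq_len(K), function(k) sample(n_factors, dim_per_cluster))

# Variables in a cluster are linear combinations of that cluster's basis
signals <- do.call(cbind, lapply(indices, function(idx) {
  coefs <- matrix(rnorm(dim_per_cluster * vars_per_cluster),
                  nrow = dim_per_cluster)
  factors[, idx, drop = FALSE] %*% coefs
}))

# Numerical rank of each cluster's block of columns
block_ranks <- sapply(seq_len(K), function(k) {
  cols <- ((k - 1) * vars_per_cluster + 1):(k * vars_per_cluster)
  qr(signals[, cols])$rank
})
```

Because two clusters may pick the same factor, their subspaces can overlap, which is what "shared factors" refers to.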
Usage
data.simulation.factors(n = 100, SNR = 1, K = 10, numb.vars = 30,
numb.factors = 10, min.dim = 1, max.dim = 2, equal.dims = TRUE,
separation.parameter = 0.1)
Arguments
n |
An integer, number of individuals. |
SNR |
A numeric, signal-to-noise ratio, measured as the ratio of the variance of a variable belonging to a subspace to the variance of the noise. |
K |
An integer, number of subspaces. |
numb.vars |
An integer, number of variables in each subspace. |
numb.factors |
An integer, number of factors from which the subspace bases are drawn. |
min.dim |
An integer, minimum dimension of a subspace. |
max.dim |
An integer. If equal.dims is TRUE, max.dim is the dimension of every subspace. If equal.dims is FALSE, the subspace dimensions are drawn from a uniform distribution on [min.dim, max.dim]. |
equal.dims |
A boolean. If TRUE (the default), all clusters have the same dimension. |
separation.parameter |
A numeric, coefficients of the variables in each subspace basis are drawn from the range [separation.parameter, 1]. |
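The argument descriptions above can be read as the following small helpers. They are hypothetical and written only to illustrate the documented semantics. In particular, the discrete uniform draw of dimensions, the uniform draw of coefficients, and the helper names (draw_dims, draw_coefs, noise_sd) are assumptions, not part of varclust.

```r
# Hypothetical helpers illustrating the documented argument semantics;
# these are not functions from varclust.
set.seed(2)

# Subspace dimensions: all equal to max.dim, or drawn uniformly
# from the integers min.dim..max.dim (assumed discrete uniform)
draw_dims <- function(K, min.dim, max.dim, equal.dims) {
  if (equal.dims) {
    rep(max.dim, K)
  } else {
    # sample.int avoids sample()'s special case for a length-one range
    min.dim + sample.int(max.dim - min.dim + 1, K, replace = TRUE) - 1
  }
}

# Coefficients drawn from [separation.parameter, 1] (assumed uniform)
draw_coefs <- function(d, numb.vars, separation.parameter) {
  matrix(runif(d * numb.vars, min = separation.parameter, max = 1), nrow = d)
}

# Noise standard deviation implied by SNR = var(signal) / var(noise)
noise_sd <- function(signal_var, SNR) sqrt(signal_var / SNR)

dims_equal <- draw_dims(K = 5, min.dim = 1, max.dim = 3, equal.dims = TRUE)
dims_mixed <- draw_dims(K = 5, min.dim = 1, max.dim = 3, equal.dims = FALSE)
coefs <- draw_coefs(d = 2, numb.vars = 4, separation.parameter = 0.2)
```

Under this reading, a larger separation.parameter keeps the coefficients away from zero, and a larger SNR means less noise relative to the signal.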
Value
A list consisting of:
X |
matrix, generated data |
signals |
matrix, data without noise |
factors |
matrix, columns of which span subspaces |
indices |
list of vectors, indices of factors that span subspaces |
dims |
vector, dimensions of subspaces |
s |
vector, true partition of variables |
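One possible way to look at the returned list is sketched below. It assumes the varclust package is installed and is skipped otherwise. The element names are taken from the list above, and nothing is assumed here about the exact shapes of the matrices.

```r
# Runs only if varclust is available; otherwise the block is skipped.
if (requireNamespace("varclust", quietly = TRUE)) {
  sim <- varclust::data.simulation.factors(n = 40, K = 3, numb.vars = 5)

  # Top-level overview of the returned elements
  str(sim, max.level = 1)

  # Number of variables assigned to each cluster in the true partition
  print(table(sim$s))

  # Dimension and spanning factor indices of each subspace
  print(sim$dims)
  print(sim$indices)
}
```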
Examples
sim.data <- data.simulation.factors()
sim.data2 <- data.simulation.factors(n = 30, SNR = 2, K = 5, numb.vars = 20,
numb.factors = 10, max.dim = 3, equal.dims = FALSE, separation.parameter = 0.2)