data.simulation.factors {varclust}    R Documentation

Simulates subspace clustering data with shared factors

Description

Generates simulated data with a low-rank subspace structure: variables are grouped into clusters, and each cluster has a low-rank representation. The factors that span the subspaces are shared between clusters.

Usage

data.simulation.factors(n = 100, SNR = 1, K = 10, numb.vars = 30,
  numb.factors = 10, min.dim = 1, max.dim = 2, equal.dims = TRUE,
  separation.parameter = 0.1)

Arguments

n

An integer, number of individuals.

SNR

A numeric, signal-to-noise ratio, measured as the ratio of the variance of a variable (an element of a subspace) to the variance of the noise.

K

An integer, number of subspaces.

numb.vars

An integer, number of variables in each subspace.

numb.factors

An integer, number of factors from which the bases of the subspaces are drawn.

min.dim

An integer, minimal dimension of a subspace.

max.dim

An integer. If equal.dims is TRUE, max.dim is the dimension of every subspace. If equal.dims is FALSE, the subspace dimensions are drawn from a uniform distribution on [min.dim, max.dim].

equal.dims

A boolean. If TRUE (the default), all clusters have the same dimension.

separation.parameter

A numeric, the coefficients of the variables in each subspace basis are drawn from the range [separation.parameter, 1].
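The interplay of min.dim, max.dim, equal.dims and separation.parameter can be illustrated with a small base R sketch. The helper draw_dims is hypothetical and not part of varclust. The sketch assumes that "uniform distribution on [min.dim, max.dim]" means a discrete uniform draw over the integers in that range, and that coefficients are drawn uniformly from the stated range. The package's actual implementation may differ.

```r
# Hypothetical helper for illustration only; not a varclust function.
draw_dims <- function(K, min.dim, max.dim, equal.dims) {
  if (equal.dims) {
    # Every subspace gets dimension max.dim.
    return(rep(max.dim, K))
  }
  # Index into the candidate vector so that min.dim == max.dim is handled
  # correctly (sample(x, ...) treats a single number x as 1:x).
  candidates <- min.dim:max.dim
  candidates[sample.int(length(candidates), K, replace = TRUE)]
}

set.seed(1)
dims.equal  <- draw_dims(K = 5, min.dim = 1, max.dim = 3, equal.dims = TRUE)
dims.varied <- draw_dims(K = 5, min.dim = 1, max.dim = 3, equal.dims = FALSE)

# Assumed coefficient draw for one subspace of dimension 2 with 4 variables:
# every entry lies in [separation.parameter, 1].
separation.parameter <- 0.2
coefs <- matrix(runif(2 * 4, min = separation.parameter, max = 1), nrow = 2)
```

With equal.dims = TRUE the value of min.dim plays no role in this sketch, which matches the description of max.dim above.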

Value

A list consisting of:

X

matrix, generated data

signals

matrix, data without noise

factors

matrix, columns of which span subspaces

indices

list of vectors, indices of factors that span subspaces

dims

vector, dimensions of subspaces

s

vector, true partition of variables
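To make the relationship between the arguments and the returned components concrete, the following base R sketch builds a list with the same component names. It is an assumption-laden illustration, not the varclust source: the function simulate_sketch is hypothetical, signals are standardized to unit variance, and the noise variance is taken to be 1 / SNR so that the variance ratio equals SNR.

```r
# Hypothetical sketch of the generative model described above;
# NOT the implementation used by varclust.
simulate_sketch <- function(n = 100, SNR = 1, K = 3, numb.vars = 5,
                            numb.factors = 4, dims = rep(2, K),
                            separation.parameter = 0.1) {
  # Shared pool of factors: one column per factor.
  factors <- matrix(rnorm(n * numb.factors), nrow = n)
  indices <- vector("list", K)
  signals <- matrix(0, nrow = n, ncol = K * numb.vars)
  # True partition: variable j belongs to cluster s[j].
  s <- rep(seq_len(K), each = numb.vars)
  for (k in seq_len(K)) {
    # Each cluster picks dims[k] factors from the shared pool.
    indices[[k]] <- sample.int(numb.factors, dims[k])
    coefs <- matrix(runif(dims[k] * numb.vars, separation.parameter, 1),
                    nrow = dims[k])
    block <- factors[, indices[[k]], drop = FALSE] %*% coefs
    # Assumption: standardize so every signal column has variance 1.
    signals[, s == k] <- scale(block)
  }
  # Assumption: noise variance 1 / SNR, so var(signal) / var(noise) = SNR.
  noise <- matrix(rnorm(n * K * numb.vars, sd = sqrt(1 / SNR)), nrow = n)
  list(X = signals + noise, signals = signals, factors = factors,
       indices = indices, dims = dims, s = s)
}

set.seed(42)
sketch <- simulate_sketch()
dim(sketch$X)      # n rows, K * numb.vars columns
table(sketch$s)    # numb.vars variables per cluster
```

Because two clusters may pick the same column of factors, their subspaces can overlap, which is what "shared factors" refers to in the description.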

Examples

sim.data <- data.simulation.factors()
sim.data2 <- data.simulation.factors(n = 30, SNR = 2, K = 5, numb.vars = 20,
             numb.factors = 10, max.dim = 3, equal.dims = FALSE, separation.parameter = 0.2)
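A further example, inspecting the returned list, might look as follows. It runs only when varclust is installed. The component names come from the Value section; the object name sim.data3 and the argument values are arbitrary choices for illustration.

```r
# Guarded so the snippet is a no-op when varclust is not installed.
if (requireNamespace("varclust", quietly = TRUE)) {
  sim.data3 <- varclust::data.simulation.factors(n = 50, K = 4, numb.vars = 10)
  str(sim.data3, max.level = 1)   # top-level components of the returned list
  print(sim.data3$dims)           # dimension of each subspace
  print(table(sim.data3$s))       # number of variables assigned to each cluster
}
```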

[Package varclust version 0.9.4 Index]