R: Simulate and Analyze LSBCLUST

sim_lsbclust {lsbclust}

R Documentation

Simulate and Analyze LSBCLUST

Description

Perform a single simulation run for the LSBCLUST model. Multiple data sets are generated for a single set of underlying parameters.

Usage

sim_lsbclust(ndata, nobs, size, nclust, clustsize = NULL,
  delta = rep(1L, 4L), ndim = 2L, alpha = 0.5, fixed = c("none",
  "rows", "columns"), err_sd = 1, svmins = 0.5, svmax = 5,
  seed = NULL, parallel = FALSE, parallel_data = TRUE, verbose = 0,
  nstart_T3 = 20L, nstart_ak = 20L, mc.cores = detectCores() - 1,
  include_fits = FALSE, include_data = FALSE, nstart, nstart.kmeans)

Arguments

`ndata`	Integer giving the number of data sets to generate with the same underlying parameters.
`nobs`	Integer giving the number of observations to sample.
`size`	Vector with two elements giving the number of rows and columns respectively of each simulated observation.
`nclust`	A vector of length four giving the number of clusters for the overall mean, the row margins, the column margins and the interactions (in that order) respectively. Alternatively, a vector of length one, in which case all components will have the same number of clusters.
`clustsize`	A list of length four, with each element containing a vector whose length equals the corresponding entry in `nclust`, giving the number of observations in each cluster. Each of these vectors must sum to `nobs`, or an error will result. Positional matching is used, in the order "overall", "rows", "columns" and "interactions". If `NULL`, all clusters will be of equal size.
`delta`	A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
`ndim`	The required rank for the approximation of the interactions (a scalar).
`alpha`	Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns (passed to `int.lsbclust`).
`fixed`	One of `"none"`, `"rows"` or `"columns"` indicating whether to fix neither set of coordinates, or whether to fix the row or column coordinates across clusters respectively. If a vector is supplied, only the first element will be used (passed to `int.lsbclust`).
`err_sd`	The standard deviation of the error distribution, as passed to `rnorm`.
`svmins`	Vector of minimum values for the singular values (as passed to `simsv`). Optionally, if all minima are equal, a single numeric value which will be expanded to the correct length.
`svmax`	The maximum possible singular value (as passed to `simsv`).
`seed`	An optional seed to be set for the random number generator.
`parallel`	Logical indicating whether to parallelize over random starts. Note that `parallel_data` takes precedence over this.
`parallel_data`	Logical indicating whether to parallelize over the data sets. If `FALSE`, parallelization is done over random starts (depending on `parallel`).
`verbose`	Integer giving the number of iterations after which the loss value is printed.
`nstart_T3`	The number of random starts to use for `T3Clusf`.
`nstart_ak`	The number of random starts to use for `akmeans`.
`mc.cores`	The number of cores to use, passed to `makeCluster`.
`include_fits`	Logical indicating whether to include the model fits, or only the fit statistics.
`include_data`	Logical indicating whether to include the simulated data that the models were fitted to, or only the results.
`nstart`	See the corresponding argument of `lsbclust`.
`nstart.kmeans`	See the corresponding argument of `lsbclust`.
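
The constraints on `nclust` and `clustsize` described above can be checked before starting a simulation. The following is a minimal sketch in base R. The specific cluster counts and sizes are made up for illustration, and the element names are an assumption for readability only, since the documentation states that matching is positional.

```r
# Illustrative values only; any sizes that sum to nobs would do
nobs <- 100L

# Number of clusters for overall mean, rows, columns and interactions
nclust <- c(2L, 3L, 3L, 4L)

# One vector of cluster sizes per component, in the documented order
clustsize <- list(
  overall      = c(60L, 40L),
  rows         = c(50L, 30L, 20L),
  columns      = c(40L, 40L, 20L),
  interactions = c(25L, 25L, 25L, 25L)
)

# Each vector must have nclust[i] entries and must sum to nobs
stopifnot(
  length(clustsize) == 4L,
  all(lengths(clustsize) == nclust),
  all(vapply(clustsize, sum, integer(1)) == nobs)
)
```

If the checks pass, `nobs`, `nclust` and `clustsize` can be supplied to `sim_lsbclust` under the argument names of the same name.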

Examples

set.seed(1)
res <- sim_lsbclust(ndata = 5, nobs = 100, size = c(10, 8), nclust = rep(5, 4),
                    verbose = 0, nstart_T3 = 2, nstart_ak = 1, parallel_data = FALSE,
                    nstart = 2, nstart.kmeans = 5)
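
The default `mc.cores = detectCores() - 1` can evaluate to zero on a single-core machine, and `detectCores()` may return `NA` on some platforms. The sketch below shows one possible way to guard against this when parallelizing over data sets. The names `n_detected`, `n_cores` and `sim_args` are hypothetical helpers, not part of the package, and the remaining argument values simply mirror the example above.

```r
library(parallel)

# detectCores() can return NA, so fall back to a single core in that case
n_detected <- detectCores()
n_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)

# Collect the arguments for a run that parallelizes over the data sets
sim_args <- list(
  ndata = 5, nobs = 100, size = c(10, 8), nclust = rep(5, 4),
  verbose = 0, nstart_T3 = 2, nstart_ak = 1,
  parallel_data = TRUE, mc.cores = n_cores,
  nstart = 2, nstart.kmeans = 5, seed = 1
)
```

With the lsbclust package installed, the run could then be started with `res_par <- do.call(lsbclust::sim_lsbclust, sim_args)`.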

[Package lsbclust version 1.1 Index]