Datasets {stepmixr}    R Documentation

A series of functions to simulate data.

Description

These functions generate data with multiple groups drawn from different distributions, optionally with a given proportion of missing values.

Usage

random_nan(X, Y, nan_ratio, random_state = NULL)
bakk_measurements(n_classes, n_mm, sep_level)
data_bakk_response(n_samples, sep_level, n_classes = 3, n_mm = 6, random_state = NULL)
data_bakk_covariate(n_samples, sep_level, n_mm = 6, random_state = NULL)
data_bakk_complete(n_samples, sep_level, n_mm = 6, random_state = NULL, nan_ratio = 0.0)
data_generation_gaussian(n_samples, sep_level, n_mm = 6, random_state = NULL)
data_gaussian_diag(n_samples, sep_level, n_mm = 6, random_state = NULL, nan_ratio = 0.0)

Arguments

X

The X matrix or data.frame for the measurement part of the model

Y

The Y matrix or data.frame for the structural part of the model

nan_ratio

The proportion of values set to missing, a number between 0 and 1.
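
To make the meaning of a missing-value ratio concrete, here is a minimal base R sketch that blanks out a fixed share of the cells of a matrix. The helper mask_at_random and its seed argument are hypothetical names introduced for illustration only. They are not part of stepmixr, and the package may select missing cells differently.

```r
# Hypothetical helper, NOT part of stepmixr: set a fraction of cells to NA.
mask_at_random <- function(X, nan_ratio, seed = NULL) {
  stopifnot(nan_ratio >= 0, nan_ratio <= 1)
  if (!is.null(seed)) set.seed(seed)  # optional seeding for reproducibility
  X <- as.matrix(X)
  # Number of cells to blank out, rounded to the nearest integer.
  n_missing <- round(nan_ratio * length(X))
  # Draw distinct cell positions and replace them with NA.
  X[sample(length(X), n_missing)] <- NA
  X
}

X <- matrix(rnorm(200), nrow = 50, ncol = 4)
X_missing <- mask_at_random(X, nan_ratio = 0.25, seed = 1)
mean(is.na(X_missing))  # share of missing cells
```

With nan_ratio = 0 the matrix comes back unchanged, which mirrors the 0.0 default shown in Usage.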

random_state

An integer used to seed the random number generator.

n_classes

Number of latent classes required.

n_mm

Number of features in the measurement model.

sep_level

Separation level in the measurement data.

n_samples

Number of samples.

Details

These functions return simulated data used to test the package.

Value

A list of data.frames simulated according to the function parameters.
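
A hedged usage sketch follows. It calls data_bakk_response with the signature listed under Usage. The argument values (100 samples, sep_level = 0.9, seed 42) are illustrative assumptions, not recommended settings. The guards are there because the call can fail if stepmixr or its backend is not installed, in which case sim stays NULL.

```r
# Sketch only: argument values are assumptions chosen for illustration.
sim <- NULL
if (requireNamespace("stepmixr", quietly = TRUE)) {
  sim <- tryCatch(
    stepmixr::data_bakk_response(n_samples = 100, sep_level = 0.9,
                                 random_state = 42),
    error = function(e) NULL  # e.g. the backend is unavailable
  )
}

# Inspect the top-level structure of the returned list, if any.
if (!is.null(sim)) str(sim, max.level = 1)
```

Passing the same random_state again should reproduce the same simulated data, which is useful when the output feeds a test.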

Author(s)

Éric Lacourse, Roxane de la Sablonnière, Charles-Édouard Giguère, Sacha Morin, Robin Legault, Félix Laliberté, Zsuzsa Bakk

References

Bakk, Z. and Kuha, J. (2018). Two-step estimation of models between latent classes and external variables. Psychometrika, 83(4), 871-892.


[Package stepmixr version 0.1.2 Index]