Datasets {stepmixr} | R Documentation |
Series of functions to simulate data.
Description
These functions generate data with multiple groups using different distributions, optionally adding a given proportion of missing values.
Usage
random_nan(X, Y, nan_ratio, random_state = NULL)
bakk_measurements(n_classes, n_mm, sep_level)
data_bakk_response(n_samples, sep_level, n_classes = 3, n_mm = 6, random_state = NULL)
data_bakk_covariate(n_samples, sep_level, n_mm = 6, random_state = NULL)
data_bakk_complete(n_samples, sep_level, n_mm = 6, random_state = NULL, nan_ratio = 0.0)
data_generation_gaussian(n_samples, sep_level, n_mm = 6, random_state = NULL)
data_gaussian_diag(n_samples, sep_level, n_mm = 6, random_state = NULL, nan_ratio = 0.0)
Arguments
X |
The X matrix or data.frame for the measurement part of the model. |
Y |
The Y matrix or data.frame for the structural part of the model. |
nan_ratio |
The ratio of missing values. A value between 0 and 1. |
random_state |
An integer initializing the seed of the random generator. |
n_classes |
Number of latent classes required. |
n_mm |
Number of features in the measurement model. |
sep_level |
Separation level in the measurement data. |
n_samples |
Number of samples. |
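To make the meaning of nan_ratio concrete, the base R sketch below masks each cell of a toy data.frame independently with probability nan_ratio. The helper mask_at_random and the toy data are hypothetical and are not part of stepmixr; random_nan may use a different masking scheme, so this only illustrates the idea of a missing-value ratio between 0 and 1.

```r
# Hypothetical helper (NOT part of stepmixr): hide each cell independently
# with probability nan_ratio. random_nan() itself may work differently.
mask_at_random <- function(df, nan_ratio, seed = NULL) {
  stopifnot(nan_ratio >= 0, nan_ratio <= 1)
  if (!is.null(seed)) set.seed(seed)
  m <- as.matrix(df)
  # One uniform draw per cell; cells whose draw falls below nan_ratio are hidden.
  hide <- matrix(runif(length(m)) < nan_ratio, nrow = nrow(m))
  m[hide] <- NA
  as.data.frame(m)
}

# Toy binary indicators standing in for simulated measurements.
toy <- data.frame(a = rbinom(200, 1, 0.5), b = rbinom(200, 1, 0.5))
masked <- mask_at_random(toy, nan_ratio = 0.25, seed = 1)

# The observed share of missing cells is close to, but not exactly, nan_ratio.
mean(is.na(masked))
```

With nan_ratio = 0 nothing is masked, and with nan_ratio = 1 every cell becomes NA.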
Details
These functions return simulated data used to test the package.
Value
A list of data.frames simulated according to the function parameters.
Author(s)
Éric Lacourse, Roxane de la Sablonnière, Charles-Édouard Giguère, Sacha Morin, Robin Legault, Félix Laliberté, Zsuzsa Bakk
References
Bakk, Z. and Kuha, J. Two-step estimation of models between latent classes and external variables. Psychometrika, 83(4):871-892, 2018.
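Examples
A minimal sketch of how one of the simulators might be called, using only the signature listed under Usage. The argument values (100 samples, sep_level of 0.9, seed 42) are illustrative assumptions, not recommended settings, and the call is guarded so that it only runs when stepmixr is installed. It is also assumed that the package's underlying simulation backend is available. No particular output is shown because the exact structure of the returned list is not described here beyond "a list of data.frames".

```r
# Illustrative argument values (assumptions, not package defaults).
sim_args <- list(n_samples = 100L, sep_level = 0.9, random_state = 42L)

# Only attempt the call when stepmixr is installed.
if (requireNamespace("stepmixr", quietly = TRUE)) {
  sim <- do.call(stepmixr::data_bakk_response, sim_args)
  # Inspect what came back rather than assuming its layout.
  str(sim, max.level = 1)
}
```

Passing the same random_state again should reproduce the same simulated data, since the argument seeds the random generator.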