data_generation {mtlgmm} | R Documentation |
Generate data for simulations.
Description
Generate data for simulations. All models used in Tian, Y., Weng, H., & Feng, Y. (2022) are implemented.
Usage
data_generation(
  K = 10,
  outlier_K = 1,
  simulation_no = c("MTL-1", "MTL-2"),
  h_w = 0.1,
  h_mu = 1,
  n = 50
)
Arguments
K |
the number of tasks (data sets). Default: 10 |
outlier_K |
the number of outlier tasks. Default: 1 |
simulation_no |
simulation number in Tian, Y., Weng, H., & Feng, Y. (2022). Can be "MTL-1" or "MTL-2". Default: "MTL-1" |
h_w |
the value of h_w. Default: 0.1 |
h_mu |
the value of h_mu. Default: 1 |
n |
the sample size of each task. Can be either a positive integer or a vector of length |
Value
A list of two sub-lists, "data" and "parameter".
List "data" contains a list of design matrices x, a list of hidden labels y, and a vector of outlier task indices outlier_index.
List "parameter" contains a vector w of mixture proportions, a matrix mu1 in which each column is the GMM mean of the first cluster of a task, a matrix mu2 in which each column is the GMM mean of the second cluster of a task, a matrix beta in which each column is the discriminant coefficient of a task, and a list Sigma of covariance matrices, one for each task.
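The layout of the returned list can be checked directly. The following is a minimal sketch, assuming the mtlgmm package is installed; it uses only the component names listed in this section, and the seed value is an arbitrary choice made for this illustration.

```r
library(mtlgmm)

set.seed(1)  # arbitrary seed, only to make this sketch reproducible
data_list <- data_generation(K = 5, outlier_K = 1, simulation_no = "MTL-1",
                             h_w = 0.1, h_mu = 1, n = 50)

# Top-level components: "data" and "parameter"
names(data_list)

# Design matrices and hidden labels are stored as lists, one element per task
length(data_list$data$x)
sapply(data_list$data$x, dim)
table(data_list$data$y[[1]])

# Indices of the outlier tasks
data_list$data$outlier_index

# True parameters used to generate the data
data_list$parameter$w
dim(data_list$parameter$mu1)
dim(data_list$parameter$mu2)
dim(data_list$parameter$beta)
length(data_list$parameter$Sigma)
```

Inspecting the dimensions first is a quick way to confirm which axis of mu1, mu2, and beta indexes the tasks before passing the data to other functions.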
References
Tian, Y., Weng, H., & Feng, Y. (2022). Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models. arXiv preprint arXiv:2209.15224.
See Also
mtlgmm, tlgmm, predict_gmm, initialize, alignment, alignment_swap, estimation_error, misclustering_error.
Examples
data_list <- data_generation(K = 5, outlier_K = 1, simulation_no = "MTL-1", h_w = 0.1,
h_mu = 1, n = 50)