tlgmm {mtlgmm}    R Documentation
Fit the binary Gaussian mixture model (GMM) on the target data set by leveraging multiple source data sets under a transfer learning (TL) setting.
Description
Fit the binary Gaussian mixture model (GMM) on the target data set by leveraging multiple source data sets under a transfer learning (TL) setting. This function implements the modified EM algorithm (Algorithm 4) proposed in Tian, Y., Weng, H., & Feng, Y. (2022).
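Concretely, the target observations are modeled by a binary GMM with a common covariance matrix,

    x ~ w * N(mu1, Sigma) + (1 - w) * N(mu2, Sigma),

with discriminant coefficient beta = Sigma^{-1} (mu1 - mu2) (up to the sign convention of the paper). Roughly speaking, the penalty terms in Algorithm 4 shrink the target estimates of w, mu1, mu2, and beta toward the center estimates carried in fitted_bar, which summarize the source tasks.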
Usage
tlgmm(
x,
fitted_bar,
step_size = c("lipschitz", "fixed"),
eta_w = 0.1,
eta_mu = 0.1,
eta_beta = 0.1,
lambda_choice = c("fixed", "cv"),
cv_nfolds = 5,
cv_upper = 2,
cv_lower = 0.01,
cv_length = 5,
C1_w = 0.05,
C1_mu = 0.2,
C1_beta = 0.2,
C2_w = 0.05,
C2_mu = 0.2,
C2_beta = 0.2,
kappa0 = 1/3,
tol = 1e-05,
initial_method = c("kmeans", "EM"),
iter_max = 1000,
iter_max_prox = 100,
ncores = 1
)
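Only x and fitted_bar are required; all other arguments have the defaults shown above. A minimal call might look like this (x_target and fit_mtl are placeholder names for the target design matrix and a fitted mtlgmm object):

fit <- tlgmm(x = x_target, fitted_bar = fit_mtl)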
Arguments
x: design matrix of the target data set. Should be a matrix.

fitted_bar: the fitted model returned by mtlgmm.
step_size: step size choice in the proximal gradient method used to solve each optimization problem in the modified EM algorithm (Algorithm 4 in Tian, Y., Weng, H., & Feng, Y. (2022)); can be either "lipschitz" or "fixed". Default = "lipschitz".

eta_w: step size in the proximal gradient method to learn w (Step 3 of Algorithm 4 in Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 0.1. Only used when step_size = "fixed".

eta_mu: step size in the proximal gradient method to learn mu (Steps 4 and 5 of Algorithm 4 in Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 0.1. Only used when step_size = "fixed".

eta_beta: step size in the proximal gradient method to learn beta (Step 7 of Algorithm 4 in Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 0.1. Only used when step_size = "fixed".

lambda_choice: the choice of constants in the penalty parameter used in the optimization problems (see Algorithm 4 of Tian, Y., Weng, H., & Feng, Y. (2022)); can be either "fixed" or "cv". Default = "fixed". An example combining these options is sketched after this argument list.
cv_nfolds: the number of cross-validation folds. Default: 5.

cv_upper: the upper bound of the lambda values used in cross-validation. Default: 2.

cv_lower: the lower bound of the lambda values used in cross-validation. Default: 0.01.

cv_length: the number of lambda values considered in cross-validation. Default: 5.
C1_w: the initial value of C1_w. See equation (19) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.05.

C1_mu: the initial value of C1_mu. See equation (20) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2.

C1_beta: the initial value of C1_beta. See equation (21) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2.

C2_w: the initial value of C2_w. See equation (22) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.05.

C2_mu: the initial value of C2_mu. See equation (23) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2.

C2_beta: the initial value of C2_beta. See equation (24) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2.

kappa0: the decaying rate used in equations (19)-(24) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 1/3.
tol: maximum tolerance in all optimization problems. The iterations stop once the difference between the last and the current update falls below this value. Default: 1e-05.

initial_method: initialization method used to obtain the starting GMM parameter estimates for each data set; can be either "kmeans" or "EM". Default = "kmeans".

iter_max: the maximum number of iterations of the modified EM algorithm (the parameter T in Algorithm 4 of Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 1000.

iter_max_prox: the maximum number of iterations of the proximal gradient method. Default: 100.

ncores: the number of cores to use. Parallel computing is strongly suggested, especially when lambda_choice = "cv". Default: 1.
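For illustration, fixed step sizes can be combined with cross-validated penalty constants as follows (x_target and fit_mtl are placeholder names, as above; the chosen values are arbitrary):

fit <- tlgmm(x = x_target, fitted_bar = fit_mtl,
             step_size = "fixed", eta_w = 0.05, eta_mu = 0.05, eta_beta = 0.05,
             lambda_choice = "cv", cv_nfolds = 5, cv_lower = 0.01, cv_upper = 2, cv_length = 5,
             ncores = 4)  # parallel computing is strongly suggested with lambda_choice = "cv"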
Value
A list with the following components; a short access example follows the list.
w: the estimate of the mixture proportion in the GMM for the target task. Will be a vector.

mu1: the estimate of the Gaussian mean of the first cluster for the target task. Will be a vector.

mu2: the estimate of the Gaussian mean of the second cluster for the target task. Will be a vector.

beta: the estimate of the discriminant coefficient for the target task. Will be a vector.

Sigma: the estimate of the common covariance matrix for the target task. Will be a matrix.
C1_w: the initial value of C1_w.

C1_mu: the initial value of C1_mu.

C1_beta: the initial value of C1_beta.

C2_w: the initial value of C2_w.

C2_mu: the initial value of C2_mu.

C2_beta: the initial value of C2_beta.
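The components are extracted by name from the returned list, for example (fit_tl being a fitted tlgmm object):

fit_tl$w      # estimated mixture proportion
fit_tl$mu1    # estimated mean of the first cluster
fit_tl$beta   # estimated discriminant coefficient
fit_tl$Sigma  # estimated common covariance matrix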
References
Tian, Y., Weng, H., & Feng, Y. (2022). Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models. arXiv preprint arXiv:2209.15224.
Parikh, N., & Boyd, S. (2014). Proximal algorithms. Foundations and Trends in Optimization, 1(3), 127-239.
See Also
mtlgmm
, predict_gmm
, data_generation
, initialize
, alignment
, alignment_swap
, estimation_error
, misclustering_error
.
Examples
set.seed(0, kind = "L'Ecuyer-CMRG")
## Consider a transfer learning problem with 3 source tasks and 1 target task in the setting "MTL-1"
data_list_source <- data_generation(K = 3, outlier_K = 0, simulation_no = "MTL-1", h_w = 0,
                                    h_mu = 0, n = 50)  # generate the source data
data_target <- data_generation(K = 1, outlier_K = 0, simulation_no = "MTL-1", h_w = 0.1,
                               h_mu = 1, n = 50)  # generate the target data
fit_mtl <- mtlgmm(x = data_list_source$data$x, C1_w = 0.05, C1_mu = 0.2, C1_beta = 0.2,
                  C2_w = 0.05, C2_mu = 0.2, C2_beta = 0.2, kappa = 1/3, initial_method = "EM",
                  trim = 0.1, lambda_choice = "fixed", step_size = "lipschitz")
fit_tl <- tlgmm(x = data_target$data$x[[1]], fitted_bar = fit_mtl, C1_w = 0.05,
                C1_mu = 0.2, C1_beta = 0.2, C2_w = 0.05, C2_mu = 0.2, C2_beta = 0.2,
                kappa0 = 1/3, initial_method = "EM", ncores = 1, lambda_choice = "fixed",
                step_size = "lipschitz")

# use cross-validation to choose the tuning parameters
# warning: can be quite slow; a large "ncores" value is strongly suggested!
fit_tl <- tlgmm(x = data_target$data$x[[1]], fitted_bar = fit_mtl, kappa0 = 1/3,
                initial_method = "EM", ncores = 2, lambda_choice = "cv", step_size = "lipschitz")
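# the fitted target parameters can then be passed to predict_gmm to cluster the
# target observations; a sketch, assuming the signature predict_gmm(w, mu1, mu2,
# beta, newx) from its help page:
y_pred <- predict_gmm(w = fit_tl$w, mu1 = fit_tl$mu1, mu2 = fit_tl$mu2,
                      beta = fit_tl$beta, newx = data_target$data$x[[1]])
table(y_pred)  # counts of the predicted cluster labels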