MTL_reg {HMTL}    R Documentation

Robust Multi-Task Feature Learning

Description

MTL_reg conducts multi-task feature learning for tasks with continuous response variables, using linear regression, Huber regression, or adaptive Huber regression. The adaptive Huber method is based on Sun, Q., Zhou, W.-X. and Fan, J. (2020) and Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2021). The penalty function applies a mixed \ell_{2,1} norm that groups the regression coefficients of each predictor shared across all tasks. Huber regression and adaptive Huber regression require a robustification parameter \tau_k to balance unbiasedness against robustness; the adaptive method determines this parameter by a tuning-free principle.
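
As a rough illustration of the two ingredients named above, the following sketch writes out the standard Huber loss and the mixed \ell_{2,1} norm in R. The function names `huber_loss` and `l21_norm` are hypothetical and are not part of HMTL; the exact scaling and weighting used inside MTL_reg may differ.

```r
# Standard Huber loss: quadratic for |r| <= tau, linear beyond it.
# (Textbook form; the scaling used inside HMTL is an assumption.)
huber_loss <- function(r, tau) {
  ifelse(abs(r) <= tau, 0.5 * r^2, tau * abs(r) - 0.5 * tau^2)
}

# Mixed l_{2,1} norm of a p-by-K coefficient matrix: the sum over
# predictors (rows) of the Euclidean norm taken across tasks (columns).
l21_norm <- function(B) {
  sum(sqrt(rowSums(B^2)))
}

# Large residuals are penalized linearly rather than quadratically.
r <- c(-3, -0.5, 0, 1, 10)
huber_loss(r, tau = 1.45)

# Rows have norms 5, 0 and 1, so the mixed norm is 6.
B <- rbind(c(3, 4), c(0, 0), c(1, 0))
l21_norm(B)
```

A smaller tau makes the loss linear for more residuals (more robust, more biased); a larger tau brings it closer to least squares.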

Usage

MTL_reg(
  y,
  x,
  lambda,
  Kn,
  p,
  n,
  beta = 0.1,
  tau = 1.45,
  Cont_Model = "adaptive Huber",
  import_w = 1,
  tol = 0.05,
  max_iter = 100,
  Complete = "True",
  diagnostics = FALSE,
  gamma = 1,
  alpha = 1
)
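
The signature above takes the data as per-task lists alongside the counts Kn, p and n. The sketch below simulates inputs in that layout and checks that they agree with each other. The helper `expand_n` and the simulated coefficients are illustrative assumptions, not part of HMTL, and the MTL_reg call is left as a comment because it requires the package.

```r
set.seed(1)
Kn <- 2
p  <- 5
n  <- c(30, 20)  # one sample size per task

# Hypothetical helper: a single n is read as "same size for every task".
expand_n <- function(n, Kn) if (length(n) == 1) rep(n, Kn) else n
n <- expand_n(n, Kn)

# One n_k-by-p predictor matrix per task.
x <- lapply(n, function(nk) matrix(rnorm(nk * p), nrow = nk, ncol = p))

# Only the first two predictors are active; t(3) noise is heavy-tailed.
b_true <- c(2, -1, 0, 0, 0)
y <- lapply(seq_len(Kn), function(k) {
  as.vector(x[[k]] %*% b_true) + rt(n[k], df = 3)
})

# Consistency checks between the list inputs and Kn, p, n.
stopifnot(length(x) == Kn, length(y) == Kn)
stopifnot(all(sapply(x, ncol) == p))
stopifnot(all(sapply(x, nrow) == n), all(lengths(y) == n))

# With HMTL installed, these inputs could then be passed as:
# fit <- MTL_reg(y, x, lambda = 2.5, Kn = Kn, p = p, n = n)
```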

Arguments

y

List. A list of continuous response vectors for all tasks.

x

List. A list of predictor matrices for all tasks, in the same order as in y.

lambda

Numeric. The penalty parameter used for block-wise regularization (\ell_{2,1} norm).

Kn

Numeric. The number of tasks with continuous responses.

p

Numeric. The number of features.

n

Numeric or vector. If only one numeric value is provided, equal sample size will be assumed for each task. If a vector is provided, then the elements are the sample sizes for all tasks.

beta

(optional). Numeric or matrix. An initial value, or a p by K matrix of initial values, for the estimation. The default value is 0.1.

tau

Numeric or vector. The robustification parameter used for the methods "Huber regression" or "adaptive Huber". The default value is 1.45.

Cont_Model

Character. One of "regression", "Huber regression", or "adaptive Huber". The model used for tasks with continuous responses.

import_w

Numeric or vector. The weights assigned to different tasks. By default, all tasks receive equal weight.

tol

(optional). Numeric. The tolerance level of the optimization.

max_iter

(optional). Numeric. The maximum number of iteration steps.

Complete

Logical. If all predictors are measured in every task, set 'Complete = TRUE'. If some predictors are measured in some but not all tasks, set 'Complete = FALSE'; the missing values are then imputed by the column mean, and adjustment weights are assigned based on the completeness of the predictors.

diagnostics

Logical. If 'diagnostics = TRUE', the function reports the Bayesian information criterion, and the performance of the selected model is evaluated by the MSE and MAE for tasks with continuous responses and by the AUC and deviance for tasks with binary responses.

gamma

(optional). Numeric. Step size for each inner iteration. The default is equal to 1.

alpha

(optional). Numeric. A tuning parameter for BIC penalty. The default is equal to 1.
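
To give some intuition for how the `lambda` argument above produces block-wise sparsity, the sketch below implements row-wise (group) soft-thresholding, which is the proximal operator of the \ell_{2,1} penalty. This is a generic technique shown for illustration; the source does not state that MTL_reg uses this exact update, and `prox_l21` is a hypothetical name.

```r
# Row-wise (group) soft-thresholding: the proximal operator of
# lambda * sum_j ||B[j, ]||_2. Illustrative only; not taken from HMTL.
prox_l21 <- function(B, lambda) {
  row_norms <- sqrt(rowSums(B^2))
  # The shrink factor is 0 for every row whose norm is at most lambda,
  # so that predictor is removed from all tasks at once.
  shrink <- pmax(0, 1 - lambda / pmax(row_norms, .Machine$double.eps))
  # A length-p vector recycles down the columns, scaling row j by shrink[j].
  B * shrink
}

# Row norms are 5, 0.5 and 0: the first row is shrunk by a factor of 0.8,
# and the other two rows are set to zero.
B <- rbind(c(3, 4), c(0.3, 0.4), c(0, 0))
prox_l21(B, lambda = 1)
```

Larger values of lambda zero out more rows, which is why the penalty selects predictors jointly across tasks rather than task by task.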

Value

A list including the following terms will be returned:

beta

A p by K matrix of estimated sparse parameters.

Task type

The models used in each task.

Task weights

The weights assigned to each task.

Selected_List

The index of non-zero parameters.
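
The relation between `beta` and `Selected_List` can be pictured as follows, assuming (as the Examples section suggests) that `Selected_List` holds the row indices of predictors with at least one non-zero coefficient. The matrix `beta_hat` is made-up data, not output from MTL_reg.

```r
# Made-up 4-by-2 coefficient matrix: predictors in rows, tasks in columns.
beta_hat <- rbind(c(1.2, 0.8),
                  c(0,   0),
                  c(-0.5, 0),
                  c(0,   0))

# Rows with any non-zero entry (assumed meaning of Selected_List).
selected <- which(rowSums(beta_hat != 0) > 0)
selected

# drop = FALSE keeps a matrix even when only one predictor is selected.
beta_hat[selected, , drop = FALSE]
```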

If 'diagnostics = TRUE', the following terms will be returned:

Bayesian_Information

Table of the information criterion: composite likelihood, degrees of freedom, and (pseudo or robust) Bayesian information criterion.

Reg_Error

Table of the model performance for (Huber) regressions: the mean square error (MSE), and the mean absolute error (MAE).

Residuals

The residuals for all tasks.
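
The error measures in `Reg_Error` can be computed from per-task residuals along the following lines. This is a minimal sketch under the assumption of a no-intercept linear predictor; `reg_error` is a hypothetical helper, and the layout of the table returned by MTL_reg may differ.

```r
# Per-task MSE and MAE from residuals y_k - x_k %*% beta[, k].
# Assumes no intercept; returns one row per task.
reg_error <- function(y, x, beta) {
  t(sapply(seq_along(y), function(k) {
    res <- y[[k]] - as.vector(x[[k]] %*% beta[, k])
    c(MSE = mean(res^2), MAE = mean(abs(res)))
  }))
}

# Tiny made-up example with two tasks and two predictors.
x    <- list(diag(2), diag(2))
y    <- list(c(1, 3), c(1, -1))
beta <- cbind(c(1, 1), c(0, 0))

# Task 1 residuals are (0, 2); task 2 residuals are (1, -1).
err <- reg_error(y, x, beta)
err
```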

References

Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.

Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254–265.

Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2021). A new principle for tuning-free Huber regression. Stat. Sinica, 31, 2153–2177.

Zhong, Y., Xu, W. and Gao, X. (2023). Robust multi-task feature learning. Submitted.

Examples

x_reg <- list(mockdata1[[1]], mockdata1[[2]])
y_reg <- list(mockdata2[[1]], mockdata2[[2]])
model <- MTL_reg(y_reg, x_reg, lambda = 2.5, Kn = 2, p = 500,
                 n = c(500, 250), gamma = 2, Complete = FALSE,
                 diagnostics = TRUE, alpha = 2)

# Selected non-zero coefficients
model$beta[model$Selected_List,]
# Estimated Pseudo-BIC
model$Bayesian_Information
# Regression error
model$Reg_Error
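
The call above sets `Complete = FALSE`, in which case missing predictor values are imputed by the column mean. A minimal sketch of that kind of imputation is given below. `impute_col_mean` is a hypothetical helper, the adjustment weights applied by MTL_reg are not reproduced, and leaving fully missing columns untouched is an assumption, since the source does not say how they are handled.

```r
# Replace NA entries in each column by that column's observed mean.
# Columns with no observed values are left as NA (an assumption).
impute_col_mean <- function(X) {
  for (j in seq_len(ncol(X))) {
    miss <- is.na(X[, j])
    if (any(miss) && !all(miss)) {
      X[miss, j] <- mean(X[, j], na.rm = TRUE)
    }
  }
  X
}

# Column 1 has one gap, column 2 is entirely missing, column 3 is complete.
X <- cbind(c(1, NA, 3), c(NA, NA, NA), c(4, 5, 6))
X_imp <- impute_col_mean(X)
X_imp
```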

[Package HMTL version 0.1.0 Index]