HMTL-package {HMTL} | R Documentation |
Heterogeneous Multi-task Feature Learning
Description
The HMTL package implements block-wise sparse estimation by grouping the coefficients of related predictors across multiple tasks. The tasks can be regression, Huber regression, adaptive Huber regression, or logistic regression, so data sets with different response types can be integrated. The robust methods, Huber regression and adaptive Huber regression, handle outlier contamination following Sun, Q., Zhou, W.-X. and Fan, J. (2020) and Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2021). Model selection uses a modified form of the Bayesian information criterion to measure model performance, with a formulation similar to that of Gao, X. and Carroll, R. J. (2017).
Details
In the context of multi-task learning, there are different data sets obtained from $K$ related sources. The data sets can be modeled by different types of learning tasks based on the data distributions. Let the candidate features be denoted as $M_1, M_2, \ldots, M_p$. When the integrated data sets have different measurements, we assume that the predictors share some similarities. For example, the $j$th predictors, collected as $M_j$ in the table below, represent the same type of feature in all related studies. In some cases the tasks share the same set of predictors, so that $x_{1j} = x_{2j} = \cdots = x_{Kj}$.
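Tasks | Formula | $M_1$ | $M_2$ | ... | $M_j$ | ... | $M_p$ |
1 | $y_1 \sim$ | $x_{11}\theta_{11} +$ | $x_{12}\theta_{12} +$ | ... | $x_{1j}\theta_{1j} +$ | ... | $x_{1p}\theta_{1p}$ |
2 | $y_2 \sim$ | $x_{21}\theta_{21} +$ | $x_{22}\theta_{22} +$ | ... | $x_{2j}\theta_{2j} +$ | ... | $x_{2p}\theta_{2p}$ |
... | ... | ... | ... | ... | ... | ... | ... |
K | $y_K \sim$ | $x_{K1}\theta_{K1} +$ | $x_{K2}\theta_{K2} +$ | ... | $x_{Kj}\theta_{Kj} +$ | ... | $x_{Kp}\theta_{Kp}$ |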
The coefficients can be grouped as the vector $\theta^{(j)} = (\theta_{1j}, \theta_{2j}, \ldots, \theta_{Kj})$ for the feature $M_j$.
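Platforms | $\theta^{(1)}$ | $\theta^{(2)}$ | ... | $\theta^{(p)}$ |
1 | $\theta_{11}$ | $\theta_{12}$ | ... | $\theta_{1p}$ |
2 | $\theta_{21}$ | $\theta_{22}$ | ... | $\theta_{2p}$ |
... | ... | ... | ... | ... |
K | $\theta_{K1}$ | $\theta_{K2}$ | ... | $\theta_{Kp}$ |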
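The grouping of coefficients by feature can be illustrated with a small standalone sketch. It is written in plain Python rather than with the package's R functions, and the coefficient values, the function names, and the zero tolerance `tol` are all hypothetical. Rows of the matrix stand for tasks and columns for features, so each column is one feature-wise group; a feature counts as selected when its whole group is not zero, which is what block-wise sparsity means here.

```python
import math

# Hypothetical coefficients: rows are tasks (platforms), columns are features.
theta = [
    [0.8, 0.0, -1.2, 0.0],
    [0.5, 0.0,  0.9, 0.0],
    [1.1, 0.0, -0.4, 0.0],
]

def feature_groups(theta):
    """Return one coefficient vector per feature (the columns of theta)."""
    return [list(col) for col in zip(*theta)]

def group_norms(theta):
    """Euclidean norm of each feature-wise coefficient group."""
    return [math.sqrt(sum(v * v for v in g)) for g in feature_groups(theta)]

def selected_features(theta, tol=1e-12):
    """1-based indices of features whose group is not entirely zero."""
    return [j + 1 for j, n in enumerate(group_norms(theta)) if n > tol]

print(feature_groups(theta)[0])   # coefficients of the first feature across tasks
print(selected_features(theta))   # features kept in every task simultaneously
```

In this toy matrix the second and fourth features are zero in every task, so they drop out of all tasks together rather than task by task.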
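The heterogeneous multi-task feature learning method HMTL selects significant features by minimizing the overall objective function
$$Q(\theta) = L(\theta) + P_\lambda(\theta).$$
The loss function $L(\theta)$ can be the composite quasi-likelihood or the composite form of the (adaptive) Huber loss with an additional robustification parameter $\tau$. The penalty function is the mixed $\ell_{2,1}$ regularization, such that
$$P_\lambda(\theta) = \lambda \sum_{j=1}^{p} \Big( \sum_{k=1}^{K} \theta_{kj}^2 \Big)^{1/2},$$
where $\theta_{kj}$ is the coefficient of feature $j$ in task $k$ and $\lambda$ is the tuning parameter.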
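As a rough illustration of how a loss and a feature-wise group penalty combine into one objective, the sketch below evaluates such an objective in plain Python. It does not call the package and is not its implementation. Several choices are assumptions made for the example: the penalty is taken to be the tuning parameter times the sum of Euclidean norms of the feature-wise coefficient groups, the loss is the Huber loss with one robustification parameter per task, and each task's loss is averaged over its observations before the tasks are summed. All function and argument names are hypothetical.

```python
import math

def huber(r, tau):
    """Huber loss: quadratic for |r| <= tau, linear beyond it."""
    a = abs(r)
    return 0.5 * r * r if a <= tau else tau * a - 0.5 * tau * tau

def composite_huber_loss(X, y, theta, tau):
    """Sum over tasks of the mean Huber loss of that task's residuals.

    X[k] is the design matrix of task k (list of rows), y[k] its responses,
    theta[k] its coefficients and tau[k] its robustification parameter.
    The per-task averaging is an assumption of this sketch.
    """
    total = 0.0
    for Xk, yk, thk, tk in zip(X, y, theta, tau):
        res = [yi - sum(x * t for x, t in zip(xi, thk)) for xi, yi in zip(Xk, yk)]
        total += sum(huber(r, tk) for r in res) / len(res)
    return total

def group_penalty(theta, lam):
    """lam times the sum over features of the Euclidean norm across tasks."""
    return lam * sum(math.sqrt(sum(v * v for v in col)) for col in zip(*theta))

def objective(X, y, theta, tau, lam):
    """Loss plus penalty, the quantity a solver would minimize."""
    return composite_huber_loss(X, y, theta, tau) + group_penalty(theta, lam)

# Two hypothetical tasks with two features; the fit is exact, so only the penalty remains.
X = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
theta = [[3.0, 0.0], [4.0, 0.0]]
y = [[3.0, 0.0], [4.0]]
print(objective(X, y, theta, tau=[1.0, 1.0], lam=2.0))
```

Because the norm is taken across tasks before summing over features, shrinking a feature to zero removes it from every task at once, while the Huber loss limits the influence of large residuals.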
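This package also contains functions that compute the Bayesian information criterion
$$\mathrm{BIC} = 2 L(\hat{\theta}) + d \, \gamma,$$
with $L(\hat{\theta})$ denoting the composite quasi-likelihood or adaptive Huber loss evaluated at the fitted coefficients, $d$ measuring the model complexity, and $\gamma$ being the penalty on the model complexity.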
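Model selection with a criterion of this kind can be sketched as follows, again in plain Python and without calling the package. The sketch assumes the criterion is twice the fitted loss plus the model complexity times a complexity penalty. It also assumes, purely for illustration, that complexity is the number of non-zero coefficients and that the complexity penalty is the logarithm of a sample size; the package may define both differently. The candidate fits and their loss values are made up.

```python
import math

def model_complexity(theta, tol=1e-12):
    """Number of non-zero coefficients (an assumed complexity measure)."""
    return sum(1 for row in theta for v in row if abs(v) > tol)

def bic(loss, d, gamma):
    """Twice the fitted loss plus complexity times the complexity penalty."""
    return 2.0 * loss + d * gamma

def select_by_bic(candidates, gamma):
    """Return the label of the candidate fit with the smallest criterion.

    candidates is a list of (label, fitted_loss, theta) tuples.
    """
    scored = [(bic(loss, model_complexity(theta), gamma), label)
              for label, loss, theta in candidates]
    return min(scored)[1]

# Hypothetical fits obtained under two tuning-parameter values.
candidates = [
    ("dense",  10.0, [[0.8, 0.1, -1.2], [0.5, 0.2, 0.9]]),
    ("sparse", 10.5, [[0.7, 0.0,  0.0], [0.4, 0.0, 0.0]]),
]
gamma = math.log(50)  # assumed penalty for a hypothetical sample size of 50
print(select_by_bic(candidates, gamma))
```

The dense fit has a slightly smaller loss, but its six non-zero coefficients cost more than the gain, so the sparse fit is preferred once the complexity penalty is applied.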
In this package, the function MTL_reg handles regression tasks, which may be contaminated by outliers. The function MTL_class models multiple classification tasks, and the function MTL_hetero integrates different types of tasks.
Author(s)
Yuan Zhong, Wei Xu, and Xin Gao
Maintainer: Yuan Zhong <aqua.zhong@gmail.com>
References
Zhong, Y., Xu, W. and Gao, X. (2023). Heterogeneous multi-task feature learning with mixed regularization. Submitted.
Zhong, Y., Xu, W. and Gao, X. (2023). Robust multi-task feature learning. Submitted.
Gao, X. and Carroll, R. J. (2017). Data integration with high dimensionality. Biometrika, 104(2), 251-272.
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.
Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254-265.
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2021). A new principle for tuning-free Huber regression. Stat. Sinica, 31, 2153-2177.