R: Parameter-transfer learning for partially linear models based...

trans.smap {matrans}

R Documentation

Parameter-transfer learning for partially linear models based on semiparametric model averaging.

Description

Obtain optimal weights and estimated coefficients based on Trans-SMAP.

Usage

trans.smap(train.data, nfold = NULL, bs.para, lm.set = NULL)

Arguments

`train.data`	a list containing the observations of predictors and response for fitting models. Should be a list with elements "data.y", "data.x" and "data.z", where "data.y" indicates a response list for all data sources, "data.x" indicates a parametric predictor list for all data sources, and "data.z" indicates a nonparametric predictor list for all data sources. Each element in "data.x" and "data.z" is a matrix with each row as an observation and each column as a variable. By default, the first element in "data.y", "data.x" and "data.z" is target data, and others are source data.
`nfold`	the number of folds for the cross-validation weight criterion. Default is NULL (leave-one-out).
`bs.para`	a list containing the parameters for B-spline construction in function `bs`. Should be a list with elements "bs.df" and "bs.degree", each component of which is a vector with the same length as the number of nonparametric variables. For example, bs.para = list(bs.df=c(3,3,3), bs.degree=c(3,3,3)). "bs.df": degrees of freedom for each nonparametric component; The details can be referred to the arguments in function `bs`. "bs.degree": degree of the piecewise polynomial for each nonparametric component; The default is 3 for cubic splines.
`lm.set`	the vector of indices for the linear regression models, which means the corresponding models are constructed by ordinary linear models instead of partially linear models. Default is NULL.

Value

a result list containing the estimated weight vector, the execution time of solving the optimal weights and the summarized results of fitting models.

References

Hu, X., & Zhang, X. (2023). Optimal Parameter-Transfer Learning by Semiparametric Model Averaging. Journal of Machine Learning Research, 24(358), 1-53.

Examples

## correct target model setting

# generate simulation dataset
coeff0 <- cbind(
  as.matrix(c(1.4, -1.2, 1, -0.8, 0.65, 0.3)),
  as.matrix(c(1.4, -1.2, 1, -0.8, 0.65, 0.3) + 0.02),
  as.matrix(c(1.4, -1.2, 1, -0.8, 0.65, 0.3) + 0.3),
  as.matrix(c(1.4, -1.2, 1, -0.8, 0.65, 0.3))
)
whole.data <- simdata.gen(
  px = 6, num.source = 4, size = c(150, 200, 200, 150), coeff0 = coeff0,
  coeff.mis = as.matrix(c(coeff0[, 2], 1.8)), err.sigma = 0.5, rho = 0.5, size.test = 500,
  sim.set = "homo", tar.spec = "cor", if.heter = FALSE
)
data.train <- whole.data$data.train
data.test <- whole.data$data.test

# running Trans-SMAP and obtain the optimal weight vector
data.train$data.x[[2]] <- data.train$data.x[[2]][, -7]
fit.transsmap <- trans.smap(
  train.data = data.train, nfold = 5,
  bs.para = list(bs.df = rep(3, 3), bs.degree = rep(3, 3))
)
ma.weights <- fit.transsmap$weight.est


## misspecified target model setting

# generate simulation dataset
coeff.mis <- matrix(c(c(coeff0[, 1], 0.1), c(coeff0[, 2], 1.8)), ncol = 2)
whole.data <- simdata.gen(
  px = 6, num.source = 4, size = c(150, 200, 200, 150), coeff0 = coeff0,
  coeff.mis = coeff.mis, err.sigma = 0.5, rho = 0.5, size.test = 500,
  sim.set = "homo", tar.spec = "mis", if.heter = FALSE
)
data.train <- whole.data$data.train
data.test <- whole.data$data.test

# running Trans-SMAP and obtain the optimal weight vector
data.train$data.x[[1]] <- data.train$data.x[[1]][, -7]
data.train$data.x[[2]] <- data.train$data.x[[2]][, -7]
fit.transsmap <- trans.smap(
  train.data = data.train, nfold = 5,
  bs.para = list(bs.df = rep(3, 3), bs.degree = rep(3, 3))
)
ma.weights <- fit.transsmap$weight.est

[Package matrans version 0.1.0 Index]