glmtrans_inf {glmtrans}

R Documentation

Calculate asymptotic confidence intervals based on desparsified Lasso and two-step transfer learning method.

Description

Given the point estimate of the coefficient vector from glmtrans, calculate the asymptotic confidence interval of each component. The detailed inference algorithm is given as Algorithm 3 in the latest version of Tian, Y. and Feng, Y., 2021. The algorithm is constructed based on a modified version of the desparsified Lasso (Van de Geer, S. et al., 2014; Dezeure, R. et al., 2015).

Usage

glmtrans_inf(
  target,
  source = NULL,
  family = c("gaussian", "binomial", "poisson"),
  beta.hat = NULL,
  nodewise.transfer.source.id = "all",
  cores = 1,
  level = 0.95,
  intercept = TRUE,
  ...
)

Arguments

`target`	target data. Should be a list with elements x and y, where x indicates a predictor matrix with each row/column as a(n) observation/variable, and y indicates the response vector.
`source`	source data. Should be a list of sublists, where each sublist is a source data set with elements x and y, defined as in the target data.
`family`	response type. Can be "gaussian", "binomial" or "poisson". Default = "gaussian". "gaussian": Gaussian distribution. "binomial": logistic regression (binomial response). When `family = "binomial"`, the response in both `target` and `source` should be 0/1. "poisson": Poisson distribution. When `family = "poisson"`, the response in both `target` and `source` should be non-negative.
`beta.hat`	initial estimate of the coefficient vector (the intercept should be the first component). Can be from the output of function `glmtrans`.
`nodewise.transfer.source.id`	transferable source indices used in the inference (the set A in Algorithm 3 of Tian, Y. and Feng, Y., 2021). Can be a subset of `{1, ..., length(source)}`, "all" or `NULL`. Default = `"all"`. A subset of `{1, ..., length(source)}`: transfer only the sources with the specified indices. "all": transfer all sources. NULL: do not transfer any sources and use only the target data.
`cores`	the number of cores used for parallel computing. Default = 1.
`level`	the level of confidence interval. Default = 0.95. Note that the level here refers to the asymptotic level of confidence interval of a single component rather than the multiple intervals.
`intercept`	whether the model includes the intercept or not. Default = TRUE. Should be set as TRUE if the intercept of `beta.hat` is not zero.
`...`	additional arguments.
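
To illustrate the input shapes described above, here is a minimal sketch that builds a `target` list, a list of two source data sets, and a per-coefficient `level`. The sizes, the simulated data and the names `sources` and `per.level` are hypothetical, and the Bonferroni adjustment is an assumption added for illustration, not something the package performs.

```r
set.seed(1)
n.target <- 50; n.source <- 100; p <- 5   # toy sizes (hypothetical)

# target: a list with predictor matrix x (rows = observations) and response y
target <- list(x = matrix(rnorm(n.target * p), n.target, p),
               y = rbinom(n.target, size = 1, prob = 0.5))

# sources: a list of sublists, each shaped like target
sources <- lapply(1:2, function(k) {
  list(x = matrix(rnorm(n.source * p), n.source, p),
       y = rbinom(n.source, size = 1, prob = 0.5))
})

# `level` applies to each interval separately; a Bonferroni-adjusted
# per-coefficient level for 95% joint coverage over p coefficients
# (an illustrative assumption, not part of glmtrans)
per.level <- 1 - (1 - 0.95) / p
```

These objects could then be passed as `target = target`, `source = sources`, `family = "binomial"` and `level = per.level`, with `beta.hat` supplied from a `glmtrans` fit.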

Value

A list with the following components:

`b.hat`	the center of confidence intervals. A `p`-dimensional vector, where `p` is the number of predictors.
`beta.hat`	the initial estimate of the coefficient vector (the same as input).
`CI`	confidence intervals (CIs) with the specific level. A `p` by 3 matrix, where three columns indicate the center, lower limit and upper limit of CIs, respectively. Each row represents a coefficient component.
`var.est`	the estimate of variances in the CLT (Theta transpose times Sigma times Theta, in section 2.5 of Tian, Y. and Feng, Y., 2021). A `p`-dimensional vector, where `p` is the number of predictors.
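
The relation between the center, the variance estimate and the interval limits can be sketched as follows. This assumes the usual Wald form, center ± z × sqrt(var.est / n), with n the sample size; whether `glmtrans_inf` scales `var.est` in exactly this way is an assumption, and the helper `wald_ci` and its inputs are hypothetical.

```r
# Wald-type interval from a center, a CLT variance estimate and a sample size.
# Assumption: half-width = z * sqrt(var.est / n); check the package source
# before relying on this scaling.
wald_ci <- function(center, var.est, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)          # two-sided normal quantile
  half.width <- z * sqrt(var.est / n)
  cbind(center = center,
        lower = center - half.width,
        upper = center + half.width)
}

# toy inputs (hypothetical values, not package output)
ci <- wald_ci(center = c(0.5, -0.2), var.est = c(1, 4), n = 100)
ci
```

The result has one row per coefficient and three columns (center, lower limit, upper limit), mirroring the layout described for `CI`.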

References

Tian, Y. and Feng, Y., 2021. Transfer Learning under High-dimensional Generalized Linear Models. arXiv preprint arXiv:2105.14328.

Van de Geer, S., Bühlmann, P., Ritov, Y.A. and Dezeure, R., 2014. On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics, 42(3), pp.1166-1202.

Dezeure, R., Bühlmann, P., Meier, L. and Meinshausen, N., 2015. High-dimensional inference: confidence intervals, p-values and R-software hdi. Statistical Science, 30(4), pp.533-558.

Examples

## Not run: 
set.seed(0, kind = "L'Ecuyer-CMRG")

# generate binomial data
D.training <- models("binomial", type = "all", K = 2, p = 200)

# fit a logistic regression model via two-step transfer learning method
fit.binomial <- glmtrans(D.training$target, D.training$source, family = "binomial")

# calculate the CI based on the point estimate from two-step transfer learning method
fit.inf <- glmtrans_inf(target = D.training$target, source = D.training$source,
                        family = "binomial", beta.hat = fit.binomial$beta,
                        cores = 2)

## End(Not run)
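
Once a `CI` matrix is available, one common follow-up is to list the coefficients whose intervals exclude zero. The sketch below uses a hand-made matrix with the documented column order (center, lower limit, upper limit); the values and column names are toy placeholders, not output from `glmtrans_inf`, and columns are indexed by position for that reason.

```r
# toy CI matrix (hypothetical values): columns are center, lower, upper
CI <- cbind(center = c(0.8, 0.05, -0.6),
            lower  = c(0.4, -0.20, -1.1),
            upper  = c(1.2, 0.30, -0.1))

# an interval excludes zero if it lies entirely above or entirely below zero
excludes.zero <- CI[, 2] > 0 | CI[, 3] < 0

# row indices of coefficients flagged at the per-coefficient level
which(excludes.zero)
```

Because `level` is a per-coefficient level, flags obtained this way are not adjusted for multiple comparisons.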

[Package glmtrans version 2.0.0 Index]