DebiasProgCV {DebiasInfer}R Documentation

The proposed debiasing (primal) program with cross-validation.

Description

This function implements our proposed debiasing program that selects the tuning parameter "\gamma/n" by cross-validation and returns the final debiasing weights.

Usage

DebiasProgCV(X, x, prop_score, gamma_lst = NULL, cv_fold = 5, cv_rule = "1se")

Arguments

X

The input design n*d matrix.

x

The current query point, which is a 1*d array.

prop_score

An n-dim numeric vector with (estimated) propensity scores as its entries.

gamma_lst

A numeric vector with candidate values for the regularization parameter "\gamma/n". (Default: gamma_lst=NULL. Then, gamma_lst contains 41 equally spacing value between 0.001 and max(abs(x)).)

cv_fold

The number of folds for cross-validation on the dual program. (Default: cv_fold=5.)

cv_rule

The criteria/rules for selecting the final value of the regularization parameter "\gamma/n" in the dual program. (Default: cv_rule="1se". The candidate choices include "1se", "minfeas", and "mincv".)

Value

A list that contains three elements.

w_obs

The final estimated weights by our debiasing program.

ll_obs

The final value of the solution to our debiasing dual program.

gamma_n_opt

The final value of the tuning parameter "\gamma/n" selected by cross-validation.

Author(s)

Yikun Zhang, yikunzhang@foxmail.com

References

Zhang, Y., Giessing, A. and Chen, Y.-C. (2023) Efficient Inference on High-Dimensional Linear Model with Missing Outcomes. https://arxiv.org/abs/2309.06429.

Examples


  require(MASS)
  require(glmnet)
  d = 1000
  n = 900

  Sigma = array(0, dim = c(d,d)) + diag(d)
  rho = 0.1
  for(i in 1:(d-1)){
    for(j in (i+1):d){
      if ((j < i+6) | (j > i+d-6)){
        Sigma[i,j] = rho
        Sigma[j,i] = rho
      }
    }
  }
  sig = 1

  ## Current query point
  x_cur = rep(0, d)
  x_cur[c(1, 2, 3, 7, 8)] = c(1, 1/2, 1/4, 1/2, 1/8)
  x_cur = array(x_cur, dim = c(1,d))

  ## True regression coefficient
  s_beta = 5
  beta_0 = rep(0, d)
  beta_0[1:s_beta] = sqrt(5)

  ## Generate the design matrix and outcomes
  X_sim = mvrnorm(n, mu = rep(0, d), Sigma)
  eps_err_sim = sig * rnorm(n)
  Y_sim = drop(X_sim %*% beta_0) + eps_err_sim

  obs_prob = 1 / (1 + exp(-1 + X_sim[, 7] - X_sim[, 8]))
  R_sim = rep(1, n)
  R_sim[runif(n) >= obs_prob] = 0

  ## Estimate the propensity scores via the Lasso-type generalized linear model
  zeta = 5*sqrt(log(d)/n)/n
  lr1 = glmnet(X_sim, R_sim, family = "binomial", alpha = 1, lambda = zeta,
               standardize = TRUE, thresh=1e-6)
  prop_score = drop(predict(lr1, newx = X_sim, type = "response"))

  ## Estimate the debiasing weights with the tuning parameter selected by cross-validations.
  deb_res = DebiasProgCV(X_sim, x_cur, prop_score, gamma_lst = c(0.1, 0.5, 1),
                         cv_fold = 5, cv_rule = '1se')



[Package DebiasInfer version 0.2 Index]