R: Variable selection for confounders

VSE_PS {AteMeVs}

R Documentation

Variable selection for confounders

Description

This function minimizes a penalized quadratic loss function to select the informative confounders.

Usage

VSE_PS(V,y,method="lasso",cv="TRUE",alpha=1)

Arguments

`V`	a user-specified matrix in the quadratic loss function
`y`	a vector obtained from `SIMEX_EST`
`method`	the penalty function, one of `"lasso"` (Tibshirani 1996), `"scad"` (Fan and Li 2001) or `"mcp"` (Zhang 2010). The default is `method="lasso"`.
`cv`	how the tuning parameter is chosen, given as a character string. `cv="TRUE"` uses cross-validation, and `cv="FALSE"` uses the BIC. The default is `cv="TRUE"`.
`alpha`	the constant appearing in the Elastic Net penalty (Zou and Hastie 2005). The default value is 1.
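
As a point of reference for `alpha`, the expression below writes the elastic net penalty in the parametrization used by the glmnet package. That this function follows the same convention is an assumption, suggested only by the default `alpha=1` being paired with `method="lasso"`. The symbols \gamma_j (treatment-model coefficients), \lambda (tuning parameter) and p (number of coefficients) are notation introduced here for illustration.

```latex
P_{\lambda,\alpha}(\gamma) = \lambda \sum_{j=1}^{p} \left\{ \alpha\,|\gamma_j| + \frac{1-\alpha}{2}\,\gamma_j^{2} \right\}
```

Under this convention, `alpha=1` gives the lasso penalty and `alpha=0` gives a ridge penalty.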

Details

This function performs variable selection for informative confounders, with the penalty function chosen through `method`.
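
To give some intuition for how a penalized quadratic loss produces exact zeros, here is a minimal base R sketch for the special case of a lasso penalty with an identity weight matrix. It is not the AteMeVs implementation: the helper `soft_threshold`, the input `y_hat` and the value of `lambda` are all hypothetical and chosen only for illustration.

```r
# Hypothetical illustration, not the AteMeVs implementation.
# With an identity weight matrix, minimizing
#   0.5 * sum((y - gamma)^2) + lambda * sum(abs(gamma))
# over gamma has the closed-form soft-thresholding solution below.
soft_threshold <- function(y, lambda) {
  sign(y) * pmax(abs(y) - lambda, 0)
}

y_hat  <- c(1.10, -0.95, 0.04, -0.02, 0.60)  # made-up unpenalized estimates
lambda <- 0.10                               # made-up tuning parameter

gamma_hat <- soft_threshold(y_hat, lambda)
print(gamma_hat)
```

Entries of `y_hat` whose absolute value is below `lambda` are set exactly to zero, and the remaining entries are shrunk toward zero by `lambda`.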

Value

a vector of estimators in the treatment model. Components equal to zero correspond to unimportant confounders that should be excluded; nonzero components identify important confounders that enter the treatment model.
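
The zero and nonzero pattern described above can be turned into index sets with base R. The sketch below uses a made-up vector `est` in place of actual output and assumes the returned object can be treated as a plain numeric vector; the names `est`, `selected` and `excluded` are hypothetical.

```r
# Hypothetical vector standing in for the output of VSE_PS().
est <- c(0.98, 1.03, 0, 0, 0, 0.87, 0, 0)

selected <- which(est != 0)   # positions of confounders kept in the treatment model
excluded <- which(est == 0)   # positions of confounders to be dropped

print(selected)
print(excluded)
```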

Author(s)

Chen, L.-P. and Yi, G. Y.

References

Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267-288.
Yi, G. Y. and Chen, L.-P. (2023). Estimation of the average treatment effect with variable selection and measurement error simultaneously addressed for potential confounders. Statistical Methods in Medical Research, 32, 691-711.
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38, 894-942.
Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 67, 301-320.

Examples


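library(AteMeVs)   # provides DG(), SIMEX_EST() and VSE_PS()
library(MASS)      # provides mvrnorm()
n = 800            # sample size
p_x = 10           # dimension of X (length of gamma_X)
p_z = 10           # dimension of Z (length of gamma_Z)
p = p_x + p_z      # total number of parameters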
gamma_X = c(rep(1,2),rep(0,p_x-2))
gamma_Z = c(rep(1,2),rep(0,p_z-2))
gamma = c(gamma_X, gamma_Z)

mu_X = rep(0,p_x)
mu_Z = rep(0,p_z)

Sigma_X = diag(1,p_x,p_x)
Sigma_Z = diag(1,p_z,p_z)
Sigma_e = diag(0.2,p_x)
X = mvrnorm(n, mu_X, Sigma_X, tol = 1e-6, empirical = FALSE, EISPACK = FALSE)
Z = mvrnorm(n, mu_Z, Sigma_Z, tol = 1e-6, empirical = FALSE, EISPACK = FALSE)
data = DG(X,Z,gamma_X,gamma_Z,Sigma_e,outcome="continuous")


# estimates from SIMEX_EST, coerced to a plain vector
y = as.vector(SIMEX_EST(data,PS="logistic",Psi=seq(0,2,length.out=10),p_x=length(gamma_X),
              K=5,Sigma_e=Sigma_e))
# identity matrix used as V in the quadratic loss
V = diag(1,length(y),length(y))

VSE_PS(V,y,method="lasso",cv="TRUE")
VSE_PS(V,y,method="scad",cv="TRUE")
VSE_PS(V,y,method="mcp",cv="TRUE")

[Package AteMeVs version 0.1.0 Index]