gds {hdme}    R Documentation
Generalized Dantzig Selector
Description
Generalized Dantzig Selector
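In brief (see the references for the precise formulation and the exact scaling of lambda): the Dantzig selector of Candes and Tao (2007) estimates the coefficients of a linear model by solving

  minimize ||beta||_1  subject to  max | t(X) %*% (y - X %*% beta) | <= lambda,

and the generalized Dantzig selector of James and Radchenko (2009) replaces the least-squares residual y - X %*% beta with y - mu, where mu is the fitted mean of the generalized linear model, so that logistic and Poisson regression are covered as well.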
Usage
gds(X, y, lambda = NULL, family = "gaussian", weights = NULL)
Arguments
X: Design matrix.
y: Vector of response values.
lambda: Regularization parameter. Only a single value is supported.
family: Use "gaussian" for linear regression, "binomial" for logistic regression, and "poisson" for Poisson regression (the Gaussian and Poisson cases are sketched at the end of the Examples).
weights: A vector of weights for each row of X.
Value
Intercept and coefficients at the value of lambda specified.
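The fitted components can be extracted directly. A minimal sketch (with a simulated Gaussian response and an arbitrarily chosen lambda, used here only for illustration):

library(hdme)
n <- 100; p <- 5
X <- matrix(rnorm(n * p), nrow = n)
y <- as.numeric(X %*% c(1, 0.5, 0, 0, 0) + rnorm(n))
fit <- gds(X, y, family = "gaussian", lambda = 0.1)
fit$intercept  # estimated intercept
fit$beta       # estimated regression coefficients
coef(fit)      # coefficient estimates via the coef method, as in the Examples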
References
Candes E, Tao T (2007). “The Dantzig selector: Statistical estimation when p is much larger than n.” Ann. Statist., 35(6), 2313–2351.
James GM, Radchenko P (2009). “A generalized Dantzig selector with shrinkage tuning.” Biometrika, 96(2), 323–337.
Examples
# Example with logistic regression
n <- 1000 # Number of samples
p <- 10 # Number of covariates
X <- matrix(rnorm(n * p), nrow = n) # Design matrix
beta <- c(seq(from = 0.1, to = 1, length.out = 5), rep(0, p-5)) # True regression coefficients
y <- rbinom(n, 1, (1 + exp(-X %*% beta))^(-1)) # Binomially distributed response
fit <- gds(X, y, family = "binomial")
print(fit)
plot(fit)
coef(fit)
# Try with more penalization
fit <- gds(X, y, family = "binomial", lambda = 0.1)
coef(fit)
coef(fit, all = TRUE)
# Case weighting
# Assume we wish to put more emphasis on predicting the positive cases correctly
# In this case we give the 1s three times the weight of the zeros.
weights <- (y == 0) * 1 + (y == 1) * 3
fit_w <- gds(X, y, family = "binomial", weights = weights, lambda = 0.1)
# Next we test this on a new dataset, generated with the same parameters
X_new <- matrix(rnorm(n * p), nrow = n)
y_new <- rbinom(n, 1, (1 + exp(-X_new %*% beta))^(-1))
# We use a 50% threshold as the classification rule
# Unweighted classification
classification <- ((1 + exp(- fit$intercept - X_new %*% fit$beta))^(-1) > 0.5) * 1
# Weighted classification
classification_w <- ((1 + exp(- fit_w$intercept - X_new %*% fit_w$beta))^(-1) > 0.5) * 1
# As expected, the weighted classification predicts many more 1s than the
# unweighted classification, since the 1s are heavily up-weighted
table(classification, classification_w)
# Here we compare the performance of the weighted and unweighted models.
# The weighted model gets most of the 1s right, while the unweighted model
# achieves the higher overall accuracy.
table(classification, y_new)
table(classification_w, y_new)
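# The family argument also covers linear and Poisson regression. The lines
# below are a sketch added for illustration (lambda chosen arbitrarily),
# reusing the X and beta defined above.
# Linear regression with a continuous response:
y_lin <- as.numeric(X %*% beta + rnorm(n))
fit_lin <- gds(X, y_lin, family = "gaussian", lambda = 0.1)
coef(fit_lin)
# Poisson regression with a count response:
y_pois <- rpois(n, exp(as.numeric(X %*% beta)))
fit_pois <- gds(X, y_pois, family = "poisson", lambda = 0.1)
coef(fit_pois)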