cpg {CPGLIB}    R Documentation

Competing Proximal Gradients Library for Ensembles of Generalized Linear Models

Description

cpg computes the coefficients for ensembles of generalized linear models via competing proximal gradients.
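For readers unfamiliar with proximal gradient methods, the following is a minimal, generic sketch for a single lasso-penalized linear model. It is not the algorithm implemented in cpg. The names soft_threshold and ista_lasso are hypothetical helpers introduced only for this illustration, and the tolerance and max_iter arguments merely mirror the names used by cpg.

```r
# Soft-thresholding: the proximal operator of t * ||b||_1
soft_threshold <- function(b, t) sign(b) * pmax(abs(b) - t, 0)

# Generic proximal gradient descent (ISTA) for
#   minimize (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1
# Illustrative only; this is NOT the competing proximal gradients algorithm.
ista_lasso <- function(X, y, lambda, tolerance = 1e-5, max_iter = 1e5) {
  n <- nrow(X)
  # Step size 1 / L, with L the Lipschitz constant of the gradient
  L <- max(eigen(crossprod(X) / n, symmetric = TRUE,
                 only.values = TRUE)$values)
  b <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    # Gradient of the smooth (squared-error) part of the objective
    grad <- -drop(crossprod(X, y - X %*% b)) / n
    # Gradient step followed by the proximal (soft-thresholding) step
    b_new <- soft_threshold(b - grad / L, lambda / L)
    # Stop when the largest coefficient change falls below the tolerance
    converged <- max(abs(b_new - b)) < tolerance
    b <- b_new
    if (converged) break
  }
  b
}

# Small simulated example: only the first predictor is active
set.seed(1)
X <- matrix(rnorm(100 * 10), 100, 10)
y <- 2 * X[, 1] + rnorm(100, sd = 0.1)
b_hat <- ista_lasso(X, y, lambda = 0.1)
```

The stopping rule compares successive coefficient vectors, which is one common reading of a convergence criterion "for the coefficients"; the exact rule used by cpg is not stated on this page.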

Usage

cpg(
  x,
  y,
  glm_type = c("Linear", "Logistic", "Gamma", "Poisson")[1],
  G = 5,
  include_intercept = TRUE,
  alpha_s = 3/4,
  alpha_d = 1,
  lambda_sparsity,
  lambda_diversity,
  balanced_cycling = TRUE,
  permutate_search = FALSE,
  acceleration = FALSE,
  tolerance = 1e-05,
  max_iter = 1e+05
)

Arguments

x

Design matrix.

y

Response vector.

glm_type

Description of the error distribution and link function to be used for the model. Must be one of "Linear", "Logistic", "Gamma" or "Poisson". Default is "Linear".

G

Number of groups in the ensemble.

include_intercept

Logical; whether an intercept is included in the models. Default is TRUE.

alpha_s

Sparsity mixing parameter. Default is 3/4.

alpha_d

Diversity mixing parameter. Default is 1.

lambda_sparsity

Sparsity tuning parameter value.

lambda_diversity

Diversity tuning parameter value.

balanced_cycling

Argument to determine the cycling strategy for the optimal solution search. Default is TRUE.

permutate_search

Argument to determine whether permutations are used to search for the optimal solution. Default is FALSE.

acceleration

Argument to determine whether a gradient acceleration method is used. Default is FALSE.

tolerance

Convergence criterion for the coefficients. Default is 1e-5.

max_iter

Maximum number of iterations in the algorithm. Default is 1e5.

Value

An object of class cpg

Author(s)

Anthony-Alexander Christidis, anthony.christidis@stat.ubc.ca

See Also

coef.CPGLIB, predict.CPGLIB

Examples


# Data simulation
set.seed(1)
n <- 50
N <- 2000
p <- 300
beta.active <- c(abs(runif(p, 0, 1/2))*(-1)^rbinom(p, 1, 0.3))
# Parameters
p.active <- 150
beta <- c(beta.active[1:p.active], rep(0, p-p.active))
Sigma <- matrix(0, p, p)
Sigma[1:p.active, 1:p.active] <- 0.5
diag(Sigma) <- 1

# Train data
x.train <- mvnfast::rmvn(n, mu = rep(0, p), sigma = Sigma)
prob.train <- exp(x.train %*% beta)/
              (1+exp(x.train %*% beta))
y.train <- rbinom(n, 1, prob.train)
# Test data
x.test <- mvnfast::rmvn(N, mu = rep(0, p), sigma = Sigma)
prob.test <- exp(x.test %*% beta)/
             (1+exp(x.test %*% beta))
y.test <- rbinom(N, 1, prob.test)

# CPGLIB - Multiple Groups
cpg.out <- cpg(x.train, y.train,
               glm_type = "Logistic",
               G = 5, include_intercept = TRUE,
               alpha_s = 3/4, alpha_d = 1,
               lambda_sparsity = 0.01, lambda_diversity = 1,
               balanced_cycling = TRUE,
               tolerance = 1e-5, max_iter = 1e5)

# Predictions
cpg.prob <- predict(cpg.out, newx = x.test, type = "prob", 
                    groups = 1:cpg.out$G, ensemble_type = "Model-Avg")
cpg.class <- predict(cpg.out, newx = x.test, type = "class",
                     groups = 1:cpg.out$G, ensemble_type = "Model-Avg")
plot(prob.test, cpg.prob, pch = 20)
abline(h = 0.5, v = 0.5)
mean((prob.test-cpg.prob)^2)
mean(abs(y.test-cpg.class))
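
The simulation and evaluation steps above can also be sketched with base R only, which may help when mvnfast is not installed. This is a minimal sketch under stated assumptions: the correlated design is drawn via a Cholesky factor instead of mvnfast::rmvn, the dimensions are smaller than in the example, and the true probabilities stand in for fitted ones (prob_hat is a hypothetical placeholder, not output from cpg).

```r
set.seed(1)
n <- 200
p <- 10

# Equicorrelated covariance matrix with unit variances
Sigma <- matrix(0.5, p, p)
diag(Sigma) <- 1

# chol() returns upper-triangular R with t(R) %*% R = Sigma,
# so the rows of Z %*% R have covariance Sigma
R <- chol(Sigma)
x <- matrix(rnorm(n * p), n, p) %*% R

# Logistic model: plogis(eta) equals exp(eta) / (1 + exp(eta))
beta <- c(rep(0.5, 3), rep(0, p - 3))
prob <- drop(plogis(x %*% beta))
y <- rbinom(n, 1, prob)

# Placeholder for predicted probabilities (assumption: the true
# probabilities are used here in place of a fitted model's output)
prob_hat <- prob

# Threshold at 0.5 to obtain class labels, then evaluate
class_hat <- as.integer(prob_hat > 0.5)
mse_prob <- mean((prob - prob_hat)^2)       # squared error of probabilities
misclass_rate <- mean(abs(y - class_hat))   # misclassification rate
```

Because class_hat and y both take values in {0, 1}, the mean absolute difference is the proportion of misclassified observations, which is why the class predictions (not the probabilities) belong in that last computation.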




[Package CPGLIB version 1.0.1 Index]