splitSelect {splitSelect} | R Documentation |
Best Split Selection Modeling for Low-Dimensional Data
Description
splitSelect performs the best split selection algorithm.
Usage
splitSelect(
x,
y,
intercept = TRUE,
G,
use.all = TRUE,
family = c("gaussian", "binomial")[1],
group.model = c("glmnet", "LS", "Logistic")[1],
lambdas = NULL,
alphas = 0,
nsample = NULL,
fix.partition = NULL,
fix.split = NULL,
parallel = FALSE,
cores = getOption("mc.cores", 2L),
verbose = TRUE
)
Arguments
x |
Design matrix. |
y |
Response vector. |
intercept |
Boolean variable to determine whether an intercept is included (default is TRUE). |
G |
Number of groups into which the variables are split. Can have more than one value. |
use.all |
Boolean variable to determine if all variables must be used (default is TRUE). |
family |
Description of the error distribution and link function to be used for the model. Must be one of "gaussian" or "binomial". |
group.model |
Model used for the groups. Must be one of "glmnet", "LS" or "Logistic". |
lambdas |
The shrinkage parameters for the "glmnet" regularization. If NULL (default), optimal values are chosen. |
alphas |
Elastic net mixing parameter. Should be between 0 (default) and 1. |
nsample |
Number of sample splits for each value of G. If NULL, then all splits will be considered (unless there is overflow). |
fix.partition |
Optional list with G elements indicating the partitions (in each row) to be considered for the splits. |
fix.split |
Optional matrix with p columns indicating the groups (in each row) to be considered for the splits. |
parallel |
Boolean variable to determine whether the function is run in parallel. Default is FALSE. |
cores |
Number of cores used for the parallelization of the function. |
verbose |
Boolean variable to determine if console output for cross-validation progress is printed (default is TRUE). |
Value
An object of class splitSelect.
Author(s)
Anthony-Alexander Christidis, anthony.christidis@stat.ubc.ca
See Also
coef.splitSelect, predict.splitSelect
Examples
# Setting the parameters
p <- 4
n <- 30
n.test <- 5000
beta <- rep(5,4)
rho <- 0.1
r <- 0.9
SNR <- 3
# Function creating the target correlation matrix with "kernel" parameter r
target_cor <- function(r, p){
  Gamma <- diag(p)
  for(i in 1:(p-1)){
    for(j in (i+1):p){
      Gamma[i,j] <- Gamma[j,i] <- r^(abs(i-j))
    }
  }
  return(Gamma)
}
# AR Correlation Structure
Sigma.r <- target_cor(r, p)
Sigma.rho <- target_cor(rho, p)
sigma.epsilon <- as.numeric(sqrt((t(beta) %*% Sigma.rho %*% beta)/SNR))
# Simulate some data
x.train <- mvnfast::rmvn(n, mu=rep(0,p), sigma=Sigma.r)
y.train <- 1 + x.train %*% beta + rnorm(n=n, mean=0, sd=sigma.epsilon)
# Generating the coefficients for a fixed partition of the variables
split.out <- splitSelect(x.train, y.train, G=2, use.all=TRUE,
                         fix.partition=list(matrix(c(2,2),
                                                   ncol=2, byrow=TRUE)),
                         fix.split=NULL,
                         intercept=TRUE, group.model="glmnet", alphas=0)
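
To make the roles of G and fix.partition more concrete, the sketch below enumerates the candidate splits for the example above in base R. It is only an illustration, not the package's internal code: it assumes use.all = TRUE (every variable is assigned to a group) and that groups are unlabeled, and the helper name enumerate_splits is hypothetical.

```r
# Illustrative base-R sketch (hypothetical helper, not part of splitSelect):
# enumerate every way to assign p >= 2 variables to G non-empty, unlabeled groups.
enumerate_splits <- function(p, G) {
  # All G^p labelings of the variables, one labeling per row
  labelings <- as.matrix(expand.grid(rep(list(seq_len(G)), p)))
  # Relabel groups in order of first appearance, so labelings that differ
  # only by a permutation of the group labels collapse to one canonical row
  canonical <- unique(t(apply(labelings, 1, function(z) match(z, unique(z)))))
  # Keep only the splits that actually use all G groups
  keep <- apply(canonical, 1, function(z) length(unique(z)) == G)
  out <- canonical[keep, , drop = FALSE]
  dimnames(out) <- NULL
  out
}

# All splits of p = 4 variables into G = 2 groups
all_splits <- enumerate_splits(p = 4, G = 2)
nrow(all_splits)  # 7

# Restrict to the partition with group sizes (2, 2), as in fix.partition above
sizes <- t(apply(all_splits, 1, function(z) sort(tabulate(z, nbins = 2))))
balanced <- all_splits[sizes[, 1] == 2 & sizes[, 2] == 2, , drop = FALSE]
nrow(balanced)    # 3
```

Each row of all_splits gives the group index of each of the four variables. The count grows quickly with p, which is why sampling a subset of splits (nsample) can be preferable to full enumeration.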