| cv.splitSelect {splitSelect} | R Documentation | 
Split Selection Modeling for Low-Dimensional Data - Cross-Validation
Description
cv.splitSelect performs the best split selection algorithm with cross-validation.
Usage
cv.splitSelect(
  x,
  y,
  intercept = TRUE,
  G,
  use.all = TRUE,
  family = c("gaussian", "binomial")[1],
  group.model = c("glmnet", "LS", "Logistic")[1],
  alphas = 0,
  nsample = NULL,
  fix.partition = NULL,
  fix.split = NULL,
  nfolds = 10,
  parallel = FALSE,
  cores = getOption("mc.cores", 2L)
)
Arguments
x | 
 Design matrix.  | 
y | 
 Response vector.  | 
intercept | 
 Boolean variable to determine whether the model includes an intercept. Default is TRUE.  | 
G | 
 Number of groups into which the variables are split. Can have more than one value.  | 
use.all | 
 Boolean variable to determine if all variables must be used (default is TRUE).  | 
family | 
 Description of the error distribution and link function to be used for the model. Must be one of "gaussian" or "binomial".  | 
group.model | 
 Model used for the groups. Must be one of "glmnet", "LS" or "Logistic".  | 
alphas | 
 Elastic net mixing parameter. Should be between 0 (default) and 1.  | 
nsample | 
 Number of sample splits for each value of G. If NULL, then all splits will be considered (unless there is overflow).  | 
fix.partition | 
 Optional list with G elements indicating the partitions (in each row) to be considered for the splits.  | 
fix.split | 
 Optional matrix with p columns indicating the groups (in each row) to be considered for the splits.  | 
nfolds | 
 Number of folds for the cross-validation procedure.  | 
parallel | 
 Boolean variable to determine whether the function is run in parallel. Default is FALSE.  | 
cores | 
 Number of cores used when the function is run in parallel.  | 
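The nsample argument matters because the number of candidate splits grows quickly with the number of variables. As a rough illustration only, the sketch below counts the ways to split p variables into G groups, assuming every variable is used (as with use.all = TRUE), no group is empty, and the groups are unlabeled; under those assumptions the count is a Stirling number of the second kind. The helper count_splits is hypothetical and is not part of splitSelect.

```r
# Hypothetical helper (not from splitSelect): number of ways to split
# p variables into G non-empty, unlabeled groups, i.e. the Stirling
# number of the second kind S(p, G), via the inclusion-exclusion formula.
count_splits <- function(p, G) {
  j <- 0:G
  sum((-1)^j * choose(G, j) * (G - j)^p) / factorial(G)
}

count_splits(4, 2)   # 7

# The count grows rapidly with p for a fixed number of groups
sapply(c(5, 10, 15, 20), count_splits, G = 3)
```

With counts of this size, sampling a fixed number of splits per value of G is often the only practical option for moderate p.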
Value
An object of class cv.splitSelect.
Author(s)
Anthony-Alexander Christidis, anthony.christidis@stat.ubc.ca
See Also
coef.cv.splitSelect, predict.cv.splitSelect
Examples
# Setting the parameters
p <- 4
n <- 30
n.test <- 5000
beta <- rep(5, p)
rho <- 0.1
r <- 0.9
SNR <- 3
# Function to create a correlation matrix with entries r^|i-j|
target_cor <- function(r, p){
  Gamma <- diag(p)
  for(i in 1:(p-1)){
    for(j in (i+1):p){
      Gamma[i,j] <- Gamma[j,i] <- r^(abs(i-j))
    }
  }
  return(Gamma)
}
# AR Correlation Structure
Sigma.r <- target_cor(r, p)
Sigma.rho <- target_cor(rho, p)
sigma.epsilon <- as.numeric(sqrt((t(beta) %*% Sigma.rho %*% beta)/SNR))
# Simulate some data
x.train <- mvnfast::rmvn(n, mu=rep(0,p), sigma=Sigma.r)
y.train <- 1 + x.train %*% beta + rnorm(n=n, mean=0, sd=sigma.epsilon)
# Generating the coefficients for a fixed partition of the variables
split.out <- cv.splitSelect(x.train, y.train, G=2, use.all=TRUE,
                            fix.partition=list(matrix(c(2,2), 
                                               ncol=2, byrow=TRUE)), 
                            fix.split=NULL,
                            intercept=TRUE, group.model="glmnet", alphas=0, nfolds=10)
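The example above restricts the search to the partition (2, 2), that is, two groups of two variables each. To give a sense of what that restriction covers, the base R sketch below enumerates the distinct splits of four variables into two unlabeled groups of size two. The row-wise encoding used here (entry k holds the group label of variable k) is an assumption made for illustration and may differ from the representation used internally by cv.splitSelect or expected by fix.split.

```r
# Enumerate splits of p = 4 variables into two unlabeled groups of size 2.
# Assumed encoding: each row is one split, entry k is the group of variable k.
p <- 4
first_group <- combn(p, 2)   # each column: candidate members of group 1

# Keep only columns containing variable 1, so that a split and its
# mirror image (group labels swapped) are not counted twice.
first_group <- first_group[, first_group[1, ] == 1, drop = FALSE]

splits <- t(apply(first_group, 2, function(idx) {
  labels <- rep(2L, p)   # start with every variable in group 2
  labels[idx] <- 1L      # move the selected pair to group 1
  labels
}))

splits
```

Only three splits are compatible with this partition, so fixing the partition keeps the cross-validation in the example small.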