cv.customizedGlmnet {customizedTraining}  R Documentation 
Does k-fold cross-validation for customizedGlmnet and returns values for G and lambda
cv.customizedGlmnet(xTrain, yTrain, xTest = NULL, groupid = NULL, Gs = NULL,
dendrogram = NULL, dendrogramCV = NULL, lambda = NULL,
nfolds = 10, foldid = NULL, keep = FALSE,
family = c("gaussian", "binomial", "multinomial"), verbose = FALSE)
xTrain 
an n-by-p matrix of training covariates 
yTrain 
a length-n vector of training responses. Numeric for family = 
xTest 
an m-by-p matrix of test covariates. May be left NULL, in which case cross-validation predictions are made internally on the training set and no test predictions are returned. 
groupid 
an optional length-m vector of group memberships for the test set. If
specified, customized training subsets are identified using the union of
nearest-neighbor sets for each test group, in which case cross-validation is
used only to select the regularization parameter 
Gs 
a vector of positive integers indicating the numbers of clusters over which to
perform cross-validation to determine the best number. Ignored if 
dendrogram 
optional output from 
dendrogramCV 
optional output from 
lambda 
sequence of values to use for the regularization parameter lambda. Recommended
to leave as NULL and allow 
nfolds 
number of folds – default is 10. Ignored if foldid is specified 
foldid 
an optional length-n vector of fold memberships used for cross-validation 
keep 
Should fitted values on the training set from cross-validation be included in the output? Default is FALSE. 
family 
response type 
verbose 
Should progress be printed to the console as folds are evaluated during cross-validation? Default is FALSE. 
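The foldid argument described above expects one fold label per training observation. A minimal base-R sketch of building such a vector with balanced, randomly assigned folds follows; the sizes n = 150 and nfolds = 10 and the seed are illustrative assumptions, not values required by the package.

```r
# Illustrative sizes (assumptions, not package defaults you must use)
n <- 150        # number of training observations
nfolds <- 10    # number of cross-validation folds

set.seed(1)     # arbitrary seed, only for reproducibility of this sketch

# Repeat the labels 1..nfolds until the vector has length n, then shuffle
# so that each observation gets a random fold of (nearly) equal size.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Inspect the fold sizes
fold.sizes <- table(foldid)
print(fold.sizes)
```

A vector built this way can be passed as foldid, in which case nfolds is ignored. Reusing the same foldid across several fits keeps their cross-validation errors comparable.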
an object of class cv.customizedGlmnet
call 
the call that produced this object 
G.min 
unless groupid is specified, the number of clusters minimizing CV error 
lambda 
the sequence of values of the regularization parameter 
lambda.min 
the value of the regularization parameter 
error 
a matrix containing the CV error for each 
fit 
a 
prediction 
a length-m vector of predictions for the test set, using the tuning parameters
which minimize cross-validation error. Only returned if 
selected 
a list of nonzero variables for each customized training set, using

cv.fit 
an array containing fitted values on the training set from cross-validation.
Only returned if 
Scott Powers, Trevor Hastie, Robert Tibshirani
Scott Powers, Trevor Hastie and Robert Tibshirani (2015) "Customized training with an application to mass spectrometric imaging of gastric cancer data." Annals of Applied Statistics 9, 4:1709-1725.
customizedGlmnet, plot.cv.customizedGlmnet, predict.cv.customizedGlmnet
require(glmnet)
# Simulate synthetic data
n = m = 150
p = 50
q = 5
K = 3
sigmaC = 10
sigmaX = sigmaY = 1
set.seed(5914)
beta = matrix(0, nrow = p, ncol = K)
for (k in 1:K) beta[sample(1:p, q), k] = 1
c = matrix(rnorm(K*p, 0, sigmaC), K, p)
eta = rnorm(K)
pi = (exp(eta)+1)/sum(exp(eta)+1)
z = t(rmultinom(m + n, 1, pi))
x = crossprod(t(z), c) + matrix(rnorm((m + n)*p, 0, sigmaX), m + n, p)
y = rowSums(z*(crossprod(t(x), beta))) + rnorm(m + n, 0, sigmaY)
x.train = x[1:n, ]
y.train = y[1:n]
x.test = x[n + 1:m, ]
y.test = y[n + 1:m]
foldid = sample(rep(1:10, length = nrow(x.train)))
# Example 1: Use clustering to fit the customized training model to training
# and test data with no predefined test-set blocks
fit1 = cv.customizedGlmnet(x.train, y.train, x.test, Gs = c(1, 2, 3, 5),
family = "gaussian", foldid = foldid)
# Print the optimal number of groups and value of lambda:
fit1$G.min
fit1$lambda.min
# Print the customized training model fit:
fit1
# Compute test error using the predict function:
mean((y[n + 1:m] - predict(fit1))^2)
# Plot nonzero coefficients by group:
plot(fit1)
# Example 2: If the test set has predefined blocks, use these blocks to define
# the customized training sets, instead of using clustering.
foldid = apply(z == 1, 1, which)[1:n]
group.id = apply(z == 1, 1, which)[n + 1:m]
fit2 = cv.customizedGlmnet(x.train, y.train, x.test, group.id, foldid = foldid)
# Print the optimal value of lambda:
fit2$lambda.min
# Print the customized training model fit:
fit2
# Compute test error using the predict function:
mean((y[n + 1:m] - predict(fit2))^2)
# Plot nonzero coefficients by group:
plot(fit2)
# Example 3: If there is no test set, but the training set is organized into
# blocks, you can do cross-validation with these blocks as the basis for the
# customized training sets.
fit3 = cv.customizedGlmnet(x.train, y.train, foldid = foldid)
# Print the optimal value of lambda:
fit3$lambda.min
# Print the customized training model fit:
fit3
# Compute test error using the predict function:
mean((y[n + 1:m] - predict(fit3))^2)
# Plot nonzero coefficients by group:
plot(fit3)
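The test-error lines in the examples index the response with n + 1:m. In R the colon operator binds more tightly than +, so this selects rows n + 1 through n + m. The short base-R sketch below checks that idiom and the mean-squared-error formula on toy numbers. The vectors y and yhat are made up for illustration and do not come from the package.

```r
# Toy sizes and data (hypothetical, for illustration only)
n <- 3
m <- 4
y <- c(10, 20, 30, 40, 50, 60, 70)

# `:` is evaluated before `+`, so this is n + (1:m), i.e. 4, 5, 6, 7
idx <- n + 1:m
stopifnot(all(idx == (n + 1):(n + m)))

# Held-out responses and some made-up predictions for them
y.test <- y[idx]
yhat <- y.test + c(1, -1, 2, -2)

# Mean squared error, as in the examples above
mse <- mean((y.test - yhat)^2)
print(mse)
```

Writing (n + 1):(n + m) explicitly gives the same rows and may be easier to read.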