gridscore {rchemo}R Documentation

Tuning of predictive models on a validation dataset

Description

Functions for tuning predictive models on a validation set.

The functions return "scores" (average error rates) of predictions for a given model and a grid of parameter values, calculated on a validation dataset.

- gridscore: Can be used for any model.

- gridscorelv: Specific to models using regularization by latent variables (LVs) (e.g. PLSR). Much faster than gridscore.

- gridscorelb: Specific to models using ridge regularization (e.g. RR). Much faster than gridscore.

Usage


gridscore(Xtrain, Ytrain, X, Y, score, fun, pars, verb = FALSE)

gridscorelv(Xtrain, Ytrain, X, Y, score, fun, nlv, pars = NULL, verb = FALSE)

gridscorelb(Xtrain, Ytrain, X, Y, score, fun, lb, pars = NULL, verb = FALSE)

Arguments

Xtrain

Training X-data (n, p).

Ytrain

Training Y-data (n, q).

X

Validation X-data (n, p).

Y

Validation Y-data (n, q).

score

A function calculating a prediction score (e.g. msep).

fun

A function corresponding to the predictive model.

nlv

For gridscorelv. A vector of numbers of LVs.

lb

For gridscorelb. A vector of ridge regulariation parameters.

pars

A list of named vectors. Each vector must correspond to an argument of the model function and gives the parameter values to consider for this argument. (see details)

verb

Logical. If TRUE, fitting information are printed.

Details

Argument pars (the grid) must be a list of named vectors, each vector corresponding to an argument of the model function and giving the parameter values to consider for this argument. This list can eventually be built with function mpars, which returns all the combinations of the input parameters, see the examples.

For gridscorelv, pars must not contain nlv (nb. LVs), and for gridscorelb, lb (regularization parameter lambda).

Value

A dataframe with the prediction scores for the grid.

Note

Examples are given: - with PLSR, using gridscore and gridscorelv (much faster) - with PLSLDA, using gridscore and gridscorelv (much faster) - with RR, using gridscore and gridscorelb (much faster) - with KRR, using gridscore and gridscorelb (much faster) - with LWPLSR, using gridscorelv

Examples


## EXAMPLE WITH PLSR

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
ytrain <- rnorm(n)
Ytrain <- cbind(ytrain, 10 * rnorm(n))
m <- 3
Xtest <- Xtrain[1:m, ] 
Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1]

nlv <- 5
pars <- mpars(nlv = 1:nlv)
pars
gridscore(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = plskern, 
    pars = pars, verb = TRUE
    )

gridscorelv(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = plskern, 
    nlv = 0:nlv, verb = TRUE
    )

fm <- plskern(Xtrain, Ytrain, nlv = nlv)
pred <- predict(fm, Xtest)$pred
msep(pred, Ytest)

## EXAMPLE WITH PLSLDA

n <- 50 ; p <- 8
X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
y <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtrain <- X ; ytrain <- y
m <- 5
Xtest <- X[1:m, ] ; ytest <- y[1:m]

nlv <- 5
pars <- mpars(nlv = 1:nlv, prior = c("unif", "prop"))
pars
gridscore(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = plslda,
    pars = pars, verb = TRUE
    )

fm <- plslda(Xtrain, ytrain, nlv = nlv)
pred <- predict(fm, Xtest)$pred
err(pred, ytest)

pars <- mpars(prior = c("unif", "prop"))
pars
gridscorelv(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = plslda,
    nlv = 1:nlv, pars = pars, verb = TRUE
    )
    
## EXAMPLE WITH RR

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
ytrain <- rnorm(n)
Ytrain <- cbind(ytrain, 10 * rnorm(n))
m <- 3
Xtest <- Xtrain[1:m, ] 
Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1]

lb <- c(.1, 1)
pars <- mpars(lb = lb)
pars
gridscore(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = rr, 
    pars = pars, verb = TRUE
    )

gridscorelb(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = rr, 
    lb = lb, verb = TRUE
    )

## EXAMPLE WITH KRR

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
ytrain <- rnorm(n)
Ytrain <- cbind(ytrain, 10 * rnorm(n))
m <- 3
Xtest <- Xtrain[1:m, ] 
Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1]

lb <- c(.1, 1)
gamma <- 10^(-1:1)
pars <- mpars(lb = lb, gamma = gamma)
pars
gridscore(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = krr, 
    pars = pars, verb = TRUE
    )

pars <- mpars(gamma = gamma)
gridscorelb(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = krr, 
    lb = lb, pars = pars, verb = TRUE
    )
    
## EXAMPLE WITH LWPLSR

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
ytrain <- rnorm(n)
Ytrain <- cbind(ytrain, 10 * rnorm(n))
m <- 3
Xtest <- Xtrain[1:m, ] 
Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1]

nlvdis <- 5
h <- c(1, Inf)
k <- c(10, 20)
nlv <- 5
pars <- mpars(nlvdis = nlvdis, diss = "mahal",
    h = h, k = k)
pars
res <- gridscorelv(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = lwplsr, 
    nlv = 0:nlv, pars = pars, verb = TRUE
    )
res


[Package rchemo version 0.1-1 Index]