cv.boss {BOSSreg} | R Documentation |

## Cross-validation for Best Orthogonalized Subset Selection (BOSS) and Forward Stepwise Selection (FS).

### Description

Cross-validation for Best Orthogonalized Subset Selection (BOSS) and Forward Stepwise Selection (FS).

### Usage

```
cv.boss(
x,
y,
maxstep = min(nrow(x) - intercept - 1, ncol(x)),
intercept = TRUE,
n.folds = 10,
n.rep = 1,
show.warning = TRUE,
...
)
```

### Arguments

`x` |
A matrix of predictors, see |

`y` |
A vector of responses, see |

`maxstep` |
Maximum number of steps performed. Default is `min(nrow(x) - intercept - 1, ncol(x))`. |

`intercept` |
Logical, whether to fit an intercept term. Default is TRUE. |

`n.folds` |
The number of cross validation folds. Default is 10. |

`n.rep` |
The number of replications of cross validation. Default is 1. |

`show.warning` |
Logical, whether to display a warning if CV is performed for only a subset of the candidates, e.g. when n < p with 10-fold CV. Default is TRUE. |

`...` |
Arguments to |
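
The situation that `show.warning` refers to can be illustrated with the default `maxstep` formula from the Usage section. The sketch below assumes that each training set is subject to the same formula, which this page does not state explicitly; `default_maxstep` is a hypothetical helper, not part of the package.

```r
## Hypothetical helper that mirrors the default shown in Usage
default_maxstep <- function(n, p, intercept = TRUE) min(n - intercept - 1, p)

n <- 20; p <- 30; n.folds <- 10
n_train <- n - ceiling(n / n.folds)   # observations left after holding out one fold

full_steps <- default_maxstep(n, p)        # steps available on the full data
cv_steps <- default_maxstep(n_train, p)    # steps available on a training set
c(full = full_steps, cv = cv_steps)
```

Under this assumption, the training sets support fewer steps than the full data whenever n < p, so the largest candidates found on the full data cannot be evaluated by CV.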

### Details

This function fits BOSS and FS (`boss`) on the full dataset, and performs `n.folds`-fold cross-validation. The cross-validation process can be repeated `n.rep` times to evaluate the out-of-sample (OOS) performance of the candidate subsets given by both methods.
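
The repeated cross-validation described above can be sketched in base R. This is not the package's implementation: nested least squares fits on the first `size` columns stand in for the BOSS and FS candidates, mean squared error stands in for the deviance, and `cv_once` is a hypothetical helper.

```r
## Sketch only: nested least squares fits stand in for the BOSS/FS candidates
set.seed(1)
n <- 40; p <- 4
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- drop(x %*% c(1, 1, 0, 0)) + rnorm(n, sd = 0.1)
n.folds <- 5
n.rep <- 2

## Hypothetical helper: mean OOS squared error of each candidate, one replication
cv_once <- function(x, y, n.folds) {
  n <- nrow(x)
  fold_id <- sample(rep(seq_len(n.folds), length.out = n))  # random fold labels
  err <- matrix(NA_real_, nrow = n.folds, ncol = ncol(x) + 1)
  for (k in seq_len(n.folds)) {
    test <- fold_id == k
    for (size in 0:ncol(x)) {
      ## candidate `size` uses an intercept plus the first `size` columns
      x_train <- cbind(1, x[!test, seq_len(size), drop = FALSE])
      x_test <- cbind(1, x[test, seq_len(size), drop = FALSE])
      coef_hat <- qr.solve(x_train, y[!test])  # least squares on the training folds
      err[k, size + 1] <- mean((y[test] - x_test %*% coef_hat)^2)
    }
  }
  colMeans(err)  # average over the folds
}

## Repeat the whole procedure n.rep times and average over replications
cvm <- rowMeans(replicate(n.rep, cv_once(x, y, n.folds)))
cvm
```

Each replication draws a new random fold assignment, so averaging over replications reduces the dependence of the result on any single split.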

### Value

boss: A `boss` object fitted on the full dataset.

n.folds: The number of cross validation folds.

cvm.fs: Mean OOS deviance for each candidate given by FS.

cvm.boss: Mean OOS deviance for each candidate given by BOSS.

i.min.fs: The index of minimum cvm.fs.

i.min.boss: The index of minimum cvm.boss.
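
As a small illustration of how an index such as `i.min.boss` relates to a vector such as `cvm.boss`, the sketch below uses made-up numbers. Both `cvm` and `beta` are hypothetical stand-ins, not output from the package.

```r
## Made-up mean OOS deviances for five candidates
cvm <- c(2.10, 0.95, 0.31, 0.34, 0.40)
i.min <- which.min(cvm)   # index of the smallest mean OOS deviance

## Hypothetical coefficient matrix with one column per candidate
beta <- matrix(seq_len(15), nrow = 3, ncol = 5)
beta_selected <- beta[, i.min, drop = FALSE]  # keep the result as a one-column matrix
```

Using `drop = FALSE` keeps the selected coefficients as a matrix, which matches the indexing pattern used in the Examples section.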

### Author(s)

Sen Tian

### References

Tian, S., Hurvich, C. and Simonoff, J. (2021), On the Use of Information Criteria for Subset Selection in Least Squares Regression. https://arxiv.org/abs/1911.10191

BOSSreg Vignette https://github.com/sentian/BOSSreg/blob/master/r-package/vignettes/BOSSreg.pdf

### See Also

`predict` and `coef` methods for `cv.boss` objects, and the `boss` function.

### Examples

```
## Generate a trivial dataset, X has mean 0 and norm 1, y has mean 0
set.seed(11)
n = 20
p = 5
x = matrix(rnorm(n*p), nrow=n, ncol=p)
x = scale(x, center = colMeans(x))
x = scale(x, scale = sqrt(colSums(x^2)))
beta = c(1, 1, 0, 0, 0)
y = x%*%beta + scale(rnorm(n, sd=0.01), center = TRUE, scale = FALSE)
## Perform 10-fold CV without replication
boss_cv_result = cv.boss(x, y)
## Get the coefficient vector of BOSS that gives minimum CV OOS score (S3 method for cv.boss)
beta_boss_cv = coef(boss_cv_result)
# the above is equivalent to
boss_result = boss_cv_result$boss
beta_boss_cv = boss_result$beta_boss[, boss_cv_result$i.min.boss, drop=FALSE]
## Get the fitted values of BOSS-CV (S3 method for cv.boss)
mu_boss_cv = predict(boss_cv_result, newx=x)
# the above is equivalent to
mu_boss_cv = cbind(1,x) %*% beta_boss_cv
## Get the coefficient vector of FS that gives minimum CV OOS score (S3 method for cv.boss)
beta_fs_cv = coef(boss_cv_result, method='fs')
## Get the fitted values of FS-CV (S3 method for cv.boss)
mu_fs_cv = predict(boss_cv_result, newx=x, method='fs')
```

*BOSSreg* version 0.2.0