holdout {stuart} | R Documentation |
Data selection for holdout validation.
Description
Split a data.frame into two subsets for holdout validation.
Usage
holdout(data, prop = 0.5, grouping = NULL, seed = NULL, determined = NULL)
Arguments
data |
A data.frame. |
prop |
A single value or a vector of proportions giving the share of the data assigned to the calibration sample. Defaults to .5, for an even split. |
grouping |
Name of the grouping variable. Providing a grouping variable ensures that the provided proportion is selected within each group. |
seed |
A random seed. See |
determined |
Name of a variable indicating the pre-determined assignment to the calibration or the validation sample. This variable must be a factor containing only |
Value
Returns a list containing two data.frames, called calibrate and validate. The first corresponds to the calibration sample, the second to the validation sample.
Author(s)
Martin Schultze
See Also
Examples
# seeded selection: 75% calibration sample, 25% validation sample
data(fairplayer)
split <- holdout(fairplayer, .75, seed = 55635)
lapply(split, nrow) # check size of samples
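The effect of a grouping variable can be illustrated with a minimal base R sketch. This is not the package's implementation: holdout_sketch, the toy data, and the rounding rule (round(prop * n) rows per group) are assumptions made for illustration only. It shows the proportion being drawn within each group and the result being returned as a list with calibrate and validate elements.

```r
# Hypothetical helper, NOT part of stuart: illustrates selecting a
# proportion of rows within each level of a grouping variable.
# Assumption: round(prop * n) rows per group go to the calibration sample.
holdout_sketch <- function(data, prop = 0.5, grouping = NULL, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  rows <- seq_len(nrow(data))
  # Without a grouping variable, all rows form a single group.
  groups <- if (is.null(grouping)) list(rows) else split(rows, data[[grouping]])
  # Draw the calibration row indices separately within each group.
  picked <- unlist(lapply(groups, function(idx) {
    idx[sample.int(length(idx), size = round(prop * length(idx)))]
  }), use.names = FALSE)
  list(calibrate = data[sort(picked), , drop = FALSE],
       validate = data[setdiff(rows, picked), , drop = FALSE])
}

# Toy data (made up): 40 rows in group "a", 20 rows in group "b".
toy <- data.frame(id = 1:60, g = rep(c("a", "b"), times = c(40, 20)))
parts <- holdout_sketch(toy, prop = 0.75, grouping = "g", seed = 1)
table(parts$calibrate$g) # 30 rows from "a", 15 rows from "b"
lapply(parts, nrow)      # 45 calibration rows, 15 validation rows
```

Because the draw happens per group, both groups keep the same 75/25 ratio, which an ungrouped draw would only match on average.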