Split {sharp} | R Documentation |
Splitting observations into non-overlapping sets
Description
Generates a list of length(tau) non-overlapping sets of observation IDs.
Usage
Split(data, family = NULL, tau = c(0.5, 0.25, 0.25))
Arguments
data |
vector or matrix of data. In regression, this should be the outcome data. |
family |
type of regression model. This argument is defined as in
|
tau |
vector of the proportion of observations in each of the sets. |
Details
With categorical outcomes (i.e. when the family argument is set to "binomial", "multinomial" or "cox"), the split is done such that the proportion of observations from each of the categories in each of the sets is representative of that of the full sample.
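The idea of a balanced split can be illustrated with a minimal base R sketch. This is not the code used by Split: the helper stratified_split is hypothetical and simply assumes that the observation IDs of each category are shuffled and then cut according to tau, so that each set receives roughly the same share of every category.

```r
# Hypothetical helper (an assumption for illustration, not the sharp implementation):
# splits observation IDs into length(tau) sets, separately within each category of y.
stratified_split <- function(y, tau = c(0.5, 0.25, 0.25)) {
  stopifnot(abs(sum(tau) - 1) < 1e-8)
  sets <- rep(list(integer(0)), length(tau))
  for (level in unique(y)) {
    ids <- which(y == level)
    # Shuffle the IDs of this category (indexing avoids sample() on a single ID)
    ids <- ids[sample.int(length(ids))]
    # Cut points so that set k receives roughly tau[k] of this category
    breaks <- round(cumsum(tau) * length(ids))
    starts <- c(0, breaks[-length(breaks)]) + 1
    for (k in seq_along(tau)) {
      if (breaks[k] >= starts[k]) {
        sets[[k]] <- c(sets[[k]], ids[starts[k]:breaks[k]])
      }
    }
  }
  lapply(sets, sort)
}

# Simulated binary outcome with about 30% of ones
set.seed(1)
y <- rbinom(200, size = 1, prob = 0.3)
ids <- stratified_split(y)

# Proportion of each category within each set, to compare with the full sample
lapply(ids, function(x) prop.table(table(y[x])))
prop.table(table(y))
```

Because the cut points are rounded within each category, the proportions in each set match those of the full sample only approximately, and the approximation is coarser for rare categories.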
Value
A list of length length(tau) containing non-overlapping sets of observation IDs.
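The structure of the returned value can be checked with a few lines of base R. The sketch below does not call Split: it builds a stand-in list of ID sets by simple random partition (an assumption made so that the example runs without the package), then verifies that there is one set per entry of tau and that no ID is shared between sets.

```r
# Stand-in for the output of Split(): a simple random partition built in base R
set.seed(1)
n <- 100
tau <- c(0.5, 0.25, 0.25)
shuffled <- sample.int(n)
# Assign positions 1..n to sets according to the cumulative proportions
set_index <- cut(seq_len(n), breaks = c(0, round(cumsum(tau) * n)), labels = FALSE)
ids <- unname(split(shuffled, set_index))

# One set per entry of tau
length(ids) == length(tau)
# No observation ID appears in more than one set
anyDuplicated(unlist(ids)) == 0
# Set sizes relative to n, to compare with tau
sapply(ids, length) / n
```

The same three checks can be applied to any list of observation IDs, including one returned by Split.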
Examples
# Splitting into 3 sets
simul <- SimulateRegression()
ids <- Split(data = simul$ydata)
lapply(ids, length)
# Balanced splits with respect to a binary variable
simul <- SimulateRegression(family = "binomial")
ids <- Split(data = simul$ydata, family = "binomial")
lapply(ids, FUN = function(x) {
  table(simul$ydata[x, ])
})