split_pilot_set {stratamatch} | R Documentation |
Split data into pilot and analysis sets
Description
Given a data set and some parameters about how to split the data, this
function partitions the data accordingly and returns the partitioned data as
a list containing the analysis_set and pilot_set.
Usage
split_pilot_set(
  data,
  treat,
  pilot_fraction = 0.1,
  pilot_size = NULL,
  group_by_covariates = NULL
)
Arguments
data |
data frame to be split into pilot and analysis sets |
treat |
string giving the name of column designating treatment assignment |
pilot_fraction |
numeric between 0 and 1 giving the proportion of controls to be allotted for building the prognostic score (default = 0.1) |
pilot_size |
alternative to pilot_fraction. Approximate number of observations to be used in the pilot set. Note that the actual pilot set size returned may not be exactly pilot_size. |
group_by_covariates |
character vector giving the names of covariates to be grouped by (optional). If specified, the pilot set will be sampled in a stratified manner, so that the composition of the pilot set reflects the composition of the whole data set in terms of these covariates. The specified covariates must be categorical. |
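The stratified sampling described for group_by_covariates can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, the column name site, and the rounding rule are assumptions made for illustration only.

```r
# Illustrative sketch only: hypothetical toy data, not taken from stratamatch
set.seed(1)
toy <- data.frame(
  treat = rbinom(200, 1, 0.3),
  site  = sample(c("a", "b", "c"), 200, replace = TRUE)
)
pilot_fraction <- 0.1

# Row indices of the controls, grouped by the categorical covariate
control_rows <- which(toy$treat == 0)
by_group <- split(control_rows, toy$site[control_rows])

# Within each group, draw roughly pilot_fraction of the controls
pilot_rows <- unlist(lapply(by_group, function(rows) {
  n_draw <- round(pilot_fraction * length(rows))
  rows[sample.int(length(rows), size = n_draw)]
}), use.names = FALSE)

# Pilot set: the sampled controls; analysis set: every remaining row
pilot_set    <- toy[pilot_rows, ]
analysis_set <- toy[setdiff(seq_len(nrow(toy)), pilot_rows), ]
```

Because the draw is made separately within each group, the per-group counts are rounded, so in this sketch the total pilot set size can differ slightly from the overall target.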
Value
a list with analysis_set and pilot_set
Examples
dat <- make_sample_data()
splt <- split_pilot_set(dat, "treat", 0.2)
# can be passed into auto_stratify if desired
a.strat <- auto_stratify(splt$analysis_set, "treat", outcome ~ X1,
  pilot_sample = splt$pilot_set
)
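The optional arguments pilot_size and group_by_covariates can be used in place of pilot_fraction. The following is a hedged sketch: it assumes that the data returned by make_sample_data() includes a categorical column named C1, which is not stated on this page, and the pilot size of 10 is an arbitrary illustrative value.

```r
library(stratamatch)

dat <- make_sample_data()

# Request a pilot set of roughly 10 observations, sampled so that its
# composition reflects the data with respect to C1.
# Assumption: C1 is a categorical column in the sample data.
splt_grouped <- split_pilot_set(dat, "treat",
  pilot_size = 10,
  group_by_covariates = "C1"
)

# Inspect the size and composition of the resulting pilot set;
# the size may not match the request exactly
nrow(splt_grouped$pilot_set)
table(splt_grouped$pilot_set$C1)
```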