value_variables_C {vtreat} | R Documentation |
Value variables for predicting a categorical outcome.
Description
Value variables for predicting a categorical outcome.
Usage
value_variables_C(
dframe,
varlist,
outcomename,
outcometarget,
...,
weights = c(),
minFraction = 0.02,
smFactor = 0,
rareCount = 0,
rareSig = 1,
collarProb = 0,
scale = FALSE,
doCollar = FALSE,
splitFunction = NULL,
ncross = 3,
forceSplit = FALSE,
catScaling = TRUE,
verbose = FALSE,
parallelCluster = NULL,
use_parallel = TRUE,
customCoders = list(c.PiecewiseV.num = vtreat::solve_piecewisec,
n.PiecewiseV.num = vtreat::solve_piecewise,
c.knearest.num = vtreat::square_windowc,
n.knearest.num = vtreat::square_window),
codeRestriction = c("PiecewiseV", "knearest", "clean", "isBAD", "catB", "catP"),
missingness_imputation = NULL,
imputation_map = NULL
)
Arguments
dframe |
Data frame to learn treatments from (training data), must have at least 1 row. |
varlist |
Names of columns to treat (effective variables). |
outcomename |
Name of column holding outcome variable. dframe[[outcomename]] must contain only finite, non-missing values. |
outcometarget |
Value/level of outcome to be considered "success". dframe[[outcomename]]==outcometarget must hold for at least two rows, and dframe[[outcomename]]!=outcometarget must hold for at least two rows. |
... |
no additional arguments, declared to force named binding of later arguments |
weights |
optional training weights for each row |
minFraction |
optional minimum frequency a categorical level must have to be converted to an indicator column. |
smFactor |
optional smoothing factor for impact coding models. |
rareCount |
optional integer, allow levels with this count or below to be pooled into a shared rare-level. Defaults to 0 or off. |
rareSig |
optional numeric, suppress levels from pooling at this significance value or greater. Defaults to 1 (off). |
collarProb |
fraction of the data (pseudo-probability) at which to collar numeric values when doCollar is set. |
scale |
optional, if TRUE replace numeric variables with regression ("move to outcome-scale"). |
doCollar |
optional, if TRUE collar numeric variables by cutting off after a tail-probability specified by collarProb during treatment design. |
splitFunction |
(optional) see vtreat::buildEvalSets . |
ncross |
optional scalar>=2 number of cross-validation rounds to design. |
forceSplit |
logical, if TRUE force cross-validated significance calculations on all variables. |
catScaling |
optional, if TRUE use glm() linkspace, if FALSE use lm() for scaling. |
verbose |
if TRUE print progress. |
parallelCluster |
(optional) a cluster object created by package parallel or package snow. |
use_parallel |
logical, if TRUE use parallel methods. |
customCoders |
additional coders to use for variable importance estimate. |
codeRestriction |
codes to restrict to for variable importance estimate. |
missingness_imputation |
function of signature f(values: numeric, weights: numeric), simple missing value imputer. |
imputation_map |
map from column names to functions of signature f(values: numeric, weights: numeric), simple missing value imputers. |
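The f(values, weights) signature described for missingness_imputation and imputation_map can be illustrated with a small base R function. This is a minimal sketch, not part of vtreat: the name weighted_median_imputer is hypothetical, and the defensive handling of NA inputs and mismatched weights is an assumption, since this page does not say exactly what vtreat passes in.

```r
# Hypothetical imputer with the documented signature f(values, weights).
# Returns the weighted median of the observed values as the fill-in value.
weighted_median_imputer <- function(values, weights) {
  # Assumption: fall back to unit weights if weights are absent or mismatched.
  if (length(weights) != length(values)) {
    weights <- rep(1, length(values))
  }
  # Assumption: drop NA entries defensively before computing the statistic.
  keep <- !is.na(values) & !is.na(weights)
  values <- values[keep]
  weights <- weights[keep]
  if (length(values) == 0 || sum(weights) <= 0) {
    return(NA_real_)
  }
  # Sort by value, then pick the first value whose cumulative weight share
  # reaches one half.
  ord <- order(values)
  values <- values[ord]
  weights <- weights[ord]
  cum_share <- cumsum(weights) / sum(weights)
  values[which(cum_share >= 0.5)[1]]
}
```

Such a function could then be supplied as missingness_imputation = weighted_median_imputer, or per column through a named list such as imputation_map = list(z = weighted_median_imputer), where z stands for a column name in dframe.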
Value
table of variable valuations
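Examples
The call below is a minimal sketch of how the arguments above fit together, assuming the vtreat package is installed. The data frame d and its column names (x_signal, x_noise, x_cat, y) are synthetic and invented for illustration; they are not taken from the package documentation.

```r
library(vtreat)

# Synthetic training data (illustrative only): one informative numeric
# column, one pure-noise numeric column, and one categorical column.
set.seed(2020)
n <- 200
d <- data.frame(
  x_signal = rnorm(n),
  x_noise = rnorm(n),
  x_cat = sample(c("a", "b", "c"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
d$y <- ifelse(d$x_signal + 0.5 * (d$x_cat == "a") + rnorm(n) > 0, "yes", "no")

# Value each candidate variable for predicting y == "yes".
# Arguments after outcometarget must be passed by name.
valuations <- value_variables_C(
  d,
  varlist = c("x_signal", "x_noise", "x_cat"),
  outcomename = "y",
  outcometarget = "yes",
  use_parallel = FALSE
)

print(valuations)
```

Setting use_parallel = FALSE keeps the sketch single-process; the remaining arguments are left at the defaults listed under Usage.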