R: Value variables for predicting a categorical outcome.

value_variables_C {vtreat}

R Documentation

Value variables for predicting a categorical outcome.

Description

Value variables for predicting a categorical outcome.

Usage

value_variables_C(
  dframe,
  varlist,
  outcomename,
  outcometarget,
  ...,
  weights = c(),
  minFraction = 0.02,
  smFactor = 0,
  rareCount = 0,
  rareSig = 1,
  collarProb = 0,
  scale = FALSE,
  doCollar = FALSE,
  splitFunction = NULL,
  ncross = 3,
  forceSplit = FALSE,
  catScaling = TRUE,
  verbose = FALSE,
  parallelCluster = NULL,
  use_parallel = TRUE,
  customCoders = list(
    c.PiecewiseV.num = vtreat::solve_piecewisec,
    n.PiecewiseV.num = vtreat::solve_piecewise,
    c.knearest.num = vtreat::square_windowc,
    n.knearest.num = vtreat::square_window
  ),
  codeRestriction = c("PiecewiseV", "knearest", "clean", "isBAD", "catB", "catP"),
  missingness_imputation = NULL,
  imputation_map = NULL
)

Arguments

`dframe`	Data frame to learn treatments from (training data), must have at least 1 row.
`varlist`	Names of columns to treat (effective variables).
`outcomename`	Name of column holding outcome variable. dframe[[outcomename]] must contain only finite, non-missing values.
`outcometarget`	Value/level of outcome to be considered "success". dframe[[outcomename]]==outcometarget must hold for at least two rows, and dframe[[outcomename]]!=outcometarget must hold for at least two rows.
`...`	no additional arguments, declared to force named binding of later arguments
`weights`	optional training weights for each row
`minFraction`	optional minimum frequency a categorical level must have to be converted to an indicator column.
`smFactor`	optional smoothing factor for impact coding models.
`rareCount`	optional integer, allow levels with this count or below to be pooled into a shared rare-level. Defaults to 0 or off.
`rareSig`	optional numeric, suppress levels from pooling at this significance value or greater. Defaults to 1 (off).
`collarProb`	what fraction of the data (pseudo-probability) to collar data at if doCollar is set during `prepare.treatmentplan`.
`scale`	optional if TRUE replace numeric variables with regression ("move to outcome-scale").
`doCollar`	optional if TRUE collar numeric variables by cutting off after a tail-probability specified by collarProb during treatment design.
`splitFunction`	(optional) see vtreat::buildEvalSets .
`ncross`	optional scalar >= 2, number of cross-validation rounds to design.
`forceSplit`	logical, if TRUE force cross-validated significance calculations on all variables.
`catScaling`	optional, if TRUE use glm() linkspace, if FALSE use lm() for scaling.
`verbose`	if TRUE print progress.
`parallelCluster`	(optional) a cluster object created by package parallel or package snow.
`use_parallel`	logical, if TRUE use parallel methods.
`customCoders`	additional coders to use for variable importance estimate.
`codeRestriction`	codes to restrict to for variable importance estimate.
`missingness_imputation`	function of signature f(values: numeric, weights: numeric), simple missing value imputer.
`imputation_map`	map from column names to functions of signature f(values: numeric, weights: numeric), simple missing value imputers.
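
The f(values, weights) signature described for `missingness_imputation` and `imputation_map` can be illustrated with a small imputer. This is a minimal sketch, not part of vtreat: the name `weighted_median_imputer` is hypothetical, the NA filtering is defensive (it is an assumption that the inputs might contain missing entries), and the fallback value of 0 is an arbitrary choice.

```r
# Hypothetical helper (not part of vtreat): a weighted-median imputer
# with the documented signature f(values: numeric, weights: numeric).
weighted_median_imputer <- function(values, weights) {
  # Defensive: drop any positions where the value or weight is missing.
  keep <- !is.na(values) & !is.na(weights)
  values <- values[keep]
  weights <- weights[keep]
  # Arbitrary fallback (an assumption) when nothing usable remains.
  if (length(values) == 0 || sum(weights) <= 0) {
    return(0)
  }
  # Sort by value, then take the first value whose cumulative
  # weight share reaches one half.
  ord <- order(values)
  values <- values[ord]
  weights <- weights[ord]
  cum_share <- cumsum(weights) / sum(weights)
  values[which(cum_share >= 0.5)[1]]
}

# Standalone check of the helper on a small vector.
imputed <- weighted_median_imputer(c(1, 2, 100, NA), c(1, 1, 1, 1))
print(imputed)
```

Such a function could then be supplied as `missingness_imputation = weighted_median_imputer`, or per column through `imputation_map`, for example as a named list keyed by column name (the named-list form is an assumption based on the argument description).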

Value

table of variable valuations
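
Examples

The following is a minimal sketch of a call, assuming the vtreat package is installed. The data frame `d`, its column names, and the simulated outcome are made up for illustration; only the function name and argument names come from the Usage section above.

```r
library(vtreat)

set.seed(2020)
n <- 200

# Illustrative data: one informative numeric column, one pure-noise
# numeric column, and one uninformative categorical column.
d <- data.frame(
  x_signal = rnorm(n),
  x_noise = rnorm(n),
  x_cat = sample(c("a", "b", "c"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Binary outcome driven by x_signal plus noise.
d$y <- ifelse(d$x_signal + 0.5 * rnorm(n) > 0, "yes", "no")

# Value each candidate variable for predicting y == "yes".
valuations <- value_variables_C(
  dframe = d,
  varlist = c("x_signal", "x_noise", "x_cat"),
  outcomename = "y",
  outcometarget = "yes"
)

print(valuations)
```

In a setup like this one would expect `x_signal` to be valued more favorably than `x_noise`, though the exact figures depend on the random seed and the cross-validation splits.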

[Package vtreat version 1.6.5 Index]