q_model {polle}    R Documentation

q_model class object

Description

Use q_glm(), q_glmnet(), q_rf(), q_sl(), and q_xgboost() to construct an outcome regression model/Q-model object. The constructors are used as input for policy_eval() and policy_learn().
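
A Q-model constructor only specifies the model; the model is fitted later when policy_eval() or policy_learn() applies it to data. A minimal sketch (the class check is illustrative; the exact class attributes are an assumption):

qm <- q_glm(formula = ~ A * .)
inherits(qm, "q_model")   # expected TRUE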

Usage

q_glm(
  formula = ~A * .,
  family = gaussian(),
  model = FALSE,
  na.action = na.pass,
  ...
)

q_glmnet(
  formula = ~A * .,
  family = "gaussian",
  alpha = 1,
  s = "lambda.min",
  ...
)

q_rf(
  formula = ~.,
  num.trees = c(250, 500, 750),
  mtry = NULL,
  cv_args = list(nfolds = 3, rep = 1),
  ...
)

q_sl(
  formula = ~.,
  SL.library = c("SL.mean", "SL.glm"),
  env = as.environment("package:SuperLearner"),
  onlySL = TRUE,
  discreteSL = FALSE,
  ...
)

q_xgboost(
  formula = ~.,
  objective = "reg:squarederror",
  params = list(),
  nrounds,
  max_depth = 6,
  eta = 0.3,
  nthread = 1,
  cv_args = list(nfolds = 3, rep = 1)
)

Arguments

formula

An object of class formula specifying the design matrix for the outcome regression model/Q-model at the given stage. The action at the given stage is always denoted 'A'; see the examples below. Use get_history_names() to list the available history variable names.
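
For instance, the design matrix can be restricted to interactions between the action and selected history variables (a sketch using the covariate names Z, B, and L from the single-stage example below):

q_glm(formula = ~ A * (Z + L))
q_glmnet(formula = ~ A * (Z + B + L))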

family

(Only used by q_glm and q_glmnet) A description of the error distribution and link function to be used in the model; see the defaults in the usage section for the expected form (a family object for q_glm and a character string for q_glmnet).

model

(Only used by q_glm) If FALSE, the model frame is not saved.

na.action

(Only used by q_glm) A function which indicates what should happen when the data contain NAs, see na.pass.

...

Additional arguments passed to glm(), glmnet::glmnet(), ranger::ranger(), or SuperLearner::SuperLearner().

alpha

(Only used by q_glmnet) The elasticnet mixing parameter between 0 and 1. alpha equal to 1 is the lasso penalty, and alpha equal to 0 the ridge penalty.

s

(Only used by q_glmnet) Value(s) of the penalty parameter lambda at which predictions are required, see glmnet::predict.glmnet().
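
As a sketch, an elastic net Q-model with equal lasso/ridge mixing and predictions at the more regularized cross-validated lambda (both are standard glmnet choices; the values are illustrative):

q_glmnet(formula = ~ A * ., alpha = 0.5, s = "lambda.1se")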

num.trees

(Only used by q_rf) Number of trees.

mtry

(Only used by q_rf) Number of variables to possibly split at in each node.

cv_args

(Only used by q_rf and q_xgboost) Cross-validation parameters, used when multiple hyper-parameter values are given. nfolds is the number of folds and rep is the number of replications.
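
For example, a sketch of tuning the random forest over a small grid via the built-in cross-validation (the hyper-parameter values are illustrative only):

q_rf(num.trees = c(250, 500, 750),
     mtry = c(2, 4),
     cv_args = list(nfolds = 5, rep = 1))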

SL.library

(Only used by q_sl) Either a character vector of prediction algorithms or a list containing character vectors, see SuperLearner::SuperLearner.
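
A sketch of a richer super learner library combined with discrete selection (the listed wrappers ship with the SuperLearner package, but SL.glmnet additionally requires the glmnet package to be installed):

q_sl(SL.library = c("SL.mean", "SL.glm", "SL.glmnet"),
     discreteSL = TRUE)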

env

(Only used by q_sl) Environment containing the learner functions. Defaults to the SuperLearner package environment, as shown in the usage section.

onlySL

(Only used by q_sl) Logical. If TRUE, only saves and computes predictions for algorithms with non-zero coefficients in the super learner object.

discreteSL

(Only used by q_sl) If TRUE, selects the single algorithm with the lowest cross-validated risk (discrete super learner).

objective

(Only used by q_xgboost) Specifies the learning task and the corresponding learning objective, see xgboost::xgboost.

params

(Only used by q_xgboost) List of parameters, see xgboost::xgboost.

nrounds

(Only used by q_xgboost) Maximum number of boosting iterations.

max_depth

(Only used by q_xgboost) Maximum depth of a tree.

eta

(Only used by q_xgboost) Learning rate.

nthread

(Only used by q_xgboost) Number of threads.
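
As a sketch, candidate values for the boosting hyper-parameters can be combined with cv_args for tuning (supplying vectors here is an assumption based on the cv_args description above; the values are illustrative):

q_xgboost(nrounds = c(50, 100),
          max_depth = c(3, 6),
          eta = 0.1,
          cv_args = list(nfolds = 3, rep = 1))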

Details

q_glm() is a wrapper of glm() (generalized linear model).
q_glmnet() is a wrapper of glmnet::glmnet() (generalized linear model via penalized maximum likelihood).
q_rf() is a wrapper of ranger::ranger() (random forest). When multiple hyper-parameters are given, the model with the lowest cross-validation error is selected.
q_sl() is a wrapper of SuperLearner::SuperLearner() (ensemble model).
q_xgboost() is a wrapper of xgboost::xgboost() (extreme gradient boosting).

Value

A q_model object: a function with arguments 'AH' (combined action and history matrix) and 'V_res' (residual value/expected utility).
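
A hedged illustration of the returned object (argument names per the description above; exact internals may differ):

qm <- q_glm()
names(formals(qm))   # expected to include "AH" and "V_res"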

See Also

get_history_names(), get_q_functions().

Examples

library("polle")
### Single stage case
d1 <- sim_single_stage(5e2, seed=1)
pd1 <- policy_data(d1,
                   action="A",
                   covariates=list("Z", "B", "L"),
                   utility="U")
pd1

# available history variable names for the outcome regression:
get_history_names(pd1)

# evaluating the static policy a=1 using outcome
# regression based on the given Q-model:
pe1 <- policy_eval(type = "or",
                   policy_data = pd1,
                   policy = policy_def(1, name = "A=1"),
                   q_model = q_glm(formula = ~A*.))
pe1

# getting the fitted Q-function values
head(predict(get_q_functions(pe1), pd1))

### Two stages:
d2 <- sim_two_stage(5e2, seed=1)
pd2 <- policy_data(d2,
                  action = c("A_1", "A_2"),
                  covariates = list(L = c("L_1", "L_2"),
                                    C = c("C_1", "C_2")),
                  utility = c("U_1", "U_2", "U_3"))
pd2

# available full history variable names at each stage:
get_history_names(pd2, stage = 1)
get_history_names(pd2, stage = 2)

# evaluating the static policy a=1 using outcome
# regression based on a glm model for each stage:
pe2 <- policy_eval(type = "or",
            policy_data = pd2,
            policy = policy_def(1, reuse = TRUE, name = "A=1"),
            q_model = list(q_glm(~ A * L_1),
                           q_glm(~ A * (L_1 + L_2))),
            q_full_history = TRUE)
pe2

# getting the fitted Q-function values
head(predict(get_q_functions(pe2), pd2))
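
# additional sketch (assumes the ranger package is installed): evaluating
# the same static policy with a random forest Q-model shared across stages
if (requireNamespace("ranger", quietly = TRUE)) {
  pe2_rf <- policy_eval(type = "or",
                        policy_data = pd2,
                        policy = policy_def(1, reuse = TRUE, name = "A=1"),
                        q_model = q_rf())
  pe2_rf
}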
