q_model {polle}    R Documentation

q_model class object

Description

Use q_glm(), q_glmnet(), q_rf(), q_sl(), and q_xgboost() to construct an outcome regression model/Q-model object. The constructors are used as input for policy_eval() and policy_learn().
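
A Q-model constructor only specifies the model; the model is fitted later when policy_eval() or policy_learn() applies it to data. A minimal sketch (the class check is illustrative; the exact class attributes are an assumption):

qm <- q_glm(formula = ~ A * .)
inherits(qm, "q_model")   # expected TRUE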

Usage

q_glm(
  formula = ~A * .,
  family = gaussian(),
  model = FALSE,
  na.action = na.pass,
  ...
)

q_glmnet(
  formula = ~A * .,
  family = "gaussian",
  alpha = 1,
  s = "lambda.min",
  ...
)

q_rf(
  formula = ~.,
  num.trees = c(250, 500, 750),
  mtry = NULL,
  cv_args = list(nfolds = 3, rep = 1),
  ...
)

q_sl(
  formula = ~.,
  SL.library = c("SL.mean", "SL.glm"),
  env = as.environment("package:SuperLearner"),
  onlySL = TRUE,
  discreteSL = FALSE,
  ...
)

q_xgboost(
  formula = ~.,
  objective = "reg:squarederror",
  params = list(),
  nrounds,
  max_depth = 6,
  eta = 0.3,
  nthread = 1,
  cv_args = list(nfolds = 3, rep = 1)
)

Arguments

formula

An object of class formula specifying the design matrix for the outcome regression model/Q-model at the given stage. The action at the given stage is always denoted 'A'; see the examples below. Use get_history_names() to list the available history variable names.
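
For instance, the design matrix can be restricted to interactions between the action and selected history variables (a sketch using the covariate names Z, B, and L from the single-stage example below):

q_glm(formula = ~ A * (Z + L))
q_glmnet(formula = ~ A * (Z + B + L))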

family

(Only used by q_glm and q_glmnet) A description of the error distribution and link function to be used in the model; see the defaults in the usage section for the expected form (a family object for q_glm and a character string for q_glmnet).

model

(Only used by q_glm) If FALSE, the model frame is not saved.

na.action

(Only used by q_glm) A function which indicates what should happen when the data contain NAs, see na.pass.

...

Additional arguments passed to glm(), glmnet::glmnet(), ranger::ranger(), or SuperLearner::SuperLearner().

alpha

(Only used by q_glmnet) The elasticnet mixing parameter between 0 and 1. alpha equal to 1 is the lasso penalty, and alpha equal to 0 the ridge penalty.

s

(Only used by q_glmnet) Value(s) of the penalty parameter lambda at which predictions are required, see glmnet::predict.glmnet().
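
As a sketch, an elastic net Q-model with equal lasso/ridge mixing and predictions at the more regularized cross-validated lambda (both are standard glmnet choices; the values are illustrative):

q_glmnet(formula = ~ A * ., alpha = 0.5, s = "lambda.1se")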

num.trees

(Only used by q_rf) Number of trees.

mtry

(Only used by q_rf) Number of variables to possibly split at in each node.

cv_args

(Only used by q_rf and q_xgboost) Cross-validation parameters, used when multiple hyper-parameter values are given. nfolds is the number of folds and rep is the number of replications.
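
For example, a sketch of tuning the random forest over a small grid via the built-in cross-validation (the hyper-parameter values are illustrative only):

q_rf(num.trees = c(250, 500, 750),
     mtry = c(2, 4),
     cv_args = list(nfolds = 5, rep = 1))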

SL.library

(Only used by q_sl) Either a character vector of prediction algorithms or a list containing character vectors, see SuperLearner::SuperLearner.
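
A sketch of a richer super learner library combined with discrete selection (the listed wrappers ship with the SuperLearner package, but SL.glmnet additionally requires the glmnet package to be installed):

q_sl(SL.library = c("SL.mean", "SL.glm", "SL.glmnet"),
     discreteSL = TRUE)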

env

(Only used by q_sl) Environment containing the learner functions. Defaults to the SuperLearner package environment, as shown in the usage section.

onlySL

(Only used by q_sl) Logical. If TRUE, only saves and computes predictions for algorithms with non-zero coefficients in the super learner object.

discreteSL

(Only used by q_sl) If TRUE, selects the single algorithm with the lowest cross-validated risk (discrete super learner).

objective

(Only used by q_xgboost) Specifies the learning task and the corresponding learning objective, see xgboost::xgboost.

params

(Only used by q_xgboost) List of parameters, see xgboost::xgboost.

nrounds

(Only used by q_xgboost) Maximum number of boosting iterations.

max_depth

(Only used by q_xgboost) Maximum depth of a tree.

eta

(Only used by q_xgboost) Learning rate.

nthread

(Only used by q_xgboost) Number of threads.
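
As a sketch, candidate values for the boosting hyper-parameters can be combined with cv_args for tuning (supplying vectors here is an assumption based on the cv_args description above; the values are illustrative):

q_xgboost(nrounds = c(50, 100),
          max_depth = c(3, 6),
          eta = 0.1,
          cv_args = list(nfolds = 3, rep = 1))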

Details

q_glm() is a wrapper of glm() (generalized linear model).
q_glmnet() is a wrapper of glmnet::glmnet() (generalized linear model via penalized maximum likelihood).
q_rf() is a wrapper of ranger::ranger() (random forest). When multiple hyper-parameters are given, the model with the lowest cross-validation error is selected.
q_sl() is a wrapper of SuperLearner::SuperLearner() (ensemble model).
q_xgboost() is a wrapper of xgboost::xgboost() (extreme gradient boosting).

Value

A q_model object: a function with arguments 'AH' (combined action and history matrix) and 'V_res' (residual value/expected utility).
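
A hedged illustration of the returned object (argument names per the description above; exact internals may differ):

qm <- q_glm()
names(formals(qm))   # expected to include "AH" and "V_res"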

See Also

get_history_names(), get_q_functions().

Examples

library("polle")
### Single stage case
d1 <- sim_single_stage(5e2, seed=1)
pd1 <- policy_data(d1,
                   action="A",
                   covariates=list("Z", "B", "L"),
                   utility="U")
pd1

# available history variable names for the outcome regression:
get_history_names(pd1)

# evaluating the static policy a=1 using outcome
# regression based on the given Q-model:
pe1 <- policy_eval(type = "or",
                   policy_data = pd1,
                   policy = policy_def(1, name = "A=1"),
                   q_model = q_glm(formula = ~A*.))
pe1

# getting the fitted Q-function values
head(predict(get_q_functions(pe1), pd1))

### Two stages:
d2 <- sim_two_stage(5e2, seed=1)
pd2 <- policy_data(d2,
                  action = c("A_1", "A_2"),
                  covariates = list(L = c("L_1", "L_2"),
                                    C = c("C_1", "C_2")),
                  utility = c("U_1", "U_2", "U_3"))
pd2

# available full history variable names at each stage:
get_history_names(pd2, stage = 1)
get_history_names(pd2, stage = 2)

# evaluating the static policy a=1 using outcome
# regression based on a glm model for each stage:
pe2 <- policy_eval(type = "or",
            policy_data = pd2,
            policy = policy_def(1, reuse = TRUE, name = "A=1"),
            q_model = list(q_glm(~ A * L_1),
                           q_glm(~ A * (L_1 + L_2))),
            q_full_history = TRUE)
pe2

# getting the fitted Q-function values
head(predict(get_q_functions(pe2), pd2))
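
# additional sketch (assumes the ranger package is installed): evaluating
# the same static policy with a random forest Q-model shared across stages
if (requireNamespace("ranger", quietly = TRUE)) {
  pe2_rf <- policy_eval(type = "or",
                        policy_data = pd2,
                        policy = policy_def(1, reuse = TRUE, name = "A=1"),
                        q_model = q_rf())
  pe2_rf
}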
