| refmodel-init-get {projpred} | R Documentation |
Reference model and more general information
Description
Function get_refmodel() is a generic function whose methods usually call
init_refmodel() which is the underlying workhorse (and may also be used
directly without a call to get_refmodel()).
Both, get_refmodel() and init_refmodel(), create an object containing
information needed for the projection predictive variable selection, namely
about the reference model, the submodels, and how the projection should be
carried out. For the sake of simplicity, the documentation may refer to the
resulting object also as "reference model" or "reference model object", even
though it also contains information about the submodels and the projection.
A "typical" reference model object is created by get_refmodel.stanreg() and
brms::get_refmodel.brmsfit(), either implicitly by a call to a top-level
function such as project(), varsel(), and cv_varsel() or explicitly by
a call to get_refmodel(). All non-"typical" reference model objects will be
called "custom" reference model objects.
Some arguments are for K-fold cross-validation (K-fold CV) only;
see cv_varsel() for the use of K-fold CV in projpred.
Usage
get_refmodel(object, ...)
## S3 method for class 'refmodel'
get_refmodel(object, ...)
## S3 method for class 'vsel'
get_refmodel(object, ...)
## S3 method for class 'projection'
get_refmodel(object, ...)
## Default S3 method:
get_refmodel(object, family = NULL, ...)
## S3 method for class 'stanreg'
get_refmodel(object, latent = FALSE, dis = NULL, ...)
init_refmodel(
object,
data,
formula,
family,
ref_predfun = NULL,
div_minimizer = NULL,
proj_predfun = NULL,
extract_model_data = NULL,
cvfun = NULL,
cvfits = NULL,
dis = NULL,
cvrefbuilder = NULL,
called_from_cvrefbuilder = FALSE,
...
)
Arguments
object |
For |
... |
For |
family |
An object of class |
latent |
A single logical value indicating whether to use the latent
projection ( |
dis |
A vector of posterior draws for the reference model's dispersion
parameter or—more precisely—the posterior values for the reference
model's parameter-conditional predictive variance (assuming that this
variance is the same for all observations). May be |
data |
A |
formula |
The full formula to use for the search procedure. For custom
reference models, this does not necessarily coincide with the reference
model's formula. For general information about formulas in R, see
|
ref_predfun |
Prediction function for the linear predictor of the
reference model, including offsets (if existing). See also section
"Arguments |
div_minimizer |
A function for minimizing the Kullback-Leibler (KL)
divergence from the reference model to a submodel (i.e., for performing the
projection of the reference model onto a submodel). The output of
|
proj_predfun |
Prediction function for the linear predictor of a
submodel onto which the reference model is projected. See also section
"Arguments |
extract_model_data |
A function for fetching some variables (response,
observation weights, offsets) from the original dataset (supplied to
argument |
cvfun |
For |
cvfits |
For |
cvrefbuilder |
For |
called_from_cvrefbuilder |
A single logical value indicating whether
|
Value
An object that can be passed to all the functions that take the
reference model fit as the first argument, such as varsel(),
cv_varsel(), project(), proj_linpred(), and proj_predict().
Usually, the returned object is of class refmodel. However, if object
is NULL, the returned object is of class datafit as well as of class
refmodel (with datafit being first). Objects of class datafit are
handled differently at several places throughout this package.
The elements of the returned object are not meant to be accessed directly
but instead via downstream functions (see the functions mentioned above as
well as predict.refmodel()).
Formula terms
Although bad practice (in general), a reference model lacking an intercept can be used within projpred. However, it will always be projected onto submodels which include an intercept. The reason is that even if the true intercept in the reference model is zero, this does not need to hold for the submodels.
In multilevel (group-level) terms, function calls on the right-hand side of
the | character (e.g., (1 | gr(group_variable)), which is possible in
brms) are currently not allowed in projpred.
For additive models (still an experimental feature), only mgcv::s() and
mgcv::t2() are currently supported as smooth terms. Furthermore, these need
to be called without any arguments apart from the predictor names (symbols).
For example, for smoothing the effect of a predictor x, only s(x) or
t2(x) are allowed. As another example, for smoothing the joint effect of
two predictors x and z, only s(x, z) or t2(x, z) are allowed (and
analogously for higher-order joint effects, e.g., of three predictors). Note
that all smooth terms need to be included in formula (there is no random
argument as in rstanarm::stan_gamm4(), for example).
Arguments ref_predfun, proj_predfun, and div_minimizer
Arguments ref_predfun, proj_predfun, and div_minimizer may be NULL
for using an internal default (see projpred-package for the functions used
by the default divergence minimizers). Otherwise, let N denote the
number of observations (in case of CV, these may be reduced to each fold),
S_{\mathrm{ref}} the number of posterior draws for the reference
model's parameters, and S_{\mathrm{prj}} the number of draws for
the parameters of a submodel that the reference model has been projected onto
(short: the number of projected draws). For the augmented-data projection,
let C_{\mathrm{cat}} denote the number of response categories,
C_{\mathrm{lat}} the number of latent response categories (which
typically equals C_{\mathrm{cat}} - 1), and define
N_{\mathrm{augcat}} := N \cdot C_{\mathrm{cat}}
as well as N_{\mathrm{auglat}} := N \cdot C_{\mathrm{lat}}. Then the functions supplied to these arguments need to have the
following prototypes:
-
ref_predfun:ref_predfun(fit, newdata = NULL)where:-
fitaccepts the reference model fit as given in argumentobject(but possibly refitted to a subset of the observations, as done inK-fold CV). -
newdataaccepts eitherNULL(for using the original dataset, typically stored infit) or data for new observations (at least in the form of adata.frame).
-
-
proj_predfun:proj_predfun(fits, newdata)where:-
fitsaccepts alistof lengthS_{\mathrm{prj}}containing this number of submodel fits. Thislistis the same as that returned byproject()in its output elementoutdmin(which in turn is the same as the return value ofdiv_minimizer, except ifproject()was used with anobjectof classvselbased on an L1 search as well as withrefit_prj = FALSE). -
newdataaccepts data for new observations (at least in the form of adata.frame).
-
-
div_minimizerdoes not need to have a specific prototype, but it needs to be able to be called with the following arguments:-
formulaaccepts either a standardformulawith a single response (ifS_{\mathrm{prj}} = 1or in case of the augmented-data projection) or aformulawithS_{\mathrm{prj}} > 1response variablescbind()-ed on the left-hand side in which case the projection has to be performed for each of the response variables separately. -
dataaccepts adata.frameto be used for the projection. In case of the traditional or the latent projection, this dataset hasNrows. In case of the augmented-data projection, this dataset hasN_{\mathrm{augcat}}rows. -
familyaccepts an object of classfamily. -
weightsaccepts either observation weights (at least in the form of a numeric vector) orNULL(for using a vector of ones as weights). -
projpred_varaccepts anN \times S_{\mathrm{prj}}matrix of predictive variances (necessary for projpred's internal GLM fitter) in case of the traditional or the latent projection and anN_{\mathrm{augcat}} \times S_{\mathrm{prj}}matrix (containing onlyNAs) in case of the augmented-data projection. -
projpred_ws_augaccepts anN \times S_{\mathrm{prj}}matrix of expected values for the response in case of the traditional or the latent projection and anN_{\mathrm{augcat}} \times S_{\mathrm{prj}}matrix of probabilities for the response categories in case of the augmented-data projection. -
...accepts further arguments specified by the user (or by projpred).
-
The return value of these functions needs to be:
-
ref_predfun: for the traditional or the latent projection, anN \times S_{\mathrm{ref}}matrix; for the augmented-data projection, anS_{\mathrm{ref}} \times N \times C_{\mathrm{lat}}array (the only exception is the augmented-data projection for thebinomial()family in which caseref_predfunneeds to return anN \times S_{\mathrm{ref}}matrix just like for the traditional projection because the array is constructed by an internal wrapper function). -
proj_predfun: for the traditional or the latent projection, anN \times S_{\mathrm{prj}}matrix; for the augmented-data projection, anN \times C_{\mathrm{lat}} \times S_{\mathrm{prj}}array. -
div_minimizer: alistof lengthS_{\mathrm{prj}}containing this number of submodel fits.
Argument extract_model_data
The function supplied to argument extract_model_data needs to have the
prototype
extract_model_data(object, newdata, wrhs = NULL, orhs = NULL,
extract_y = TRUE)
where:
-
objectaccepts the reference model fit as given in argumentobject(but possibly refitted to a subset of the observations, as done inK-fold CV). -
newdataaccepts data for new observations (at least in the form of adata.frame). -
wrhsaccepts at least (i) a right-hand side formula consisting only of the variable innewdatacontaining the observation weights or (ii)NULLfor using the observation weights corresponding tonewdata(typically, the observation weights are stored in a column ofnewdata; if the model was fitted without observation weights, a vector of ones should be used). -
orhsaccepts at least (i) a right-hand side formula consisting only of the variable innewdatacontaining the offsets or (ii)NULLfor using the offsets corresponding tonewdata(typically, the offsets are stored in a column ofnewdata; if the model was fitted without offsets, a vector of zeros should be used). -
extract_yaccepts a single logical value indicating whether output elementy(see below) shall beNULL(TRUE) or not (FALSE).
The return value of extract_model_data needs to be a list with elements
y, weights, and offset, each being a numeric vector containing the data
for the response, the observation weights, and the offsets, respectively. An
exception is that y may also be NULL (depending on argument extract_y),
a non-numeric vector, or a factor.
The weights and offsets returned by extract_model_data will be assumed to
hold for the reference model as well as for the submodels.
Above, arguments wrhs and orhs were assumed to have defaults of NULL.
It should be possible to use defaults other than NULL, but we strongly
recommend to use NULL. If defaults other than NULL are used, they need to
imply the behaviors described at items "(ii)" (see the descriptions of wrhs
and orhs).
Augmented-data projection
If a custom reference model for an augmented-data projection is needed, see
also extend_family().
For the augmented-data projection, the response vector resulting from
extract_model_data is internally coerced to a factor (using
as.factor()). The levels of this factor have to be identical to
family$cats (after applying extend_family() internally; see
extend_family()'s argument augdat_y_unqs).
Note that response-specific offsets (i.e., one length-N offset vector
per response category) are not supported by projpred yet. So far, only
offsets which are the same across all response categories are supported. This
is why in case of the brms::categorical() family, offsets are currently not
supported at all.
Currently, object = NULL (i.e., a datafit; see section "Value") is not
supported in case of the augmented-data projection.
Latent projection
If a custom reference model for a latent projection is needed, see also
extend_family().
For the latent projection, family$cats (after applying extend_family()
internally; see extend_family()'s argument latent_y_unqs) currently must
not be NULL if the original (i.e., non-latent) response is a factor.
Conversely, if family$cats (after applying extend_family()) is
non-NULL, the response vector resulting from extract_model_data is
internally coerced to a factor (using as.factor()). The levels of this
factor have to be identical to that non-NULL element family$cats.
Currently, object = NULL (i.e., a datafit; see section "Value") is not
supported in case of the latent projection.
Examples
# Data:
dat_gauss <- data.frame(y = df_gaussian$y, df_gaussian$x)
# The `stanreg` fit which will be used as the reference model (with small
# values for `chains` and `iter`, but only for technical reasons in this
# example; this is not recommended in general):
fit <- rstanarm::stan_glm(
y ~ X1 + X2 + X3 + X4 + X5, family = gaussian(), data = dat_gauss,
QR = TRUE, chains = 2, iter = 500, refresh = 0, seed = 9876
)
# Define the reference model object explicitly:
ref <- get_refmodel(fit)
print(class(ref)) # gives `"refmodel"`
# Now see, for example, `?varsel`, `?cv_varsel`, and `?project` for
# possible post-processing functions. Most of the post-processing functions
# call get_refmodel() internally at the beginning, so you will rarely need
# to call get_refmodel() yourself.
# A custom reference model object which may be used in a variable selection
# where the candidate predictors are not a subset of those used for the
# reference model's predictions:
ref_cust <- init_refmodel(
fit,
data = dat_gauss,
formula = y ~ X6 + X7,
family = gaussian(),
cvfun = function(folds) {
kfold(
fit, K = max(folds), save_fits = TRUE, folds = folds, cores = 1
)$fits[, "fit"]
},
dis = as.matrix(fit)[, "sigma"],
cvrefbuilder = function(cvfit) {
init_refmodel(cvfit,
data = dat_gauss[-cvfit$omitted, , drop = FALSE],
formula = y ~ X6 + X7,
family = gaussian(),
dis = as.matrix(cvfit)[, "sigma"],
called_from_cvrefbuilder = TRUE)
}
)
# Now, the post-processing functions mentioned above (for example,
# varsel(), cv_varsel(), and project()) may be applied to `ref_cust`.