predict.gpb.Booster {gpboost} | R Documentation |
Prediction function for gpb.Booster objects
Description
Prediction function for gpb.Booster objects
Usage
## S3 method for class 'gpb.Booster'
predict(object, data, start_iteration = NULL,
num_iteration = NULL, pred_latent = FALSE, predleaf = FALSE,
predcontrib = FALSE, header = FALSE, reshape = FALSE,
group_data_pred = NULL, group_rand_coef_data_pred = NULL,
gp_coords_pred = NULL, gp_rand_coef_data_pred = NULL,
cluster_ids_pred = NULL, predict_cov_mat = FALSE, predict_var = FALSE,
cov_pars = NULL, ignore_gp_model = FALSE, rawscore = NULL,
vecchia_pred_type = NULL, num_neighbors_pred = NULL, ...)
Arguments
object |
Object of class |
data |
a |
start_iteration |
int or NULL, optional (default=NULL) Start index of the iteration to predict. If NULL or <= 0, starts from the first iteration. |
num_iteration |
int or NULL, optional (default=NULL) Limit number of iterations in the prediction. If NULL, if the best iteration exists and start_iteration is NULL or <= 0, the best iteration is used; otherwise, all iterations from start_iteration are used. If <= 0, all iterations from start_iteration are used (no limits). |
pred_latent |
If TRUE latent variables, both fixed effects (tree-ensemble)
and random effects ( |
predleaf |
Whether to predict the leaf index instead. |
predcontrib |
Whether to return per-feature contributions for each record. |
header |
Only used when predicting from a text file. TRUE if the text file has a header. |
reshape |
whether to reshape the vector of predictions to a matrix form when there are several prediction outputs per case. |
group_data_pred |
A |
group_rand_coef_data_pred |
A |
gp_coords_pred |
A |
gp_rand_coef_data_pred |
A |
cluster_ids_pred |
A |
predict_cov_mat |
A |
predict_var |
A |
cov_pars |
A |
ignore_gp_model |
A |
rawscore |
This is discontinued. Use the renamed equivalent argument
|
vecchia_pred_type |
A |
num_neighbors_pred |
an |
... |
Additional named arguments passed to the |
Value
Either a list of vectors or a single vector / matrix, depending on whether there is a gp_model or not.
If there is a gp_model, the result is a list that contains the following entries.
1. If pred_latent is TRUE, the list contains the following 3 entries:
- result$fixed_effect are the predictions from the tree-ensemble.
- result$random_effect_mean are the predicted means of the gp_model.
- result$random_effect_cov are the predicted covariances or variances of the gp_model (only if 'predict_var' or 'predict_cov_mat' is TRUE).
2. If pred_latent is FALSE, the list contains the following 2 entries:
- result$response_mean are the predicted means of the response variable (label), taking into account both the fixed effects (tree-ensemble) and the random effects (gp_model).
- result$response_var are the predicted covariances or variances of the response variable (only if 'predict_var' or 'predict_cov_mat' is TRUE).
If there is no gp_model, or if predcontrib or ignore_gp_model is TRUE, the result contains predictions from the tree-booster only.
Author(s)
Fabio Sigrist, authors of the LightGBM R package
Examples
# See https://github.com/fabsig/GPBoost/tree/master/R-package for more examples
library(gpboost)
data(GPBoost_data, package = "gpboost")
#--------------------Combine tree-boosting and grouped random effects model----------------
# Create random effects model
gp_model <- GPModel(group_data = group_data[,1], likelihood = "gaussian")
# The default optimizer for covariance parameters (hyperparameters) is
# Nesterov-accelerated gradient descent.
# This can be changed to, e.g., Nelder-Mead as follows:
# re_params <- list(optimizer_cov = "nelder_mead")
# gp_model$set_optim_params(params=re_params)
# Use trace = TRUE to monitor convergence:
# re_params <- list(trace = TRUE)
# gp_model$set_optim_params(params=re_params)
# Train model
bst <- gpboost(data = X, label = y, gp_model = gp_model, nrounds = 16,
               learning_rate = 0.05, max_depth = 6, min_data_in_leaf = 5,
               verbose = 0)
# Estimated random effects model
summary(gp_model)
# Make predictions
# Predict latent variables
pred <- predict(bst, data = X_test, group_data_pred = group_data_test[,1],
                predict_var = TRUE, pred_latent = TRUE)
pred$random_effect_mean # Predicted latent random effects mean
pred$random_effect_cov # Predicted random effects variances
pred$fixed_effect # Predicted fixed effects from tree ensemble
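# Illustrative addition (not from the original examples): predict with the
# tree ensemble only by ignoring the random effects model. As stated in the
# Value section, the result then contains predictions from the tree-booster
# only. The name pred_tree is a hypothetical variable for this sketch.
pred_tree <- predict(bst, data = X_test, ignore_gp_model = TRUE)
head(pred_tree) # Tree-booster predictions without random effects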
# Predict response variable
pred_resp <- predict(bst, data = X_test, group_data_pred = group_data_test[,1],
                     predict_var = TRUE, pred_latent = FALSE)
pred_resp$response_mean # Predicted response mean
# For Gaussian data: pred$random_effect_mean + pred$fixed_effect = pred_resp$response_mean
pred$random_effect_mean + pred$fixed_effect - pred_resp$response_mean
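# Illustrative addition (not from the original examples): because
# predict_var = TRUE was used above, pred_resp$response_var holds the predicted
# variances of the response. Assuming a Gaussian likelihood (as in this
# example), approximate 95% prediction intervals can be sketched from the mean
# and variance. The names pred_sd, pred_lower and pred_upper are hypothetical.
pred_sd <- sqrt(pred_resp$response_var)
pred_lower <- pred_resp$response_mean - qnorm(0.975) * pred_sd
pred_upper <- pred_resp$response_mean + qnorm(0.975) * pred_sd
head(cbind(lower = pred_lower, mean = pred_resp$response_mean, upper = pred_upper))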
#--------------------Combine tree-boosting and Gaussian process model----------------
# Create Gaussian process model
gp_model <- GPModel(gp_coords = coords, cov_function = "exponential",
                    likelihood = "gaussian")
# Train model
bst <- gpboost(data = X, label = y, gp_model = gp_model, nrounds = 8,
               learning_rate = 0.1, max_depth = 6, min_data_in_leaf = 5,
               verbose = 0)
# Estimated random effects model
summary(gp_model)
# Make predictions
pred <- predict(bst, data = X_test, gp_coords_pred = coords_test,
                predict_var = TRUE, pred_latent = TRUE)
pred$random_effect_mean # Predicted latent random effects mean
pred$random_effect_cov # Predicted random effects variances
pred$fixed_effect # Predicted fixed effects from tree ensemble
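# Illustrative addition (not from the original examples): num_iteration limits
# how many boosting iterations are used for prediction. The value 4 is an
# arbitrary choice for this sketch (the model above was trained with
# nrounds = 8), and pred_first4 is a hypothetical variable name.
pred_first4 <- predict(bst, data = X_test, gp_coords_pred = coords_test,
                       num_iteration = 4, pred_latent = TRUE)
head(pred_first4$fixed_effect) # Fixed effects from the first 4 iterations only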
# Predict response variable
pred_resp <- predict(bst, data = X_test, gp_coords_pred = coords_test,
                     predict_var = TRUE, pred_latent = FALSE)
pred_resp$response_mean # Predicted response mean
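# Illustrative addition (not from the original examples): request predictive
# covariances instead of variances with predict_cov_mat = TRUE. It is assumed
# here that random_effect_cov is then returned as a square matrix with one row
# and one column per prediction point. pred_cov is a hypothetical variable name.
pred_cov <- predict(bst, data = X_test, gp_coords_pred = coords_test,
                    predict_cov_mat = TRUE, pred_latent = TRUE)
dim(pred_cov$random_effect_cov) # Dimensions of the predicted covariance matrix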