multinomial_ml {ocf} | R Documentation |
Multinomial Machine Learning
Description
Estimates conditional choice probabilities for ordered non-numeric outcomes.
Usage
multinomial_ml(y = NULL, X = NULL, learner = "forest", scale = TRUE)
Arguments
y |
Outcome vector. |
X |
Covariate matrix (no intercept). |
learner |
String, either "forest" or "l1". Selects the base learner used to estimate each expectation. |
scale |
Logical, whether to scale the covariates. Ignored if learner is not "l1". |
Details
Multinomial machine learning expresses conditional choice probabilities as expectations of binary variables:
p_m \left( X_i \right) = \mathbb{E} \left[ 1 \left( Y_i = m \right) | X_i \right]
Each expectation can therefore be estimated separately with any regression algorithm, which yields an estimate of the conditional probabilities.
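As a rough illustration of this idea, the sketch below fits one binary regression per class on simulated data and stacks the fitted values into a probability matrix. It is not the package's implementation: a base R logistic regression stands in for the forest or L1-penalized learner, the data and all object names are hypothetical, and the final row normalization is an assumption of the sketch rather than something stated on this page.

```r
## Illustrative sketch only: one binary regression per class, using base R.
set.seed(1986)
n <- 300
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

## Simulated three-class outcome (hypothetical data).
latent <- X$x1 + 0.5 * X$x2 + rnorm(n)
y <- cut(latent, breaks = c(-Inf, -0.5, 0.5, Inf), labels = FALSE)
classes <- sort(unique(y))

## Estimate p_m(X) = E[1(Y = m) | X] separately for each class m.
## A logistic regression stands in for the learner here (an assumption).
fits <- lapply(classes, function(m) {
  glm(as.numeric(y == m) ~ x1 + x2, data = X, family = binomial())
})

## Stack the class-specific predictions into an n x M matrix.
probs <- sapply(fits, function(fit) predict(fit, newdata = X, type = "response"))
colnames(probs) <- paste0("P(Y=", classes, ")")

## Separately fitted regressions need not sum to one across classes,
## so this sketch rescales each row (an assumption, not documented behavior).
probs <- probs / rowSums(probs)
head(probs)
```

Any regression method that returns fitted conditional means could replace glm in the loop; only the binary recoding of the outcome is essential to the strategy.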
multinomial_ml combines this strategy with either regression forests or penalized logistic regression with an L1 penalty, according to the user-specified parameter learner.
If learner == "l1", the penalty parameters are chosen via 10-fold cross-validation and model.matrix is used to handle non-numeric covariates. Additionally, if scale == TRUE, the covariates are scaled to have zero mean and unit variance.
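The preprocessing described for the "l1" learner can be pictured with base R alone. The sketch below expands a factor into dummy columns with model.matrix and then standardizes the result with scale. The data frame and object names are hypothetical, and how the package calls these functions internally is not stated on this page, so treat this as an approximation of the idea rather than the actual code path.

```r
## Hypothetical covariates: one numeric column and one factor.
covariates <- data.frame(
  age = c(23, 35, 47, 51, 62),
  region = factor(c("north", "south", "south", "east", "north"))
)

## Expand the factor into dummy columns and drop the intercept column,
## since the covariate matrix is expected to have no intercept.
X_design <- model.matrix(~ ., data = covariates)[, -1, drop = FALSE]

## Standardize each column to zero mean and unit variance.
X_scaled <- scale(X_design)

round(colMeans(X_scaled), 10)   # column means after centering
apply(X_scaled, 2, sd)          # column standard deviations after scaling
```

Scaling puts all columns on a comparable footing before an L1 penalty is applied, which is presumably why the option matters only for that learner.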
Value
Object of class mml.
Author(s)
Riccardo Di Francesco
See Also
Examples
## Load data from orf package.
set.seed(1986)
library(orf)
data(odata)
odata <- odata[1:100, ] # Subset to reduce elapsed time.
y <- as.numeric(odata[, 1])
X <- as.matrix(odata[, -1])
## Training-test split.
train_idx <- sample(seq_along(y), floor(length(y) * 0.5))
y_tr <- y[train_idx]
X_tr <- X[train_idx, ]
y_test <- y[-train_idx]
X_test <- X[-train_idx, ]
## Fit multinomial machine learning on training sample using two different learners.
multinomial_forest <- multinomial_ml(y_tr, X_tr, learner = "forest")
multinomial_l1 <- multinomial_ml(y_tr, X_tr, learner = "l1")
## Predict out of sample.
predictions_forest <- predict(multinomial_forest, X_test)
predictions_l1 <- predict(multinomial_l1, X_test)
## Compare predictions.
cbind(head(predictions_forest), head(predictions_l1))