dindexm {isodistrreg} | R Documentation
Distributional index model (DIM)
Description
Fits a distributional index model with a user-specified index function to a training dataset. See the examples at the bottom to learn how to specify a distributional single index model.
Usage
dindexm(
  formula,
  indexfit,
  data,
  response,
  pars = osqpSettings(verbose = FALSE, eps_abs = 1e-05, eps_rel = 1e-05,
    max_iter = 10000L),
  progress = TRUE,
  ...
)
Arguments
formula | object of class formula.
indexfit | function that fits the index model to training data. Should accept arguments formula and data.
data | data frame containing the covariates and the response variable.
response | name of the response variable in data.
pars | parameters for quadratic programming optimization (only relevant for multivariate index functions), set using osqpSettings.
progress | display progress bar for fitting idr?
... | further arguments passed to indexfit.
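As a rough illustration of what a user-specified indexfit could look like, the sketch below wraps a Poisson glm. It assumes that dindexm calls indexfit with formula and data (as lm is called in the Examples) and that the fitted object supports predict; the wrapper name poissonIndex and the toy data are hypothetical and not part of the package.

```r
# Hypothetical index-fitting function: a Poisson GLM with the same
# (formula, data, ...) calling convention as lm.
poissonIndex <- function(formula, data, ...) {
  glm(formula, data = data, family = poisson(), ...)
}

# Toy count data, for illustration only.
set.seed(42)
toy <- data.frame(x = runif(200))
toy$y <- rpois(200, lambda = exp(1 + toy$x))

# The wrapper returns a fitted model; predict() then gives the index values
# (here the linear predictor of the GLM).
fit <- poissonIndex(y ~ x, data = toy)
index <- predict(fit, newdata = toy)
```

Under these assumptions, such a wrapper would be passed as indexfit = poissonIndex in place of lm.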
Details
This function fits a distributional index model (DIM) to training data. The
DIM assumes that the response is more likely to attain higher values when the
value of the index function increases. The index function can be
estimated by parametric methods like lm or glm, or nonparametrically.
The formal mathematical assumption of the DIM is that the conditional CDFs
F_{y | g(X) = g(x)}(z) at each fixed threshold z decrease as g(x) increases.
Here y denotes the response, x and X are the covariates in data, and g is the
index function estimated by indexfit.
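Written out, the monotonicity assumption stated above reads as follows, where u and u' are two values of the index g(x); these two symbols are introduced here only for illustration.

```latex
\[
  u \le u'
  \quad \Longrightarrow \quad
  F_{y \mid g(X) = u}(z) \;\ge\; F_{y \mid g(X) = u'}(z)
  \quad \text{for all thresholds } z .
\]
```

In other words, the conditional distribution of y given g(X) = u' is stochastically at least as large as the one given g(X) = u.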
Estimation is performed in two steps: indexfit is applied to data to estimate
the function g. With this estimate, idr is applied with the pseudo-covariates
g(x) and response y.
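The two steps can be mimicked in base R for a single threshold. This is only a sketch of the idea, not the package's implementation: the second step uses stats::isoreg on an indicator variable as a stand-in for idr, and the simulated data, the threshold z = 1, and all variable names are assumptions made for illustration.

```r
set.seed(1)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rnorm(n, mean = 1 - dat$x1 + dat$x2)

# Step 1: estimate the index function g with a parametric model (here lm)
# and evaluate it on the training data to get the pseudo-covariate g(x).
indexModel <- lm(y ~ x1 + x2, data = dat)
index <- predict(indexModel, newdata = dat)

# Step 2 (stand-in for idr, one threshold only): regress the indicator
# 1{y > z} isotonically on the index. One minus the fitted values estimates
# the conditional CDF at z, which is then non-increasing in the index.
z <- 1
ord <- order(index)
exceed <- as.numeric(dat$y[ord] > z)
cdfAtZ <- 1 - isoreg(index[ord], exceed)$yf

# Estimated P(y <= z | g(x)) at the smallest and the largest index value.
range(cdfAtZ)
```

The package's idr step produces full conditional distributions rather than a single threshold; the sketch only shows how the monotonicity in g(x) enters the fit.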
Value
Object of class dindexm: A list containing the index model (first
component) and the IDR fit on the pseudo-data with the index as covariate
(second component).
References
Henzi, A., Kleger, G. R., & Ziegel, J. F. (2020). Distributional (Single) Index Models. arXiv preprint arXiv:2006.09219.
See Also
idr for more information on IDR, predict.dindexfit for (out-of-sample)
predictions based on a model fitted with dindexm.
Examples
n <- 1000
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- rnorm(n, 1 - X[, 1] + X[, 2]^2 / 3 - (1 - X[, 3]) * (1 + X[, 3]) / 2)
data <- cbind(y = y, as.data.frame(X))
## data for out-of-sample prediction
newX <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
## linear regression model for index
model <- dindexm(
  formula = y ~ poly(x1, degree = 2) + poly(x2, degree = 2) +
    poly(x3, degree = 2),
  indexfit = lm,
  response = "y",
  data = data
)
pred <- predict(model, data = newX)
## plot
plot(pred, 1, main = "LM based DIM")
grd <- pred[[1]]$points
trueCdf <- pnorm(
  grd,
  1 - newX[1, 1] + newX[1, 2]^2 / 3 - (1 - newX[1, 3]) * (1 + newX[1, 3]) / 2
)
points(grd, trueCdf, type = "l", col = 2)