cgaim {cgaim} | R Documentation |
Constrained groupwise additive index models
Description
Fits constrained groupwise additive index models (CGAIM) to data. A CGAIM fits indices subject to constraints on their coefficients and on the shape of their association with the outcome. Such constraints can be specified in the formula through g for grouped terms and s for smooth covariates.
Usage
cgaim(formula, data, weights, subset, na.action, Cmat = NULL, bvec = NULL,
  control = list())
Arguments
formula |
A CGAIM formula with index terms |
data |
A data.frame containing the variables of the model. |
weights |
An optional vector of observation weights. |
subset |
An optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
A function indicating how to treat NAs. The default is set by the |
Cmat |
A constraint matrix for index coefficients alpha. Columns must match all variables entering any index through |
bvec |
A vector of lower bounds for the constraints in |
control |
A list of parameters controlling the fitting process. See |
Details
The CGAIM is expressed as
y_{i} = \beta_{0} + \sum_{j} \beta_{j} g_{j}(\alpha_{j}^{T} x_{ij}) + \sum_{k} \gamma_{k} f_{k}(w_{ik}) + \sum_{l} \theta_{l} u_{il} + e_{i}
where the x_{ij} are variables entering grouped indices, the w_{ik} are smooth covariates and the u_{il} are linear covariates.
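As an illustration of how the terms of this equation combine, the following base R sketch simulates data with one grouped index, one smooth covariate and one linear covariate. All parameter values and the functions g and f below are assumptions chosen for the example, not package defaults.

```r
set.seed(1)
n <- 200

# Two variables entering one grouped index, one smooth and one linear covariate
x <- matrix(rnorm(2 * n), n, 2)
w <- runif(n)
u <- rnorm(n)

# Illustrative values (assumptions made for this sketch)
alpha <- c(0.8, 0.6)               # index coefficients alpha_1
g <- function(z) plogis(4 * z)     # ridge function g_1
f <- function(w) sin(2 * pi * w)   # smooth function f_1
beta0 <- 1; beta1 <- 4; gamma1 <- 0.5; theta1 <- -0.3

index <- drop(x %*% alpha)         # alpha_1^T x_{i1}
mu <- beta0 + beta1 * g(index) + gamma1 * f(w) + theta1 * u
y <- mu + rnorm(n, sd = 0.5)       # add the error term e_i

sim <- data.frame(y, x1 = x[, 1], x2 = x[, 2], w, u)
```

Data of this form would presumably be described by a formula such as y ~ g(x1, x2) + s(w) + u, following the g, s and linear-term conventions of the formula interface.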
In the formula interface, g identifies index terms and s identifies smooth functions; linear terms can be included as usual. All smooth terms can be shape constrained.
The CGAIM allows for linear constraints on the alpha coefficients. Such constraints can be specified through the g interface in the formula, or through alpha.control$Cmat. The g interface is used for constraints meant for a specific index only. In this case, common constraints can easily be specified through the acons argument (see build_constraints). Alternatively, more general constraints can be specified by passing a matrix to the Cmat argument. Constraints encompassing several indices can be specified through an element Cmat in alpha.control. Its number of columns must match the total number of index coefficients alpha to estimate. In all cases, the bvec arguments are used to specify the bounds of the constraints.
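The following base R sketch shows what a constraint matrix spanning two indices could look like. It assumes constraints of the form Cmat %*% alpha >= bvec, which is consistent with bvec being described as a vector of lower bounds. The specific constraints and the helper is_feasible are hypothetical and are not part of the package.

```r
# Index 1 has coefficients (a1, a2); index 2 has coefficients (a3, a4)
# Hypothetical constraint on index 1: a1 >= a2, i.e. a1 - a2 >= 0
C1 <- matrix(c(1, -1), nrow = 1)
# Hypothetical constraints on index 2: a3 >= 0 and a4 >= 0
C2 <- diag(2)

# Block-diagonal assembly: one column per index coefficient
Cmat <- rbind(cbind(C1, matrix(0, nrow(C1), ncol(C2))),
              cbind(matrix(0, nrow(C2), ncol(C1)), C2))
bvec <- rep(0, nrow(Cmat))  # lower bound of each constraint

# Hypothetical helper: does alpha satisfy every constraint?
is_feasible <- function(alpha, Cmat, bvec, tol = 1e-8) {
  all(drop(Cmat %*% alpha) >= bvec - tol)
}

is_feasible(c(0.7, 0.3, 0.5, 0.5), Cmat, bvec)  # TRUE
is_feasible(c(0.3, 0.7, 0.5, 0.5), Cmat, bvec)  # FALSE, since a1 < a2
```

The matrix has four columns, one per index coefficient, matching the requirement that the number of columns equals the total number of alpha coefficients to estimate.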
Both indices (g) and smooth covariate terms (s) allow shape constraints. See the dedicated help pages for the list of allowed constraints.
The CGAIM is fitted through an iterative algorithm that alternates between estimating the ridge functions g_{j} (and other non-index terms) and updating the coefficients \alpha_{j}. The smoothing of ridge functions currently supports three methods: scam (the default), cgam and scar. The list smooth.control controls the smoothing, with allowed parameters defined in cgaim.control.
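To give an intuition for the alternating scheme, here is a minimal base R sketch for a single unconstrained index. It is not the package's implementation: it swaps in stats::smooth.spline for the smoothing step instead of scam, cgam or scar, uses a plain Gauss-Newton least-squares update without any constraint on alpha, and the simulated data and stopping rule are assumptions made for the example.

```r
set.seed(42)
n <- 300
X <- matrix(rnorm(2 * n), n, 2)
alpha_true <- c(0.8, 0.6)
y <- 4 * plogis(3 * drop(X %*% alpha_true)) + rnorm(n, sd = 0.3)

alpha <- c(1, 1) / sqrt(2)  # starting value with unit norm
rss <- numeric(0)

for (iter in 1:20) {
  z <- drop(X %*% alpha)

  # Step 1: estimate the ridge function given the current index
  fit <- smooth.spline(z, y)
  ghat <- predict(fit, z)$y
  dghat <- predict(fit, z, deriv = 1)$y
  r <- y - ghat
  rss <- c(rss, sum(r^2))

  # Step 2: update alpha from the linearisation
  # y - g(z) ~ g'(z) * X %*% (alpha_new - alpha)
  V <- X * dghat
  delta <- lm.fit(V, r)$coefficients
  alpha_new <- alpha + delta
  alpha_new <- alpha_new / sqrt(sum(alpha_new^2))  # renormalise to unit norm

  # Stop when alpha no longer changes (assumed stopping rule)
  converged <- sum((alpha_new - alpha)^2) < 1e-10
  alpha <- alpha_new
  if (converged) break
}

alpha  # estimated index coefficients
rss    # residual sum of squares at each smoothing step
```

In the constrained setting described here, the second step would instead have to respect the linear constraints on alpha, and the first step the shape constraints on the ridge function.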
Value
A cgaim object, i.e. a list with components:
alpha |
A named list of index coefficients. |
gfit |
A matrix containing the ridge and smooth functions evaluated at the observations. Note that column ordering puts indices first and covariates after. |
indexfit |
A matrix containing the indices evaluated at the observations. |
beta |
A vector containing the intercept and the scale coefficient of each ridge and smooth function. Includes the |
index |
A vector identifying to which index the columns of the element |
fitted |
A vector of fitted responses. |
residuals |
A vector of residuals. |
rss |
The residual sum of squares of the fit. |
flag |
A flag indicating how the algorithm stopped. 1 for proper convergence, 2 when the algorithm stopped for failing to decrease the RSS and 3 when the maximum number of iterations has been reached. |
niter |
Number of iterations performed. |
edf |
Effective degrees of freedom of the estimator. |
gcv |
Generalized cross validation score. |
dg |
A matrix containing derivatives of ridge and smooth functions. |
gse |
A matrix containing standard errors of ridge and smooth functions. |
active |
A logical vector indicating which constraints are active at convergence. |
Cmat |
The constraint matrix used to fit index coefficients alpha. Will include all constraints given through |
bvec |
The lower bound vector associated with |
x |
A matrix containing the variables entering the indices. The variables are mapped to each index through the element |
y |
The response vector. |
weights |
The weights used for estimation. |
sm_mod |
A list of model elements for the smoothing step of the estimation. Notably includes the matrix |
control |
The control list used to fit the cgaim. |
terms |
The model terms. |
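The stopping codes stored in flag can be turned into readable messages with a small helper. The sketch below is hypothetical: describe_flag is not part of the package, and the list fit is a mock object with made-up values standing in for a fitted model.

```r
# Hypothetical helper (not part of cgaim): describe the stopping flag
describe_flag <- function(flag) {
  switch(as.character(flag),
    "1" = "converged",
    "2" = "stopped: failed to decrease the RSS",
    "3" = "stopped: maximum number of iterations reached",
    "unknown flag")
}

# Mock object standing in for a fitted model; values are made up
fit <- list(flag = 3, niter = 50)
describe_flag(fit$flag)
```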
Note
A model without intercept can only be fitted when the smoothing step is performed with scam.
See Also
confint.cgaim for confidence intervals,
predict.cgaim to predict on new data,
plot.cgaim to plot ridge functions.
Examples
## Simulate some data
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
x4 <- rnorm(n)
mu <- 4 * exp(8 * x1) / (1 + exp(8 * x1)) + exp(x3)
y <- mu + rnorm(n)
df1 <- data.frame(y, x1, x2, x3, x4)
## Fit an unconstrained model
ans <- cgaim(y ~ g(x1, x2) + g(x3, x4), data = df1)
# Compute confidence intervals
# In practice, higher B values are warranted
cia <- confint(ans, B = 100)
cia$alpha
cia$beta
# Display ridge functions
plot(ans, ci = cia)
# Predict
newdf <- as.data.frame(matrix(rnorm(100), 25, 4))
names(newdf) <- sprintf("x%i", 1:4)
yhat <- predict(ans, newdf)
## Fit constrained model
ans2 <- cgaim(y ~ g(x1, x2, acons = list(monotone = -1)) +
  g(x3, x4, fcons = "cvx"), data = df1)
# Check results
ans2
plot(ans2)
# Same result
Cmat <- as.matrix(Matrix::bdiag(list(build_constraints(2, monotone = -1),
  build_constraints(2, first = 1))))
ans3 <- cgaim(y ~ g(x1, x2) + g(x3, x4, fcons = "cvx"), data = df1,
  Cmat = Cmat)
## A mis-specified model
ans4 <- cgaim(y ~ g(x1, x2, acons = list(monotone = 1)) +
  g(x3, x4, fcons = "dec"), data = df1)