| anlvm.fit {skedastic} | R Documentation |
Auxiliary Nonlinear Variance Model
Description
Fits an Auxiliary Nonlinear Variance Model (ANLVM) to estimate the error variances of a heteroskedastic linear regression model.
Usage
anlvm.fit(
mainlm,
g,
M = NULL,
cluster = FALSE,
varselect = c("none", "hettest", "cv.linear", "cv.cluster", "qgcv.linear",
"qgcv.cluster"),
nclust = c("elbow.swd", "elbow.mwd", "elbow.both"),
clustering = NULL,
param.init = function(q) stats::runif(n = q, min = -5, max = 5),
maxgridrows = 20L,
nconvstop = 3L,
zerosallowed = FALSE,
maxitql = 100L,
tolql = 1e-08,
nestedql = FALSE,
reduce2homosked = TRUE,
cvoption = c("testsetols", "partitionres"),
nfolds = 5L,
...
)
Arguments
mainlm |
Either an object of |
g |
A numeric-valued function of one variable, or a character denoting
the name of such a function. |
M |
An |
cluster |
A logical; should the design matrix X be replaced with an
|
varselect |
Either a character indicating how variable selection should
be conducted, or an integer vector giving indices of columns of the
predictor matrix (
|
nclust |
A character indicating which elbow method to use to select
the number of clusters (ignored if |
clustering |
A list object of class |
param.init |
Specifies the initial values of the parameter vector to
use in the Gauss-Newton fitting algorithm. This can either be a function
for generating the initial values from a probability distribution, a
list containing named objects corresponding to the arguments of
|
maxgridrows |
An integer indicating the maximum number of initial
values of the parameter vector to try, in case of |
nconvstop |
An integer indicating how many times the quasi-likelihood
estimation algorithm should converge before the grid search across
different initial parameter values is truncated. Defaults to |
zerosallowed |
A logical indicating whether 0 values are acceptable
in the initial values of the parameter vector. Defaults to |
maxitql |
An integer specifying the maximum number of iterations to
run in the Gauss-Newton algorithm for quasi-likelihood estimation.
Defaults to |
tolql |
A double specifying the convergence criterion for the
Gauss-Newton algorithm; defaults to |
nestedql |
A logical indicating whether to use the nested updating step
suggested in Seber and Wild (2003). Defaults to
|
reduce2homosked |
A logical indicating whether the homoskedastic
error variance estimator |
cvoption |
A character, either |
nfolds |
An integer specifying the number of folds |
... |
Other arguments that can be passed to (non-exported) helper functions, namely:
|
Details
The ANLVM model equation is
e_i^2=\displaystyle\sum_{k=1}^{n} g(X_{k\cdot}'\gamma) m_{ik}^2+u_i,
where e_i is the ith Ordinary Least Squares residual,
X_{k\cdot} is a vector corresponding to the kth row of the
n\times p design matrix X, m_{ik}^2 is the square of the
(i,k)th element of the annihilator matrix M=I-X(X'X)^{-1}X',
u_i is a random error term, \gamma is a p-vector of
unknown parameters, and g(\cdot) is a continuous, differentiable
function that need not be linear in \gamma, but must be expressible
as a function of the linear predictor X_{k\cdot}'\gamma.
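The quantities in the model equation can be built directly in base R. The following is a minimal sketch, not the package's internal code: gamma_demo is an arbitrary, hypothetical parameter vector chosen only for illustration, and the object names are assumptions.

```r
# Base R sketch of the pieces of the ANLVM equation (illustrative only).
mainlm <- lm(mpg ~ wt + qsec + am, data = mtcars)

X <- model.matrix(mainlm)                 # n x p design matrix, intercept included
y <- model.response(model.frame(mainlm))  # response vector
n <- nrow(X)
e2 <- unname(residuals(mainlm))^2         # squared OLS residuals e_i^2

# Annihilator matrix M = I - X (X'X)^{-1} X' and its elementwise square
M <- diag(n) - X %*% solve(crossprod(X), t(X))
Msq <- M^2

g <- function(x) x^2                      # heteroskedastic function
gamma_demo <- rep(0.1, ncol(X))           # hypothetical value, not an estimate

# Error variances implied by gamma_demo, and the implied mean of e_i^2:
# sum over k of g(X_k' gamma) * m_ik^2
omega_demo <- g(drop(X %*% gamma_demo))
e2_mean <- drop(Msq %*% omega_demo)

# Sanity check: M applied to the response reproduces the OLS residuals
resid_check <- all.equal(unname(drop(M %*% y)), unname(residuals(mainlm)))
```

Because every entry of Msq is non-negative, the implied mean of each squared residual is non-negative whenever g is.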
This method has been developed as part of the author's doctoral research
project.
The parameter vector \gamma is estimated using the maximum
quasi-likelihood method as described in section 2.3 of
Seber and Wild (2003). The optimisation problem is
solved numerically using a Gauss-Newton algorithm.
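To give a feel for the Gauss-Newton iteration, here is a simplified base R sketch. It is an assumption-laden illustration rather than the package's routine: it minimises an ordinary least squares objective instead of the quasi-likelihood criterion, adds step halving for stability, and uses a hypothetical starting value and stopping rule.

```r
# Damped Gauss-Newton on a least squares version of the ANLVM (illustrative only;
# the package itself uses maximum quasi-likelihood estimation).
mainlm <- lm(mpg ~ wt + qsec + am, data = mtcars)
X <- model.matrix(mainlm)
n <- nrow(X)
e2 <- unname(residuals(mainlm))^2
M <- diag(n) - X %*% solve(crossprod(X), t(X))
Msq <- unname(M^2)

g <- function(x) x^2
g_prime <- function(x) 2 * x              # derivative of g

# Objective: S(gamma) = sum_i (e_i^2 - sum_k g(X_k' gamma) m_ik^2)^2
S <- function(gamma) sum((e2 - drop(Msq %*% g(drop(X %*% gamma))))^2)

gamma <- rep(0.5, ncol(X))                # hypothetical starting value
S_start <- S(gamma)

for (iter in seq_len(100L)) {
  eta <- drop(X %*% gamma)
  r <- e2 - drop(Msq %*% g(eta))          # current residuals of the variance model
  J <- Msq %*% (g_prime(eta) * X)         # Jacobian of the model mean w.r.t. gamma
  delta <- lm.fit(J, r)$coefficients      # Gauss-Newton direction (least squares)
  delta[is.na(delta)] <- 0                # guard against aliased columns

  # Step halving: shrink the step until the objective does not increase
  step <- 1
  S_old <- S(gamma)
  while (S(gamma + step * delta) > S_old && step > 1e-8) step <- step / 2
  if (step <= 1e-8) break                 # no acceptable step found

  gamma_new <- gamma + step * delta
  small_change <- abs(S_old - S(gamma_new)) <= 1e-8 * (S_old + 1e-8)
  gamma <- gamma_new
  if (small_change) break
}
S_end <- S(gamma)
```

Since a step is accepted only when it does not increase the objective, S_end can never exceed S_start. The result may still be a local minimum, which is one reason to try several initial values.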
For further discussion of feature selection and the methods for choosing the
number of clusters to use with the clustering version of the model, see
alvm.fit.
Value
An object of class "anlvm.fit", containing the following:
- coef.est, a vector of parameter estimates, \hat{\gamma}
- var.est, a vector of estimates \hat{\omega} of the error variances for all observations
- method, either "cluster" or "functionalform", depending on whether cluster was set to TRUE
- ols, the lm object corresponding to the original linear regression model
- fitinfo, a list containing four named objects: g (the heteroskedastic function), Msq (the elementwise square of the annihilator matrix M), Z (the design matrix used in the ANLVM, after feature selection if applicable), and clustering (a list object with results of the clustering procedure, if applicable)
- selectinfo, a list containing two named objects: varselect (the value of the eponymous argument) and selectedcols (a numeric vector with column indices of X that were selected, with 1 denoting the intercept column)
- qlinfo, a list containing nine named objects: converged (a logical, indicating whether the Gauss-Newton algorithm converged for at least one initial value of the parameter vector), iterations (the number of Gauss-Newton iterations used to obtain the parameter estimates returned), Smin (the minimum achieved value of the objective function used in the Gauss-Newton routine), and six arguments passed to the function (nested, param.init, maxgridrows, nconvstop, maxitql, and tolql)
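The components listed above can be inspected with the usual list accessors. The sketch below assumes the skedastic package is installed and reuses the model from the Examples section. The default initial values are drawn at random, so the printed numbers will differ between runs.

```r
# Inspecting the components of a fitted "anlvm.fit" object (illustrative only).
if (requireNamespace("skedastic", quietly = TRUE)) {
  mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
  myanlvm <- skedastic::anlvm.fit(mtcars_lm, g = function(x) x ^ 2,
                                  varselect = "qgcv.linear")

  print(myanlvm$coef.est)                 # parameter estimates
  print(head(myanlvm$var.est))            # first few error variance estimates
  print(myanlvm$method)                   # "cluster" or "functionalform"
  print(myanlvm$selectinfo$selectedcols)  # selected columns (1 = intercept)
  print(myanlvm$qlinfo$converged)         # did Gauss-Newton converge?
  print(myanlvm$qlinfo$iterations)        # iterations used for the returned fit
}
```

Checking qlinfo$converged before using var.est is a sensible habit, because the grid search over initial values may finish without any run converging.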
References
Seber GAF, Wild CJ (2003). Nonlinear Regression. Wiley, Hoboken, NJ.
See Also
Examples
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
myanlvm <- anlvm.fit(mtcars_lm, g = function(x) x ^ 2,
varselect = "qgcv.linear")