anlvm.fit {skedastic} | R Documentation |
Auxiliary Nonlinear Variance Model
Description
Fits an Auxiliary Nonlinear Variance Model (ANLVM) to estimate the error variances of a heteroskedastic linear regression model.
Usage
anlvm.fit(
mainlm,
g,
M = NULL,
cluster = FALSE,
varselect = c("none", "hettest", "cv.linear", "cv.cluster", "qgcv.linear",
"qgcv.cluster"),
nclust = c("elbow.swd", "elbow.mwd", "elbow.both"),
clustering = NULL,
param.init = function(q) stats::runif(n = q, min = -5, max = 5),
maxgridrows = 20L,
nconvstop = 3L,
zerosallowed = FALSE,
maxitql = 100L,
tolql = 1e-08,
nestedql = FALSE,
reduce2homosked = TRUE,
cvoption = c("testsetols", "partitionres"),
nfolds = 5L,
...
)
Arguments
mainlm |
Either an object of |
g |
A numeric-valued function of one variable, or a character denoting
the name of such a function. |
M |
An |
cluster |
A logical; should the design matrix X be replaced with an
|
varselect |
Either a character indicating how variable selection should
be conducted, or an integer vector giving indices of columns of the
predictor matrix (
|
nclust |
A character indicating which elbow method to use to select
the number of clusters (ignored if |
clustering |
A list object of class |
param.init |
Specifies the initial values of the parameter vector to
use in the Gauss-Newton fitting algorithm. This can either be a function
for generating the initial values from a probability distribution, a
list containing named objects corresponding to the arguments of
|
maxgridrows |
An integer indicating the maximum number of initial
values of the parameter vector to try, in case of |
nconvstop |
An integer indicating how many times the quasi-likelihood
estimation algorithm should converge before the grid search across
different initial parameter values is truncated. Defaults to |
zerosallowed |
A logical indicating whether 0 values are acceptable
in the initial values of the parameter vector. Defaults to |
maxitql |
An integer specifying the maximum number of iterations to
run in the Gauss-Newton algorithm for quasi-likelihood estimation.
Defaults to |
tolql |
A double specifying the convergence criterion for the
Gauss-Newton algorithm; defaults to |
nestedql |
A logical indicating whether to use the nested updating step
suggested in Seber and Wild (2003). Defaults to
|
reduce2homosked |
A logical indicating whether the homoskedastic
error variance estimator |
cvoption |
A character, either |
nfolds |
An integer specifying the number of folds |
... |
Other arguments that can be passed to (non-exported) helper functions, namely:
|
Details
The ANLVM model equation is

e_i^2 = \displaystyle\sum_{k=1}^{n} g(X_{k\cdot}'\gamma) m_{ik}^2 + u_i,

where e_i is the ith Ordinary Least Squares residual, X_{k\cdot} is a vector corresponding to the kth row of the n\times p design matrix X, m_{ik}^2 is the square of the (i,k)th element of the annihilator matrix M=I-X(X'X)^{-1}X', u_i is a random error term, \gamma is a p-vector of unknown parameters, and g(\cdot) is a continuous, differentiable function that need not be linear in \gamma, but must be expressible as a function of the linear predictor X_{k\cdot}'\gamma.
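The terms of the model equation can be computed directly in base R. The following is a minimal sketch, not the package's internal code: the parameter vector `gamma` is a hypothetical value chosen only for illustration, and the variance function `g` is the one used in the Examples section.

```r
# Fit the main linear model by OLS (same model as in the Examples section)
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)

X <- model.matrix(mtcars_lm)        # n x p design matrix, intercept in column 1
n <- nrow(X)
e_sq <- residuals(mtcars_lm)^2      # squared OLS residuals, e_i^2

# Annihilator matrix M = I - X (X'X)^{-1} X' and its elementwise square
M <- diag(n) - X %*% solve(crossprod(X), t(X))
Msq <- M^2

# Hypothetical parameter vector and variance function (illustration only)
gamma <- c(1, 0.5, 0.1, -0.5)
g <- function(x) x^2

omega <- g(drop(X %*% gamma))       # g(X_k.' gamma), one value per observation
e_sq_fitted <- drop(Msq %*% omega)  # sum over k of g(X_k.' gamma) * m_ik^2
u <- e_sq - e_sq_fitted             # implied error terms u_i
```

Because M annihilates the column space of X, `M %*% X` is numerically zero, which gives a quick check that M has been formed correctly.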
This method has been developed as part of the author's doctoral research
project.
The parameter vector \gamma is estimated using the maximum quasi-likelihood method as described in section 2.3 of Seber and Wild (2003). The optimisation problem is solved numerically using a Gauss-Newton algorithm.
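To illustrate the general shape of a Gauss-Newton iteration for this model, the sketch below minimises the plain (unweighted) sum of squares of e_i^2 minus its model mean. This is an assumption made for simplicity: it is ordinary nonlinear least squares with step halving, not the quasi-likelihood routine implemented in the package, and the function name `gauss_newton_sketch`, the derivative `gprime`, and the starting value `gamma0` are all hypothetical.

```r
gauss_newton_sketch <- function(e_sq, Msq, X, g, gprime, gamma0,
                                maxit = 100L, tol = 1e-8) {
  # Objective: sum of squared differences between e_i^2 and the model mean
  S <- function(gam) sum((e_sq - drop(Msq %*% g(drop(X %*% gam))))^2)
  gam <- gamma0
  S_cur <- S(gam)
  converged <- FALSE
  iter <- 0L
  while (iter < maxit) {
    iter <- iter + 1L
    eta <- drop(X %*% gam)              # linear predictor
    r <- e_sq - drop(Msq %*% g(eta))    # current residuals
    J <- Msq %*% (gprime(eta) * X)      # Jacobian of the mean w.r.t. gamma
    delta <- qr.coef(qr(J), r)          # Gauss-Newton direction (least squares)
    delta[is.na(delta)] <- 0            # guard against a rank-deficient J
    # Step halving: shrink the step until the objective does not increase
    step <- 1
    S_new <- S(gam + step * delta)
    while ((!is.finite(S_new) || S_new > S_cur) && step > 1e-10) {
      step <- step / 2
      S_new <- S(gam + step * delta)
    }
    if (!is.finite(S_new) || S_new > S_cur) break  # no improving step found
    improvement <- S_cur - S_new
    gam <- gam + step * delta
    S_cur <- S_new
    if (improvement <= tol * (S_cur + tol)) {
      converged <- TRUE
      break
    }
  }
  list(coef = gam, S = S_cur, iterations = iter, converged = converged)
}

# Usage with the model from the Examples section
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
X <- model.matrix(mtcars_lm)
M <- diag(nrow(X)) - X %*% solve(crossprod(X), t(X))
Msq <- M^2
e_sq <- residuals(mtcars_lm)^2
g <- function(x) x^2
gprime <- function(x) 2 * x             # derivative of g
gamma0 <- c(1, 0, 0, 0)                 # hypothetical starting value
S0 <- sum((e_sq - drop(Msq %*% g(drop(X %*% gamma0))))^2)
fit <- gauss_newton_sketch(e_sq, Msq, X, g, gprime, gamma0)
```

The result depends on the starting value, which is one reason a search over several initial parameter vectors (as controlled by `param.init`, `maxgridrows`, and `nconvstop`) can be useful.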
For further discussion of feature selection and the methods for choosing the number of clusters to use with the clustering version of the model, see alvm.fit.
Value
An object of class "anlvm.fit", containing the following:
- coef.est, a vector of parameter estimates, \hat{\gamma}
- var.est, a vector of estimates \hat{\omega} of the error variances for all observations
- method, either "cluster" or "functionalform", depending on whether cluster was set to TRUE
- ols, the lm object corresponding to the original linear regression model
- fitinfo, a list containing four named objects: g (the heteroskedastic function), Msq (the elementwise square of the annihilator matrix M), Z (the design matrix used in the ANLVM, after feature selection if applicable), and clustering (a list object with results of the clustering procedure, if applicable)
- selectinfo, a list containing two named objects: varselect (the value of the eponymous argument) and selectedcols (a numeric vector with column indices of X that were selected, with 1 denoting the intercept column)
- qlinfo, a list containing nine named objects: converged (a logical, indicating whether the Gauss-Newton algorithm converged for at least one initial value of the parameter vector), iterations (the number of Gauss-Newton iterations used to obtain the parameter estimates returned), Smin (the minimum achieved value of the objective function used in the Gauss-Newton routine), and six arguments passed to the function (nested, param.init, maxgridrows, nconvstop, maxitql, and tolql)
References
Seber GAF, Wild CJ (2003). Nonlinear Regression. Wiley, Hoboken, NJ.
See Also
Examples
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
myanlvm <- anlvm.fit(mtcars_lm, g = function(x) x ^ 2,
varselect = "qgcv.linear")
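The components listed under Value can be read from the fitted object with `$`. The sketch below assumes the skedastic package is installed and that the fit succeeds; it is skipped otherwise. The seed is set only because the default `param.init` draws random starting values.

```r
if (requireNamespace("skedastic", quietly = TRUE)) {
  set.seed(1)  # default initial values are random
  mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
  myanlvm <- skedastic::anlvm.fit(mtcars_lm, g = function(x) x ^ 2,
                                  varselect = "qgcv.linear")
  print(myanlvm$coef.est)              # parameter estimates
  print(head(myanlvm$var.est))         # first few error variance estimates
  print(myanlvm$method)                # "cluster" or "functionalform"
  print(myanlvm$selectinfo$selectedcols)  # columns kept by feature selection
  print(myanlvm$qlinfo$converged)      # did Gauss-Newton converge at least once?
}
```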