var.jack {pls} | R Documentation |
Jackknife Variance Estimates of Regression Coefficients
Description
Calculates jackknife variance or covariance estimates of regression coefficients.
The original (Tukey) jackknife variance estimator is defined as
(g-1)/g \sum_{i=1}^g (\tilde\beta_{-i} - \bar\beta)^2, where g is the
number of segments, \tilde\beta_{-i} is the estimated coefficient when
segment i is left out (called the jackknife replicates), and \bar\beta is
the mean of the \tilde\beta_{-i}. The most common case is delete-one
jackknife, with g = n, the number of observations. This is the definition
var.jack uses by default.
However, Martens and Martens (2000) defined the estimator as
(g-1)/g \sum_{i=1}^g (\tilde\beta_{-i} - \hat\beta)^2, where \hat\beta is
the coefficient estimate using the entire data set. I.e., they use the
original fitted coefficients instead of the mean of the jackknife
replicates. Most (all?) other jackknife implementations for PLSR use this
estimator. var.jack can be made to use this definition with
use.mean = FALSE. In practice, the difference should be small if the number
of observations is sufficiently large. Note, however, that all theoretical
results about the jackknife refer to the ‘proper’ definition. (Also note
that this option might disappear in a future version.)
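The two definitions can be compared by hand on a small problem. The following is a minimal sketch, not the code var.jack runs: it assumes an ordinary least squares fit with lm() on the built-in cars data instead of a PLSR or PCR model, and the names fit_coef, reps, var_tukey and var_martens are made up for the illustration.

```r
# Delete-one jackknife (g = n) for OLS coefficients on the built-in 'cars' data.
# Illustration only: var.jack works on cross-validated mvr models, not on lm fits.
fit_coef <- function(d) coef(lm(dist ~ speed, data = d))

g <- nrow(cars)                       # number of segments (one per observation)
beta_hat <- fit_coef(cars)            # coefficients from the entire data set
reps <- sapply(seq_len(g), function(i) fit_coef(cars[-i, ]))  # 2 x g replicates
beta_bar <- rowMeans(reps)            # mean of the jackknife replicates

# Tukey definition: centre the replicates at their mean
var_tukey <- (g - 1) / g * rowSums((reps - beta_bar)^2)
# Martens and Martens definition: centre at the full-data estimate
var_martens <- (g - 1) / g * rowSums((reps - beta_hat)^2)

print(rbind(tukey = var_tukey, martens = var_martens))
```

Expanding the squares shows that the second estimate equals the first plus (g-1)(\bar\beta - \hat\beta)^2, so it is never smaller, and the two agree exactly when the mean of the replicates coincides with the full-data estimate.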
Usage
var.jack(object, ncomp = object$ncomp, covariance = FALSE, use.mean = TRUE)
Arguments
object |
an |
ncomp |
the number of components to use for estimating the (co)variances |
covariance |
logical. If |
use.mean |
logical. If |
Value
If covariance is FALSE, a p \times q \times c array of variance estimates,
where p is the number of predictors, q is the number of responses, and c
is the number of components.
If covariance is TRUE, a pq \times pq \times c array of
variance-covariance estimates.
Warning
Note that the Tukey jackknife variance estimator is not unbiased for the
variance of regression coefficients (Hinkley 1977). The bias depends on the
X matrix. For ordinary least squares regression (OLSR), the bias can be
calculated, and depends on the number of observations n and the number of
parameters k in the model. For the common case of an orthogonal design
matrix with \pm 1 levels, the delete-one jackknife estimate equals
(n-1)/(n-k) times the classical variance estimate for the regression
coefficients in OLSR. Similar expressions hold for delete-d estimates.
Modifications have been proposed to reduce or eliminate the bias for the
OLSR case; however, they depend on the number of parameters used in the
model. See e.g. Hinkley (1977) or Wu (1986).
Thus, the results of var.jack should be used with caution.
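The (n-1)/(n-k) factor for the orthogonal \pm 1 case can be checked numerically. The sketch below is an illustration under stated assumptions: it uses plain OLS on simulated data from two replicates of a 2^3 factorial design, and the helper ols and all variable names are hypothetical rather than part of pls.

```r
set.seed(1)
# Two replicates of a 2^3 full factorial: orthogonal columns with +/-1 levels
design <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1))
X <- cbind(1, as.matrix(rbind(design, design)))   # intercept + 3 factors
n <- nrow(X)                                      # 16 observations
k <- ncol(X)                                      # 4 parameters
y <- drop(X %*% c(1, 0.5, -0.3, 0.2)) + rnorm(n)  # simulated response

# Hypothetical helper: OLS coefficients via the normal equations
ols <- function(X, y) drop(solve(crossprod(X), crossprod(X, y)))

# Classical OLS variance estimate: s^2 * diag((X'X)^{-1})
beta_hat <- ols(X, y)
res <- y - drop(X %*% beta_hat)
classical <- sum(res^2) / (n - k) * diag(solve(crossprod(X)))

# Delete-one jackknife estimate (Tukey definition)
reps <- sapply(seq_len(n), function(i) ols(X[-i, , drop = FALSE], y[-i]))
jack <- (n - 1) / n * rowSums((reps - rowMeans(reps))^2)

ratio <- jack / classical
print(ratio)               # one ratio per coefficient
print((n - 1) / (n - k))   # the factor stated above
```

In this design every leverage equals k/n, which is why the ratio is the same for all coefficients; for a non-orthogonal X the ratio generally differs between coefficients.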
Author(s)
Bjørn-Helge Mevik
References
Tukey J.W. (1958) Bias and Confidence in Not-quite Large Samples. (Abstract of Preliminary Report). Annals of Mathematical Statistics, 29(2), 614.
Martens H. and Martens M. (2000) Modified Jack-knife Estimation of Parameter Uncertainty in Bilinear Modelling by Partial Least Squares Regression (PLSR). Food Quality and Preference, 11, 5–16.
Hinkley D.V. (1977) Jackknifing in Unbalanced Situations. Technometrics, 19(3), 285–292.
Wu C.F.J. (1986) Jackknife, Bootstrap and Other Resampling Methods in Regression Analysis. The Annals of Statistics, 14(4), 1261–1295.
See Also
Examples
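library(pls)
data(oliveoil)
## Fit a PCR model with leave-one-out cross-validation and jackknifing enabled
mod <- pcr(sensory ~ chemical, data = oliveoil, validation = "LOO",
           jackknife = TRUE)
## Jackknife variance estimates of the coefficients for two components
var.jack(mod, ncomp = 2)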