grcv {dglars} | R Documentation |
General Refitted Cross-Validation Estimator
Description
grcv computes the estimate of the dispersion parameter using the general refitted cross-validation method.
Usage
grcv(object, type = c("BIC", "AIC"), nit = 10L, trace = FALSE,
control = list(), ...)
Arguments
object |
fitted dglars object. |
type |
the measure of goodness-of-fit used in Step 2 to select the two sets of variables (see section Details for more details). Default is "BIC". |
control |
a list of control parameters passed to the function dglars. |
nit |
integer specifying the number of times that the general refitted cross-validation method is repeated (see section Details for more details). Default is 10L. |
trace |
flag used to print out information about the algorithm. Default is FALSE. |
... |
further arguments passed to the functions AIC.dglars or BIC.dglars. |
Details
The general refitted cross-validation (grcv) estimator (Pazira et al., 2018) is an estimator of the dispersion parameter of the exponential family based on the following four-step procedure:
Step | Description |
1. | randomly split the data set D = (y, X) into two halves of equal size, denoted by D_1 and D_2. |
2. | fit the dglars model to the dataset D_1 to select a set of variables A_1; |
   | fit the dglars model to the dataset D_2 to select a set of variables A_2. |
3. | fit the glm model to the dataset D_1 using the variables that are in A_2; then estimate the dispersion parameter using the Pearson method. Denote by \hat{\phi}_1(A_2) the resulting estimate; |
   | fit the glm model to the dataset D_2 using the variables that are in A_1; then estimate the dispersion parameter using the Pearson method. Denote by \hat{\phi}_2(A_1) the resulting estimate. |
4. | estimate \phi using the following estimator: \hat{\phi}_{grcv} = (\hat{\phi}_1(A_2) + \hat{\phi}_2(A_1)) / 2. |
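For reference, the Pearson estimates used in Step 3 can be written out as below. This is the usual textbook form for a GLM with variance function V, not a formula taken from this page: the symbols n_1, n_2 (sizes of D_1 and D_2), \hat{\mu}_i (fitted means from the refitted glm) and V are introduced here for illustration, and the denominator assumes that the refitted model contains an intercept in addition to the selected variables.

```latex
\hat{\phi}_1(A_2) = \frac{1}{n_1 - |A_2| - 1}
  \sum_{i \in D_1} \frac{(y_i - \hat{\mu}_i)^2}{V(\hat{\mu}_i)},
\qquad
\hat{\phi}_2(A_1) = \frac{1}{n_2 - |A_1| - 1}
  \sum_{i \in D_2} \frac{(y_i - \hat{\mu}_i)^2}{V(\hat{\mu}_i)}
```

Because each estimate is computed on the half of the data that was not used to select its variables, the selection step and the residuals used for estimation come from different observations.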
To reduce the random variability due to the splitting of the dataset (Step 1), the previous procedure is repeated nit times; the median of the resulting estimates is used as the final estimate of the dispersion parameter. In Step 2, the two sets of variables are selected using AIC.dglars or BIC.dglars, according to the argument type; in this step, the Pearson method is used to obtain a first estimate of the dispersion parameter. Furthermore, if the function glm does not converge, the function dglars is used to compute the maximum likelihood estimates.
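The four steps and the repetition over nit random splits can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code used by grcv: the helper names select_vars, pearson_phi, grcv_once and grcv_sketch are hypothetical, and the variable selection of Step 2 is replaced by a simple correlation screening (keep the k predictors most correlated with log(y)) instead of a dglars fit with AIC or BIC. The fallback to dglars when glm does not converge is not reproduced.

```r
# Step 2 stand-in (hypothetical): NOT dglars; keeps the k predictors whose
# absolute correlation with log(y) is largest.
select_vars <- function(y, X, k = 3L) {
  score <- abs(drop(cor(X, log(y))))
  sort(order(score, decreasing = TRUE)[seq_len(k)])
}

# Step 3: refit a glm on the given variables and return the Pearson
# estimate of the dispersion (Pearson statistic / residual df).
pearson_phi <- function(y, X, vars, family) {
  Z <- X[, vars, drop = FALSE]
  fit <- glm(y ~ Z, family = family)
  sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
}

# One pass through Steps 1-4.
grcv_once <- function(y, X, family, k = 3L) {
  n <- length(y)
  # Step 1: random split into two halves
  idx1 <- sample.int(n, size = floor(n / 2))
  idx2 <- setdiff(seq_len(n), idx1)
  # Step 2: select variables separately on each half
  A1 <- select_vars(y[idx1], X[idx1, , drop = FALSE], k)
  A2 <- select_vars(y[idx2], X[idx2, , drop = FALSE], k)
  # Step 3: estimate on each half using the variables chosen on the other
  phi1 <- pearson_phi(y[idx1], X[idx1, , drop = FALSE], A2, family)
  phi2 <- pearson_phi(y[idx2], X[idx2, , drop = FALSE], A1, family)
  # Step 4: average the two estimates
  (phi1 + phi2) / 2
}

# Repeat the procedure nit times and take the median of the estimates.
grcv_sketch <- function(y, X, family, nit = 10L, k = 3L) {
  median(replicate(nit, grcv_once(y, X, family, k)))
}

# Simulated Gamma data with log link; the true dispersion is 1 / shape = 0.5.
set.seed(1)
n <- 200
p <- 20
X <- matrix(rnorm(n * p), n, p)
mu <- exp(0.5 + X[, 1])
shape <- 2
y <- rgamma(n, shape = shape, scale = mu / shape)

phi_hat <- grcv_sketch(y, X, Gamma("log"), nit = 10L)
phi_hat
```

Taking the median rather than the mean over the repetitions makes the final value less sensitive to an occasional unlucky split.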
Value
grcv returns the estimate of the dispersion parameter.
Author(s)
Luigi Augugliaro and Hassan Pazira
Maintainer: Luigi Augugliaro luigi.augugliaro@unipa.it
References
Pazira H., Augugliaro L. and Wit E.C. (2018) <doi:10.1007/s11222-017-9761-7> Extended differential-geometric LARS for high-dimensional GLMs with general dispersion parameter, Statistics and Computing, Vol 28(4), 753-774.
See Also
phihat, AIC.dglars and BIC.dglars.
Examples
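############################
# y ~ Gamma
library(dglars)
set.seed(321)
n <- 100
p <- 50
# design matrix with non-negative entries, so that eta > 0
X <- matrix(abs(rnorm(n * p)), n, p)
eta <- 1 + 2 * X[, 1]
# Gamma() uses its default (inverse) link here
mu <- drop(Gamma()$linkinv(eta))
shape <- 0.5
# true dispersion parameter
phi <- 1 / shape
y <- rgamma(n, scale = mu / shape, shape = shape)
# dglars model fitted with the log link
fit <- dglars(y ~ X, Gamma("log"))
# true value, followed by the grcv estimates based on AIC and BIC
phi
grcv(fit, type = "AIC")
grcv(fit, type = "BIC")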