ib {ib}    R Documentation
Bias correction via iterative bootstrap
Description
ib is used to correct the bias of a fitted model object with the iterative bootstrap procedure.
Usage
ib(object, thetastart = NULL, control = list(...), extra_param = FALSE, ...)
## S4 method for signature 'betareg'
ib(object, thetastart = NULL, control = list(...), extra_param = FALSE, ...)
## S4 method for signature 'glm'
ib(object, thetastart = NULL, control = list(...), extra_param = FALSE, ...)
## S4 method for signature 'lm'
ib(object, thetastart = NULL, control = list(...), extra_param = FALSE, ...)
## S4 method for signature 'lmerMod'
ib(object, thetastart = NULL, control = list(...), extra_param = FALSE, ...)
## S4 method for signature 'nls'
ib(object, thetastart = NULL, control = list(...), extra_param = FALSE, ...)
## S4 method for signature 'vglm'
ib(object, thetastart = NULL, control = list(...), extra_param = FALSE, ...)
Arguments
object
an object representing a fitted model (see 'Details' for the supported classes).

thetastart
an optional starting value for the iterative procedure. If NULL (the default), the procedure starts at the estimates provided by the fitted object.

control
a list of parameters for controlling the iterative procedure (see ibControl).

extra_param
if TRUE, extra parameters of the model are also corrected (see 'Details').

...
additional optional arguments (see 'Details').
Details
The iterative bootstrap procedure is described in Kuk (1995) and further studied by Guerrier et al. (2019) and Guerrier et al. (2020). The kth iteration of this algorithm is

\hat{\theta}^{k} = \hat{\theta}^{k-1} + \hat{\pi} - \frac{1}{H}\sum_{h=1}^{H}\hat{\pi}_h(\hat{\theta}^{k-1}), \quad k = 1, 2, \ldots
The estimate \hat{\pi} is provided by the fitted object. The value \hat{\pi}_h(\hat{\theta}) is a parametric bootstrap estimate, where the bootstrap sample is generated from \hat{\theta} and a fixed seed (see ibControl).
The larger the value of H, the better the bias correction in general, but the higher the computational cost (see ibControl). If thetastart=NULL, the initial value of the procedure is \hat{\theta}^{0}=\hat{\pi}. The number of iterations is controlled by the maxit parameter of ibControl.
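As a purely illustrative sketch (not the package's internal implementation), one such update can be written by hand for a Poisson regression; the names pi_hat, theta and boot_est below are made up for the example, and the fixed-seed handling of ibControl is omitted:

## illustrative sketch only: one IB update for a Poisson glm
set.seed(1)
x <- rnorm(50)
y <- rpois(50, exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = poisson())

pi_hat <- coef(fit)    # initial estimate, provided by the fitted object
theta  <- pi_hat       # theta^0 = pi_hat when thetastart = NULL
H <- 20                # number of bootstrap samples per iteration

## theta^k = theta^(k-1) + pi_hat - (1/H) sum_h pi_hat_h(theta^(k-1))
boot_est <- replicate(H, {
  y_star <- rpois(length(x), exp(cbind(1, x) %*% theta))  # simulate from theta^(k-1)
  coef(glm(y_star ~ x, family = poisson()))                # re-estimate on the bootstrap sample
})
theta <- theta + pi_hat - rowMeans(boot_est)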
By default, the method corrects the coefficients only. Which extra parameters can be corrected depends on the model. These extra parameters may be constrained (e.g. to be positive). If constraint=TRUE (see ibControl), a transformation from the constrained space to the real line is used for the update.
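As a minimal sketch of this idea, assuming a log transformation for a positivity-constrained parameter (the exact transformation used by the package may differ), the update is performed on the real line and mapped back; the numbers are made up for illustration:

## illustrative sketch: IB update of a positive extra parameter on the log scale
sigma2_hat <- 2.5                     # initial (positive) estimate
boot_mean  <- 2.8                     # average of the bootstrap estimates
sigma2_new <- exp(log(sigma2_hat) + log(sigma2_hat) - log(boot_mean))
sigma2_new                            # stays positive after the update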
For betareg, extra_param is not available, as both the mean and precision parameters are corrected by default. Currently, the 'identity' link function is not supported for the precision parameters.
For glm, if extra_param=TRUE, the shape parameter of the Gamma family, the variance of the residuals in lm, or the overdispersion parameter of the negative binomial regression in glm.nb is also corrected. Note that the quasi families are not supported for the moment, as they have no simulation method (see simulate). Bias correction for the extra parameter of the inverse.gaussian family is not yet implemented.
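The absence of a simulation method can be checked directly on the family objects; in recent versions of R the standard families carry a simulate component, while the quasi families do not:

## quasi families carry no simulate() component, so no parametric
## bootstrap sample can be generated for them
is.null(poisson()$simulate)       # FALSE
is.null(quasipoisson()$simulate)  # TRUE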
For lm, if extra_param=TRUE, the variance of the residuals is also corrected. Note that using ib is not useful as the coefficients are already unbiased, unless one considers a different data generating mechanism such as censoring, missing values or outliers (see ibControl).
For lmer, by default, only the fixed effects are corrected. If extra_param=TRUE, all the random effects (variances and correlations) and the variance of the residuals are also corrected. Note that using ib is certainly not useful with the argument REML=TRUE in lmer, as the bias of the variance components is already addressed, unless one considers a different data generating mechanism such as censoring, missing values or outliers (see ibControl).
For nls, if extra_param=TRUE, the variance of the residuals is also corrected.
For vglm, extra_param is currently not used. Indeed, the philosophy of a vector generalized linear model is to potentially model all the parameters of a distribution with a linear predictor. Hence, what would be considered an extra parameter in glm, for instance, may already be captured by the default coefficients. However, correcting the bias of the coefficients does not imply that the bias of the distribution parameters is corrected (by Jensen's inequality), so this feature may be used in a future version of the package. Note that we currently only support distributions with a simslot (see simulate.vlm).
Value
A fitted model object of class Ib.
Author(s)
Samuel Orso
References
Guerrier S, Dupuis-Lozeron E, Ma Y, Victoria-Feser M (2019).
“Simulation-Based Bias Correction Methods for Complex Models.”
Journal of the American Statistical Association, 114(525), 146-157.
doi: 10.1080/01621459.2017.1380031.
Guerrier S, Karemera M, Orso S, Victoria-Feser M, Zhang Y (2020).
“A General Approach for Simulation-based Bias Correction in High Dimensional Settings.”
Version 2: 13 Nov 2020, arXiv: 2010.13687, https://arxiv.org/pdf/2010.13687.pdf.
Kuk AYC (1995).
“Asymptotically Unbiased Estimation in Generalized Linear Models with Random Effects.”
Journal of the Royal Statistical Society: Series B (Methodological), 57(2), 395-407.
doi: 10.1111/j.2517-6161.1995.tb02035.x.
See Also
ibControl for controlling the iterative procedure.
Examples
## beta regression
library(betareg)
data("GasolineYield", package = "betareg")
## currently link.phi = "identity" is not supported
## fit_beta <- betareg(yield ~ batch + temp, data = GasolineYield)
fit_beta <- betareg(yield ~ batch + temp, link.phi = "log", data = GasolineYield)
fit_ib <- ib(fit_beta)
## precision parameter can also depend on covariates
fit_beta <- betareg(yield ~ batch + temp | temp, data = GasolineYield)
fit_ib <- ib(fit_beta)
## poisson regression
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
pois_fit <- glm(counts ~ outcome + treatment, family = poisson())
fit_ib <- ib(pois_fit)
summary(fit_ib)
## Set H = 1000
## Not run:
fit_ib <- ib(pois_fit, control=list(H=1000))
summary(fit_ib)
## End(Not run)
## gamma regression
clotting <- data.frame(
u = c(5,10,15,20,30,40,60,80,100),
lot1 = c(118,58,42,35,27,25,21,19,18),
lot2 = c(69,35,26,21,18,16,13,12,12))
fit_gamma <- glm(lot2 ~ log(u), data = clotting, family = Gamma(link = "inverse"))
fit_ib <- ib(fit_gamma)
## summary(fit_ib)
## correct for shape parameter and show iterations
## Not run:
fit_ib <- ib(fit_gamma, control=list(verbose=TRUE), extra_param = TRUE)
summary(fit_ib)
## End(Not run)
## negative binomial regression
library(MASS)
fit_nb <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine)
fit_ib <- ib(fit_nb)
## summary(fit_ib)
## correct for overdispersion with H=100
## Not run:
fit_ib <- ib(fit_nb, control=list(H=100), extra_param = TRUE)
summary(fit_ib)
## End(Not run)
## linear regression
fit_lm <- lm(disp ~ cyl + hp + wt, data = mtcars)
fit_ib <- ib(fit_lm)
summary(fit_ib)
## correct for variance of residuals
fit_ib <- ib(fit_lm, extra_param = TRUE)
summary(fit_ib)
## linear mixed-effects regression
library(lme4)
fit_lmm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy, REML = FALSE)
fit_ib <- ib(fit_lmm)
summary(fit_ib)
## correct for variances and correlation
## Not run:
fit_ib <- ib(fit_lmm, extra_param = TRUE)
summary(fit_ib)
## End(Not run)
## nonlinear regression
DNase1 <- subset(DNase, Run == 1)
fit_nls <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), data = DNase1)
fit_ib <- ib(fit_nls)
summary(fit_ib)
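## correct also for the variance of the residuals (see 'Details')
fit_ib <- ib(fit_nls, extra_param = TRUE)
summary(fit_ib)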
## student regression
library(VGAM)
tdata <- data.frame(x = runif(nn <- 1000))
tdata <- transform(tdata,
y = rt(nn, df = exp(exp(0.5 - x))))
fit_vglm <- vglm(y ~ x, studentt3, data = tdata)
fit_ib <- ib(fit_vglm)
summary(fit_ib)