robustbetareg {robustbetareg}    R Documentation

Robust Beta Regression

Description

Fit robust beta regression models for rates and proportions via LSMLE, LMDPDE, SMLE and MDPDE. Both mean and precision of the response variable are modeled through parametric functions.

Usage

robustbetareg(
  formula,
  data,
  alpha,
  type = c("LSMLE", "LMDPDE", "SMLE", "MDPDE"),
  link = c("logit", "probit", "cloglog", "cauchit", "loglog"),
  link.phi = NULL,
  control = robustbetareg.control(...),
  model = TRUE,
  ...
)

LMDPDE.fit(y, x, z, alpha = NULL, link = "logit",
link.phi = "log", control = robustbetareg.control(...), ...)

LSMLE.fit(y, x, z, alpha = NULL, link = "logit",
link.phi = "log", control = robustbetareg.control(...), ...)

MDPDE.fit(y, x, z, alpha = NULL, link = "logit",
link.phi = "log", control = robustbetareg.control(...), ...)

SMLE.fit(y, x, z, alpha = NULL, link = "logit",
link.phi = "log", control = robustbetareg.control(...), ...)

Arguments

formula

symbolic description of the model. See Details for further information.

data

dataset to be used.

alpha

numeric value in [0, 1) giving the tuning constant alpha. Setting alpha = 0 yields the maximum likelihood estimator; robust estimation requires alpha greater than zero. If this argument is omitted, the tuning constant is selected automatically by the data-driven algorithm proposed by Ribeiro and Ferrari (2022).

type

character specifying the type of robust estimator to be used in the estimation process. Supported estimators are "LSMLE", "LMDPDE", "SMLE", and "MDPDE"; for details, see Maluf et al. (2022). The default is "LSMLE".

link

an optional character that specifies the link function of the mean submodel (mu). The "logit", "probit", "cloglog", "cauchit", "loglog" functions are supported. The logit function is the default.

link.phi

an optional character that specifies the link function of the precision submodel (phi). The "identity", "log", and "sqrt" functions are supported. The default is "log", unless formula is of the form y ~ x (constant precision), in which case the default is "identity".

control

a list of control arguments specified via robustbetareg.control.

model

logical. If TRUE the corresponding components of the fit (model frame, response, model matrix) are returned.

...

arguments to be passed to robustbetareg.control.

y, x, z

y must be a numeric response vector (with values in (0,1)), x must be a numeric regressor matrix for the mean submodel, and z must be a numeric regressor matrix for the precision submodel.
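
As an illustration of the inputs expected by the low-level fitting functions, the following sketch builds y, x and z with base R's model.matrix. The data frame d, its column names, the simulated coefficients and the value alpha = 0.05 are all hypothetical choices made for this example. The final call follows the Usage section and runs only if the package is installed.

```r
# Hypothetical toy data: a response in (0, 1) and two covariates.
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = runif(50))
mu <- plogis(0.3 * d$x1)                   # mean via the inverse logit
d$y <- rbeta(50, mu * 20, (1 - mu) * 20)   # constant precision of 20

# Design matrices: x for the mean submodel, z for the precision submodel.
# Both include an intercept column.
x <- model.matrix(~ x1, data = d)
z <- model.matrix(~ x2, data = d)
y <- d$y

# Guarded call: skipped when the package is not available.
if (requireNamespace("robustbetareg", quietly = TRUE)) {
  fit <- robustbetareg::LSMLE.fit(y, x, z, alpha = 0.05,
                                  link = "logit", link.phi = "log")
}
```

In most cases the formula interface of robustbetareg is more convenient; the matrix interface is mainly useful when the design matrices are constructed programmatically.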

Details

Beta regression models are used for continuous response variables in the unit interval, such as rates and proportions. Maximum likelihood inference for these models lacks robustness in the presence of outliers. Based on the density power divergence, Ghosh (2019) proposed the minimum density power divergence estimator (MDPDE). Ribeiro and Ferrari (2022) proposed an estimator based on the maximization of a reparameterized Lq-likelihood, called the SMLE. Both estimators require suitable restrictions on the parameter space. Maluf et al. (2022) proposed robust estimators based on the MDPDE and the SMLE that overcome this drawback; they are called the LMDPDE and the LSMLE. For details, see the cited works.

All four estimators are implemented in the robustbetareg function. They depend on a tuning constant, alpha. When alpha is fixed at 0, all of the estimators coincide with the maximum likelihood estimator. Ribeiro and Ferrari (2022) and Maluf et al. (2022) suggest using a data-driven algorithm to select the optimal value of alpha. robustbetareg applies this algorithm by default when the alpha argument is omitted.
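
To make the alpha = 0 baseline concrete, the following base R sketch simulates from a beta regression model and computes the plain maximum likelihood estimate with optim. It assumes the usual mean-precision parameterization of the beta density, with shape parameters mu * phi and (1 - mu) * phi, a logit link for the mean and constant precision. The sample size, coefficient values and the helper name negloglik are illustrative assumptions; the code is not taken from the package and does not implement any of the robust estimators.

```r
set.seed(123)
n <- 200
x <- rnorm(n)
beta_true <- c(-0.5, 1)   # illustrative mean coefficients
phi_true <- 30            # illustrative constant precision

# Mean through the inverse logit, response from the beta density.
mu <- plogis(beta_true[1] + beta_true[2] * x)
y <- rbeta(n, shape1 = mu * phi_true, shape2 = (1 - mu) * phi_true)

# Negative log-likelihood; theta = (beta0, beta1, log(phi)).
negloglik <- function(theta) {
  mu <- plogis(theta[1] + theta[2] * x)
  phi <- exp(theta[3])
  -sum(dbeta(y, mu * phi, (1 - mu) * phi, log = TRUE))
}

mle <- optim(c(0, 0, log(10)), negloglik, method = "BFGS")
est <- c(beta0 = mle$par[1], beta1 = mle$par[2], phi = exp(mle$par[3]))
est
```

Replacing a few responses with extreme values and refitting is a simple way to see the sensitivity of this estimator that motivates the robust alternatives.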

The model is specified in the same way as in the usual functions glm and betareg. The formula argument can comprise three parts, separated by the symbols "~" and "|": the observed response variable in the unit interval; the predictor of the mean submodel, with link function link; and the predictor of the precision submodel, with link function link.phi. If the model has constant precision, the third part may be omitted, and the link function for phi is then "identity" by default. The tuning constant alpha may be either fixed or chosen by the data-driven algorithm. If alpha is fixed, its value must be specified in the alpha argument.
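
The three-part structure described above can be inspected with base R alone. The sketch below splits a formula at the "|" operator; the variable names y, x1, x2 and z1 are placeholders, and this is only an illustration of the syntax, not the parsing used inside the package.

```r
# A three-part formula: response ~ mean predictors | precision predictors.
f <- y ~ x1 + x2 | z1

# "~" binds more loosely than "|", so the right-hand side is a call to `|`.
rhs <- f[[3]]
has_precision <- is.call(rhs) && identical(rhs[[1]], as.name("|"))

# With no "|" present, the precision submodel is intercept-only.
mean_part <- if (has_precision) rhs[[2]] else rhs
precision_part <- if (has_precision) rhs[[3]] else quote(1)

deparse(f[[2]])          # response
deparse(mean_part)       # mean submodel terms
deparse(precision_part)  # precision submodel terms
```

For a formula such as y ~ x1 + x2, has_precision is FALSE and the precision part falls back to an intercept only, which corresponds to the constant precision case.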

Several methods are available for objects of class "robustbetareg"; see plot.robustbetareg, summary.robustbetareg, coef.robustbetareg, and residuals.robustbetareg for details and other methods.

Value

robustbetareg returns an object of class "robustbetareg", a list with the following components:

coefficients      a list with the "mean" and "precision" coefficients.
vcov              covariance matrix.
converged         logical indicating successful convergence of the iterative process.
fitted.values     a vector with the fitted values of the mean submodel.
start             a vector with the starting values used in the iterative process.
weights           the weights of each observation in the estimation process.
Tuning            value of the tuning constant (automatically chosen or fixed) used in the estimation process.
residuals         a vector of standardized weighted residuals 2 (see Espinheira et al. (2008)).
n                 number of observations.
link              link function used in the mean submodel.
link.phi          link function used in the precision submodel.
Optimal.Tuning    logical indicating whether the data-driven algorithm was used.
pseudo.r.squared  pseudo R-squared value.
control           the control arguments passed to the data-driven algorithm and optim call.
std.error         the standard errors.
method            type of estimator used.
call              the original function call.
formula           the formula used.
model             the full model frame.
terms             a list with elements "mean", "precision" and "full" containing the term objects for the respective models.
y                 the response variable.
data              the dataset used.
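
A short sketch of how the returned components and the documented methods might be used is given below. It assumes the package and its Firm dataset are installed; the choice of type = "LSMLE" and alpha = 0.06 is arbitrary and serves only to avoid running the data-driven selection. The variable pkg_available is a helper introduced for this example.

```r
pkg_available <- requireNamespace("robustbetareg", quietly = TRUE)

if (pkg_available) {
  library(robustbetareg)
  data("Firm", package = "robustbetareg")

  fit <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST,
                       data = Firm, type = "LSMLE", alpha = 0.06)

  # Components documented in the Value section.
  print(fit$coefficients$mean)
  print(fit$coefficients$precision)
  print(fit$Tuning)
  print(fit$converged)

  # Indices of the five observations with the smallest weights.
  print(head(order(fit$weights), 5))

  # Extractor methods for objects of class "robustbetareg".
  print(coef(fit))
  print(head(residuals(fit)))
}
```

Observations with small weights are the ones that contribute least to the fit, so they are natural candidates to inspect as potential outliers.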

Author(s)

Yuri S. Maluf (yurimaluf@gmail.com), Francisco F. Queiroz (ffelipeq@outlook.com) and Silvia L. P. Ferrari.

References

Maluf, Y.S., Ferrari, S.L.P., and Queiroz, F.F. (2022). Robust beta regression through the logit transformation. arXiv:2209.11315.

Ribeiro, T.K.A. and Ferrari, S.L.P. (2022). Robust estimation in beta regression via maximum Lq-likelihood. Statistical Papers. DOI: 10.1007/s00362-022-01320-0.

Ghosh, A. (2019). Robust inference under the beta regression model with application to health care studies. Statistical Methods in Medical Research, 28:871–888.

Espinheira, P.L., Ferrari, S.L.P., and Cribari-Neto, F. (2008). On beta regression residuals. Journal of Applied Statistics, 35:407–419.

See Also

robustbetareg.control, summary.robustbetareg, residuals.robustbetareg

Examples

#### Risk Manager Cost data
data("Firm")

# MLE fit (fixed alpha equal to zero)
fit_MLE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST,
                         data = Firm, type = "LMDPDE", alpha = 0)
summary(fit_MLE)

# MDPDE with alpha = 0.04
fit_MDPDE <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST,
                           data = Firm, type = "MDPDE",
                           alpha = 0.04)
summary(fit_MDPDE)

# Choosing alpha via data-driven algorithm
fit_MDPDE2 <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST,
                            data = Firm, type = "MDPDE")
summary(fit_MDPDE2)

# Similar result for the LMDPDE fit:
fit_LMDPDE2 <- robustbetareg(FIRMCOST ~ SIZELOG + INDCOST,
                             data = Firm, type = "LMDPDE")
summary(fit_LMDPDE2)

# Diagnostic plots (see plot.robustbetareg)
plot(fit_LMDPDE2)

#### HIC data
data("HIC")

# MLE (fixed alpha equal to zero)
fit_MLE <- robustbetareg(HIC ~ URB + GDP |
                         GDP, data = HIC, type = "LMDPDE",
                         alpha = 0)
summary(fit_MLE)

# SMLE and MDPDE with alpha selected via data-driven algorithm
fit_SMLE <- robustbetareg(HIC ~ URB + GDP |
                          GDP, data = HIC, type = "SMLE")
summary(fit_SMLE)
fit_MDPDE <- robustbetareg(HIC ~ URB + GDP |
                           GDP, data = HIC, type = "MDPDE")
summary(fit_MDPDE)
# SMLE and MDPDE return MLE because of the lack of stability

# LSMLE and LMDPDE with alpha selected via data-driven algorithm
fit_LSMLE <- robustbetareg(HIC ~ URB + GDP |
                           GDP, data = HIC, type = "LSMLE")
summary(fit_LSMLE)
fit_LMDPDE <- robustbetareg(HIC ~ URB + GDP |
                            GDP, data = HIC, type = "LMDPDE")
summary(fit_LMDPDE)
# LSMLE and LMDPDE return robust estimates with alpha = 0.06


# Plotting the weights against the residuals - LSMLE fit.
plot(fit_LSMLE$residuals, fit_LSMLE$weights, pch = "+", xlab = "Residuals",
    ylab = "Weights")

# Excluding outlier observation.
fit_LSMLEwo1 <- robustbetareg(HIC ~ URB + GDP |
                              GDP, data = HIC[-1,], type = "LSMLE")
summary(fit_LSMLEwo1)

# Normal probability plot with simulated envelope
plotenvelope(fit_LSMLE)


[Package robustbetareg version 0.3.0 Index]