PseudoR2 {DescTools} | R Documentation |
Pseudo R2 Statistics
Description
Although there is no commonly accepted way to assess the fit of a logistic regression, several approaches exist. The goodness of fit of a logistic regression model can be expressed by variants of pseudo R-squared statistics, most of which are based on the deviance of the model.
Usage
PseudoR2(x, which = NULL)
Arguments
x |
the |
which |
character, one out of |
Details
Cox and Snell's R^2 is based on the log likelihood for the model compared to the log likelihood for a baseline model. However, with categorical outcomes, it has a theoretical maximum value of less than 1, even for a "perfect" model.
Nagelkerke's R^2 (also sometimes called Cragg-Uhler) is an adjusted version of Cox and Snell's R^2 that rescales the statistic to cover the full range from 0 to 1.
McFadden's R^2 is another version, based on the log-likelihood kernels for the intercept-only model and the full estimated model.
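The three likelihood-based measures described above can be reproduced by hand. The following is a minimal sketch in base R that uses the commonly cited textbook formulas; it is not taken from the PseudoR2 source. The data set (mtcars), the formula am ~ wt and all object names are assumptions chosen for illustration.

```r
# Sketch: likelihood-based pseudo R^2 values computed with base R only.
# mtcars and the formula am ~ wt are illustrative choices, not part of PseudoR2.
fit  <- glm(am ~ wt, data = mtcars, family = binomial)   # full model
fit0 <- update(fit, . ~ 1)                               # intercept-only model

n   <- nobs(fit)
ll  <- as.numeric(logLik(fit))    # log-likelihood of the full model
ll0 <- as.numeric(logLik(fit0))   # log-likelihood of the intercept-only model

# McFadden: one minus the ratio of the two log-likelihoods
mcfadden <- 1 - ll / ll0

# Cox and Snell: 1 - (L0 / L)^(2 / n), written on the log scale
cox_snell <- 1 - exp(2 * (ll0 - ll) / n)

# Upper bound of Cox and Snell's R^2, reached when the full log-likelihood is 0
cs_max <- 1 - exp(2 * ll0 / n)

# Nagelkerke: Cox and Snell divided by its upper bound
nagelkerke <- cox_snell / cs_max

round(c(McFadden = mcfadden, CoxSnell = cox_snell, Nagelkerke = nagelkerke), 3)
```

Because cs_max is below 1 for binary outcomes, the Nagelkerke value is never smaller than the Cox and Snell value.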
Veall and Zimmermann concluded that, from a set of six widely used measures, the measure suggested by McKelvey and Zavoina had the closest correspondence to the ordinary least squares R2. The Aldrich-Nelson pseudo-R2 with the Veall-Zimmermann correction is the best approximation of the McKelvey-Zavoina pseudo-R2. The Efron, Aldrich-Nelson, McFadden and Nagelkerke approaches severely underestimate the "true R2".
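To compare the measures named above on the same model, they can be computed directly. This is a hedged sketch in base R using commonly cited definitions, which may differ in detail from what PseudoR2 computes internally. The data set, the formula and all object names are illustrative assumptions, and the McKelvey-Zavoina line assumes a logit link (latent error variance pi^2 / 3).

```r
# Sketch with base R only; data set and formula are illustrative assumptions.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

y   <- fit$y                                   # observed 0/1 outcome
p   <- fitted(fit)                             # fitted probabilities
eta <- predict(fit, type = "link")             # linear predictor
n   <- nobs(fit)
ll0 <- as.numeric(logLik(update(fit, . ~ 1)))  # log-likelihood of the null model
g2  <- fit$null.deviance - fit$deviance        # likelihood ratio statistic

# Efron: one minus residual sum of squares over total sum of squares
efron <- 1 - sum((y - p)^2) / sum((y - mean(y))^2)

# Tjur: difference in mean fitted probability between the two outcome groups
tjur <- mean(p[y == 1]) - mean(p[y == 0])

# Aldrich-Nelson
aldrich_nelson <- g2 / (g2 + n)

# Veall-Zimmermann: Aldrich-Nelson rescaled by its upper bound
veall_zimmermann <- aldrich_nelson * (n - 2 * ll0) / (-2 * ll0)

# McKelvey-Zavoina for a logit link: explained latent variation over total
ss_eta <- sum((eta - mean(eta))^2)
mckelvey_zavoina <- ss_eta / (ss_eta + n * pi^2 / 3)

round(c(Efron = efron, Tjur = tjur, AldrichNelson = aldrich_nelson,
        VeallZimmermann = veall_zimmermann,
        McKelveyZavoina = mckelvey_zavoina), 3)
```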
Value
the value of the specific statistic. AIC, LogLik, LogLikNull and G2 will only be reported with option "all".
McFadden | McFadden pseudo-R^2
McFaddenAdj | McFadden adjusted pseudo-R^2
CoxSnell | Cox and Snell pseudo-R^2
Nagelkerke | Nagelkerke pseudo-R^2
AldrichNelson | Aldrich-Nelson pseudo-R^2
VeallZimmermann | Veall-Zimmermann pseudo-R^2
McKelveyZavoina | McKelvey and Zavoina pseudo-R^2
Efron | Efron pseudo-R^2
Tjur | Tjur's pseudo-R^2
AIC | Akaike's information criterion
LogLik | log-likelihood for the fitted model (by maximum likelihood)
LogLikNull | log-likelihood for the null model. The null model will include the offset, and an intercept if there is one in the model.
G2 | difference between the null deviance and the model deviance
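The four auxiliary quantities (AIC, LogLik, LogLikNull, G2) correspond to values that base R already exposes for a glm fit. The sketch below shows one way to obtain them without PseudoR2. The data set, the formula and the object names are assumptions for illustration, and the model has no offset, so the null model is simply the intercept-only fit.

```r
# Sketch: auxiliary quantities from a binomial glm using base R only.
# mtcars and am ~ wt are illustrative; the model contains no offset.
fit  <- glm(am ~ wt, data = mtcars, family = binomial)
fit0 <- update(fit, . ~ 1)                      # null (intercept-only) model

aic         <- AIC(fit)                         # Akaike's information criterion
loglik      <- as.numeric(logLik(fit))          # log-likelihood, fitted model
loglik_null <- as.numeric(logLik(fit0))         # log-likelihood, null model
g2          <- fit$null.deviance - fit$deviance # null deviance - model deviance

# For ungrouped 0/1 responses, G2 should match twice the log-likelihood gain
all.equal(g2, 2 * (loglik - loglik_null))
```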
Author(s)
Andri Signorell <andri@signorell.net> with contributions of Ben Mainwaring <benjamin.mainwaring@yougov.com> and Daniel Wollschlaeger
References
Aldrich, J. H. and Nelson, F. D. (1984): Linear Probability, Logit, and Probit Models, Sage University Press, Beverly Hills.
Cox D R & Snell E J (1989) The Analysis of Binary Data 2nd ed. London: Chapman and Hall.
Efron, B. (1978). Regression and ANOVA with zero-one data: Measures of residual variation. Journal of the American Statistical Association, 73(361), 113–121.
Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic regression (2nd ed.). Hoboken, NJ: Wiley.
McFadden D (1979). Quantitative methods for analysing travel behavior of individuals: Some recent developments. In D. A. Hensher & P. R. Stopher (Eds.), Behavioural travel modelling (pp. 279-318). London: Croom Helm.
McKelvey, R. D., & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. The Journal of Mathematical Sociology, 4(1), 103–120
Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78(3), 691–692.
Tjur, T. (2009) Coefficients of determination in logistic regression models - a new proposal: The coefficient of discrimination. The American Statistician, 63(4): 366-372
Veall, M.R., & Zimmermann, K.F. (1992) Evaluating Pseudo-R2's for binary probit models. Quality & Quantity, 28, pp. 151-164
See Also
Examples
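library(DescTools)
# fit a logistic regression on the Titanic table expanded to one row per person
r.glm <- glm(Survived ~ ., data=Untable(Titanic), family=binomial)
PseudoR2(r.glm)                           # default: which = NULL
PseudoR2(r.glm, c("McFadden", "Nagel"))   # selected statistics, "Nagel" abbreviates "Nagelkerke"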