AICcmodavg-package {AICcmodavg}R Documentation

Model Selection and Multimodel Inference Based on (Q)AIC(c)

Description

Description: This package includes functions to create model selection tables based on Akaike's information criterion (AIC) and the second-order AIC (AICc), as well as their quasi-likelihood counterparts (QAIC, QAICc). The package also features functions to conduct classic model averaging (multimodel inference) for a given parameter of interest or predicted values, as well as a shrinkage version of model averaging parameter estimates. Other handy functions enable the computation of relative variable importance, evidence ratios, and confidence sets for the best model. The present version supports Cox proportional hazards models and conditional logistic regression (coxph and coxme classes), linear models (lm class), generalized linear models (glm, glm.nb, vglm, hurdle, and zeroinfl classes), linear models fit by generalized least squares (gls class), linear mixed models (lme class), generalized linear mixed models (mer, merMod, and glmmTMB classes), multinomial and ordinal logistic regressions (multinom, polr, clm, and clmm classes), robust regression models (rlm class), beta regression models (betareg class), parametric survival models (survreg class), nonlinear models (nls and gnls classes), nonlinear mixed models (nlme and nlmerMod classes), univariate models (fitdist and fitdistr classes), and certain types of latent variable models (lavaan class). The package also supports various models of unmarkedFit and maxLikeFit classes estimating demographic parameters after accounting for imperfect detection probabilities. Some functions also allow the creation of model selection tables for Bayesian models of the bugs and rjags classes. Objects following model selection and multimodel inference can be formatted to LaTeX using xtable methods included in the package.

Details

Package: AICcmodavg
Type: Package
Version: 2.3-1
Date: 2020-08-21
License: GPL (>=2 )
LazyLoad: yes

Many functions of the package require a list of models as the input to conduct model selection and multimodel inference. Thus, you should start by organizing the output of the models in a list (See 'Examples' below).

This package contains several useful functions for model selection and multimodel inference for several model classes:

For models not yet supported by the functions above, the following can be useful for model selection and multimodel inference conducted from input values supplied by the user:

A number of functions for model diagnostics are available:

Other utility functions include:

Author(s)

Marc J. Mazerolle <marc.mazerolle@uqat.ca>.

References

Anderson, D. R. (2008) Model-based inference in the life sciences: a primer on evidence. Springer: New York.

Burnham, K. P., and Anderson, D. R. (2002) Model selection and multimodel inference: a practical information-theoretic approach. Second edition. Springer: New York.

Burnham, K. P., Anderson, D. R. (2004) Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods and Research 33, 261–304.

Mazerolle, M. J. (2006) Improving data analysis in herpetology: using Akaike's Information Criterion (AIC) to assess the strength of biological hypotheses. Amphibia-Reptilia 27, 169–180.

Examples

##Example 1:  Poisson GLM with offset
##anuran larvae example from Mazerolle (2006) 
data(min.trap)
##assign "UPLAND" as the reference level as in Mazerolle (2006)          
min.trap$Type <- relevel(min.trap$Type, ref = "UPLAND") 

##set up candidate models in a list
Cand.mod <- list()
##global model          
Cand.mod[[1]] <- glm(Num_anura ~ Type + log.Perimeter + Num_ranatra,
                     family = poisson, offset = log(Effort),
                     data = min.trap) 
Cand.mod[[2]] <- glm(Num_anura ~ Type + log.Perimeter, family = poisson,
                     offset = log(Effort), data = min.trap) 
Cand.mod[[3]] <- glm(Num_anura ~ Type + Num_ranatra, family = poisson,
                     offset = log(Effort), data = min.trap) 
Cand.mod[[4]] <- glm(Num_anura ~ Type, family = poisson,
                     offset = log(Effort), data = min.trap) 
Cand.mod[[5]] <- glm(Num_anura ~ log.Perimeter + Num_ranatra,
                     family = poisson, offset = log(Effort),
                     data = min.trap) 
Cand.mod[[6]] <- glm(Num_anura ~ log.Perimeter, family = poisson,
                     offset = log(Effort), data = min.trap) 
Cand.mod[[7]] <- glm(Num_anura ~ Num_ranatra, family = poisson,
                     offset = log(Effort), data = min.trap) 
Cand.mod[[8]] <- glm(Num_anura ~ 1, family = poisson,
                     offset = log(Effort), data = min.trap) 
          
##check c-hat for global model
c_hat(Cand.mod[[1]], method = "pearson") #uses Pearson's chi-square/df
##note the very low overdispersion: in this case, the analysis could be
##conducted without correcting for c-hat as its value is reasonably close
##to 1  

##output of model corrected for overdispersion
summaryOD(Cand.mod[[1]], c.hat = 1.04)

##assign names to each model
Modnames <- c("type + logperim + invertpred", "type + logperim",
              "type + invertpred", "type", "logperim + invertpred",
              "logperim", "invertpred", "intercept only") 

##model selection table based on AICc
aictab(cand.set = Cand.mod, modnames = Modnames)

##compute evidence ratio
evidence(aictab(cand.set = Cand.mod, modnames = Modnames))

##compute confidence set based on 'raw' method
confset(cand.set = Cand.mod, modnames = Modnames, second.ord = TRUE,
        method = "raw")  

##compute importance value for "TypeBOG" - same number of models
##with vs without variable
importance(cand.set = Cand.mod, modnames = Modnames, parm = "TypeBOG") 

##compute model-averaged estimate of "TypeBOG" using the natural average
modavg(cand.set = Cand.mod, modnames = Modnames, parm = "TypeBOG")

##compute model-averaged estimate of "TypeBOG" using shrinkage estimator
##same number of models with vs without variable
modavgShrink(cand.set = Cand.mod, modnames = Modnames,
             parm = "TypeBOG")

##compute model-averaged predictions for two types of ponds
##create a data set for predictions
dat.pred <- data.frame(Type = factor(c("BOG", "UPLAND")),
                       log.Perimeter = mean(min.trap$log.Perimeter),
                       Num_ranatra = mean(min.trap$Num_ranatra),
                       Effort = mean(min.trap$Effort))

##model-averaged predictions across entire model set
modavgPred(cand.set = Cand.mod, modnames = Modnames,
           newdata = dat.pred, type = "response")

##compute model-averaged effect size between two groups
##'newdata' data frame must be limited to two rows
modavgEffect(cand.set = Cand.mod, modnames = Modnames,
             newdata = dat.pred, type = "link")


## Not run: 
##Example 2:  single-season occupancy model example modified from ?occu
require(unmarked)
##single season
data(frogs)
pferUMF <- unmarkedFrameOccu(pfer.bin)
## add some fake covariates for illustration
siteCovs(pferUMF) <- data.frame(sitevar1 = rnorm(numSites(pferUMF)),
                                sitevar2 = rnorm(numSites(pferUMF))) 
     
## observation covariates are in site-major, observation-minor order
obsCovs(pferUMF) <- data.frame(obsvar1 = rnorm(numSites(pferUMF) *
                                 obsNum(pferUMF))) 

##check detection history data from data object
detHist(pferUMF)

##set up candidate model set
fm1 <- occu(~ obsvar1 ~ sitevar1, pferUMF)
##check detection history data from model object
detHist(fm1)

fm2 <- occu(~ 1 ~ sitevar1, pferUMF)
fm3 <- occu(~ obsvar1 ~ sitevar2, pferUMF)
fm4 <- occu(~ 1 ~ sitevar2, pferUMF)
Cand.models <- list(fm1, fm2, fm3, fm4)

##assign names to elements in list
##alternative to using 'modnames' argument
names(Cand.models) <- c("fm1", "fm2", "fm3", "fm4")

##check GOF of global model and estimate c-hat
mb.gof.test(fm4, nsim = 100) #nsim should be > 1000

##check for high SE's in models
lapply(Cand.models, checkParms, simplify = FALSE)

##compute table
print(aictab(cand.set = Cand.models,
             second.ord = TRUE), digits = 4)

##export as LaTeX table
if(require(xtable)) {
xtable(aictab(cand.set = Cand.models,
              second.ord = TRUE))
}

##compute evidence ratio
evidence(aictab(cand.set = Cand.models))
##evidence ratio between top model vs lowest-ranked model
evidence(aictab(cand.set = Cand.models), model.high = "fm2", model.low = "fm3")

##compute confidence set based on 'raw' method
confset(cand.set = Cand.models, second.ord = TRUE,
        method = "raw")  

##compute importance value for "sitevar1" on occupancy
##same number of models with vs without variable
importance(cand.set = Cand.models, parm = "sitevar1",
           parm.type = "psi") 

##compute model-averaged estimate of "sitevar1" on occupancy
##(natural average)
modavg(cand.set = Cand.models, parm = "sitevar1",
       parm.type = "psi")

##compute model-averaged estimate of "sitevar1"
##(shrinkage estimator)
##same number of models with vs without variable
modavgShrink(cand.set = Cand.models,
             parm = "sitevar1", parm.type = "psi")

##compute model-average predictions
##check explanatory variables appearing in models
extractX(Cand.models, parm.type = "psi")

##create a data set for predictions
dat.pred <- data.frame(sitevar1 = seq(from = min(siteCovs(pferUMF)$sitevar1),
                         to = max(siteCovs(pferUMF)$sitevar1), by = 0.5),
                       sitevar2 = mean(siteCovs(pferUMF)$sitevar2))

##model-averaged predictions of psi across range of values
##of sitevar1 and entire model set
modavgPred(cand.set = Cand.models, newdata = dat.pred,
           parm.type = "psi")
detach(package:unmarked)

## End(Not run)

## Not run: 
##Example 3:  example with user-supplied values of log-likelihoods and
##number of parameters

##vector with model LL's
LL <- c(-38.8876, -35.1783, -64.8970)
     
##vector with number of parameters
Ks <- c(7, 9, 4)
     
##create a vector of names to trace back models in set
Modnames <- c("Cm1", "Cm2", "Cm3")

##generate AICc table
aictabCustom(logL = LL, K = Ks, modnames = Modnames, nobs = 121,
             sort = TRUE) 
##generate AIC table
aictabCustom(logL = LL, K = Ks, modnames = Modnames,
             second.ord = FALSE, nobs = 121, sort = TRUE)

##model averaging parameter estimate
##vector of beta estimates for a parameter of interest
model.ests <- c(0.0478, 0.0480, 0.0478)
     
##vector of SE's of beta estimates for a parameter of interest
model.se.ests <- c(0.0028, 0.0028, 0.0034)
     
##compute model-averaged estimate and unconditional SE based on AICc
modavgCustom(logL = LL, K = Ks, modnames = Modnames, 
             estimate = model.ests, se = model.se.ests, nobs = 121)
##compute model-averaged estimate and unconditional SE based on BIC
modavgCustom(logL = LL, K = Ks, modnames = Modnames, 
             estimate = model.ests, se = model.se.ests, nobs = 121,
             useBIC = TRUE)

## End(Not run)

## Not run: 
##Example 4:  example with user-supplied values of information criterion
##model selection based on WAIC

##WAIC values
waic <- c(105.74, 107.36, 108.24, 100.57)
##number of effective parameters
effK <- c(7.45, 5.61, 6.14, 6.05)

##create a vector of names to trace back models in set
Modnames <- c("global model", "interactive model",
              "additive model", "invertpred model")

##generate WAIC model selection table
ictab(ic = waic, K = effK, modnames = Modnames,
      sort = TRUE, ic.name = "WAIC")

##compute model-averaged estimate
##vector of predictions
Preds <- c(0.106, 0.137, 0.067, 0.050)
##vector of SE's for prediction
Ses <- c(0.128, 0.159, 0.054, 0.039)

##compute model-averaged estimate and unconditional SE based on WAIC
modavgIC(ic = waic, K = effK, modnames = Modnames, 
         estimate = Preds, se = Ses,
         ic.name = "WAIC")

##export as LaTeX table
if(require(xtable)) {
##model-averaged estimate and confidence interval
xtable(modavgIC(ic = waic, K = effK, modnames = Modnames, 
       estimate = Preds, se = Ses,
       ic.name = "WAIC"))
##model selection table with estimate and SE's from each model
xtable(modavgIC(ic = waic, K = effK, modnames = Modnames, 
       estimate = Preds, se = Ses,
       ic.name = "WAIC"), print.table = TRUE)
}

## End(Not run)

[Package AICcmodavg version 2.3-1 Index]