R: Dominance analysis for OLS (univariate and multivariate), GLM...

dominanceAnalysis {dominanceanalysis}

R Documentation

Dominance analysis for OLS (univariate and multivariate), GLM and LMM models

Description

Dominance analysis for OLS (univariate and multivariate), GLM and LMM models

Usage

dominanceAnalysis(
  x,
  constants = c(),
  terms = NULL,
  fit.functions = "default",
  newdata = NULL,
  null.model = NULL,
  ...
)

Arguments

`x`	fitted model (lm, glm, betareg), lmWithCov or mlmWithCov object
`constants`	vector of predictors to remain unchanged between models
`terms`	vector of terms to be analyzed. By default, obtained from the model
`fit.functions`	Name of the method used to provide fit indices
`newdata`	optional data.frame, that update data used on original model
`null.model`	for mixed models, null model against to test the submodels
`...`	Other arguments provided to lm or lmer (not implemented yet)

Value

`predictors`	Vector of predictors.
`constants`	Vector of constant variables.
`terms`	Vector of terms to be analyzed.
`fit.functions`	Vector of fit indices names.
`fits`	List with raw fits indices. See `daRawResults`.
`contribution.by.level`	List of mean contribution of each predictor by level for each fit index. Each element is a data.frame, with levels as rows and predictors as columns, for each fit index.
`contribution.average`	List with mean contribution of each predictor for all levels. These values are obtained for every fit index considered in the analysis. Each element is a vector of mean contributions for a given fit index.
`complete`	Matrix for complete dominance.
`conditional`	Matrix for conditional dominance.
`general`	Matrix for general dominance.

Definition of Dominance Analysis

Budescu (1993) developed a clear and intuitive definition of importance in regression models, that states that a predictor's importance reflects its contribution in the prediction of the criterion and that one predictor is 'more important than another' if it contributes more to the prediction of the criterion than does its competitor at a given level of analysis.

Types of dominance

The original paper (Bodescu, 1993) defines that variable X_1 dominates X_2 when X_1 is chosen over X_2 in all possible subset of models where only one of these two predictors is to be entered. Later, Azen & Bodescu (2003), name the previously definition as 'complete dominance' and two other types of dominance: conditional and general dominance. Conditional dominance is calculated as the average of the additional contributions to all subset of models of a given model size. General dominance is calculated as the mean of average contribution on each level.

Fit indices availables

To obtain the fit-indices for each model, a function called da.<model>.fit is executed. For example, for a lm model, function da.lm.fit provides R^2 values. Currently, seven models are implemented:

lm: Provides R^2 or coefficient of determination. See da.lm.fit
glm: Provides four fit indices recommended by Azen & Traxel (2009): Cox and Snell(1989), McFadden (1974), Nagelkerke (1991), and Estrella (1998). See da.glm.fit
lmerMod: Provides four fit indices recommended by Lou & Azen (2012). See da.lmerMod.fit
lmWithCov: Provides R^2 for a correlation/covariance matrix. See lmWithCov to create the model and da.lmWithCov.fit for the fit index function.
mlmWithCov: Provides both R^2_{XY} and P^2_{XY} for multivariate regression models using a correlation/covariance matrix. See mlmWithCov to create the model and da.mlmWithCov.fit for the fit index function
dynlm: Provides R^2 for dynamic linear models. There is no literature reference about using dominance analysis on dynamic linear models, so you're warned!. See da.dynlm.fit

betareg: Provides pseudo-R^2, Cox and Snell(1989), McFadden (1974), and Estrella (1998). You could set the link function using link.betareg if automatic detection of link function doesn't work.

See da.betareg.fit

References

Azen, R., & Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8(2), 129-148. doi:10.1037/1082-989X.8.2.129
Azen, R., & Budescu, D. V. (2006). Comparing Predictors in Multivariate Regression Models: An Extension of Dominance Analysis. Journal of Educational and Behavioral Statistics, 31(2), 157-180. doi:10.3102/10769986031002157
Azen, R., & Traxel, N. (2009). Using Dominance Analysis to Determine Predictor Importance in Logistic Regression. Journal of Educational and Behavioral Statistics, 34(3), 319-347. doi:10.3102/1076998609332754
Budescu, D. V. (1993). Dominance analysis: A new approach to the problem of relative importance of predictors in multiple regression. Psychological Bulletin, 114(3), 542-551. doi:10.1037/0033-2909.114.3.542
Luo, W., & Azen, R. (2012). Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis. Journal of Educational and Behavioral Statistics, 38(1), 3-31. doi:10.3102/1076998612458319

Examples

data(longley)
lm.1<-lm(Employed~.,longley)
da<-dominanceAnalysis(lm.1)
print(da)
summary(da)
plot(da,which.graph='complete')
plot(da,which.graph='conditional')
plot(da,which.graph='general')

# Maintaining year as a constant on all submodels
da.no.year<-dominanceAnalysis(lm.1,constants='Year')
print(da.no.year)
summary(da.no.year)
plot(da.no.year,which.graph='complete')

# Parameter terms could be used to group variables
da.terms=c(GNP.rel='GNP.deflator+GNP',
           pop.rel='Unemployed+Armed.Forces+Population+Unemployed',
           year='Year')
da.grouped<-dominanceAnalysis(lm.1,terms=da.terms)
print(da.grouped)
summary(da.grouped)
plot(da.grouped, which.graph='complete')

[Package dominanceanalysis version 2.1.0 Index]