pooling {GAD}

R Documentation

Post-hoc pooling

Description

Performs post-hoc pooling by combining terms of a linear model or excluding them completely.

Usage

pooling(object, term = NULL, method = "pool", anova.tab = NULL)

Arguments

`object`	an object of class "`lm`" containing the specified design with random and/or fixed factors.
`term`	the term to be removed from the model.
`method`	method used to remove the term from the model. Use `method = "eliminate"` to exclude the term from the model completely, or `method = "pool"` to pool the selected term with the denominator of its F-ratio. Defaults to `method = "pool"`.
`anova.tab`	an object containing the results returned by the `gad` or `pooling` functions. See the examples below.

Details

Post-hoc pooling is a procedure for removing terms from a model. It may be done for several reasons: (i) lack of evidence against the null hypothesis for that term; (ii) a negative estimate of that term's component of variation (Fletcher and Underwood, 2002); (iii) the hypothesis of interest cannot be tested until some terms are excluded from the model (Anderson et al., 2008). According to the literature, a term's p-value should exceed 0.25 before the term is removed (Underwood, 1997).

There are two different methods to remove a term from the model, determined by the method argument. When method = "eliminate", the chosen term is completely excluded from the model and its sum of squares and degrees of freedom are pooled with the residual sum of squares and degrees of freedom, as if the selected term had never been part of the model. When method = "pool", the chosen term's sum of squares and degrees of freedom are pooled with the sum of squares and degrees of freedom of its denominator. Removing terms with method = "pool" is appropriate in most situations (Anderson et al., 2008).
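
The arithmetic behind the two methods can be sketched in base R, without calling GAD. The sums of squares and degrees of freedom below are made-up numbers for a hypothetical term, the denominator of its F-ratio, and the residual. The sketch only illustrates which quantities are added together; it is not a description of how pooling is implemented internally.

```r
# Hypothetical ANOVA quantities (made-up numbers, for illustration only)
ss <- c(term = 12, denominator = 30, residual = 90)  # sums of squares
df <- c(term = 4,  denominator = 10, residual = 45)  # degrees of freedom

# method = "pool": the term is combined with the denominator of its F-ratio
pooled.ss <- ss[["term"]] + ss[["denominator"]]
pooled.df <- df[["term"]] + df[["denominator"]]
pooled.ms <- pooled.ss / pooled.df

# method = "eliminate": the term is combined with the residual
elim.ss <- ss[["term"]] + ss[["residual"]]
elim.df <- df[["term"]] + df[["residual"]]
elim.ms <- elim.ss / elim.df

# Mean squares that result from each choice
round(c(pool = pooled.ms, eliminate = elim.ms), 3)
```

When the denominator of the term's F-ratio is the residual itself, the two methods combine the same quantities.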

Note that removing a term has consequences for the construction of F-ratios (or quasi F-ratios), the p-values and the estimation of components of variation for the remaining terms, so it should be done with care. When more than one term might be removed from the model (that is, more than one term has a p-value exceeding 0.25), it is recommended to begin with the one having the smallest mean square (Anderson et al., 2008).
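
The selection rule described above can be sketched in base R as follows. The table, its column names and its values are hypothetical and do not come from GAD output; the point is only the order of the two steps (filter on the p-value, then take the smallest mean square).

```r
# Hypothetical ANOVA summary (made-up values; names are illustrative only)
tab <- data.frame(
  term    = c("A", "B", "A:B", "C(A:B)"),
  mean.sq = c(15.2, 3.1, 1.4, 2.0),
  p.value = c(0.01, 0.40, 0.62, 0.30),
  stringsAsFactors = FALSE
)

# Candidates for removal: terms whose p-value exceeds 0.25 (Underwood, 1997)
candidates <- tab[tab$p.value > 0.25, ]

# Begin with the candidate having the smallest mean square
# (Anderson et al., 2008)
first.term <- candidates$term[which.min(candidates$mean.sq)]
first.term
```

Because each removal changes the F-ratios and p-values of the remaining terms, the candidates should be recomputed from the new table before choosing a second term.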

The pooling function removes one term at a time. After removing the term of interest, one should re-assess whether more terms should be removed. If so, the output of pooling should be stored in a new object and the function run again, with this new object supplied to the anova.tab argument. This can be repeated as many times as needed. Working step by step, and storing the result in a new object at each step, gives the user full control over what happens and makes it easy to return to previous results.
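
A minimal sketch of this step-by-step workflow, reusing the snails model from the Examples section, might look as follows. The object names step0, step1 and step2 are arbitrary, and the label "O:S:B" used in the second step is an assumption about how that term is named in the output; it should be replaced by whichever term the re-assessment of step1$anova points to.

```r
library(GAD)
data(snails)
O <- as.random(snails$origin)
S <- as.random(snails$shore)
B <- as.random(snails$boulder)
C <- as.random(snails$cage)
model <- lm(growth ~ O + S + O*S + B%in%S + O*(B%in%S) + C%in%(O*(B%in%S)),
            data = snails)

# Step 0: ANOVA table of the full model
step0 <- gad(model, quasi.f = FALSE)

# Step 1: remove one term and store the result in a new object
step1 <- pooling(model, term = "S:B", method = "pool", anova.tab = step0)
step1$anova  # re-assess before removing anything else

# Step 2: pass the stored result back through anova.tab.
# "O:S:B" is an assumed label, not a recommendation; pick the term from step1$anova.
step2 <- pooling(model, term = "O:S:B", method = "pool", anova.tab = step1)
```

Keeping step0 and step1 as separate objects means an earlier table can be inspected or reused if a later removal turns out to be unwarranted.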

Value

A list of length 4, containing the table of pooled terms ($pool.table), the mean square estimates ($mse), the terms against which each F-ratio is tested ($f.versus) and the result of the analysis of variance ($anova).

Author(s)

Eliandro Gilbert (eliandrogilbert@gmail.com)

References

Anderson, M.J., Gorley, R.N., Clarke, K.R. 2008. PERMANOVA+ for PRIMER: Guide to Software and Statistical Methods. PRIMER-E: Plymouth, UK.

Fletcher, D.J., Underwood, A.J. 2002. How to cope with negative estimates of components of variance in ecological field studies. Journal of Experimental Marine Biology and Ecology 273, 89-95.

Underwood, A.J. 1997. Experiments in Ecology: Their Logical Design and Interpretation Using Analysis of Variance. Cambridge University Press, Cambridge.

Examples

library(GAD)
data(snails)
O <- as.random(snails$origin)   # a random factor
S <- as.random(snails$shore)    # a random factor orthogonal to origin
B <- as.random(snails$boulder)  # a random factor nested in shore
C <- as.random(snails$cage)     # a random factor nested in the combination of boulder and origin

model <- lm(growth ~ O + S + O*S + B%in%S + O*(B%in%S) + C%in%(O*(B%in%S)), data = snails)
estimates(model, quasi.f = FALSE)  # 'no test' for shore
gad(model, quasi.f = FALSE)        # no results for shore term
estimates(model, quasi.f = TRUE)   # suitable test for shore
gad(model, quasi.f = TRUE)         # test result for shore

# An alternative to using linear combinations of mean squares is the pooling function.
model.tab <- gad(model, quasi.f = FALSE) # stores the ANOVA result in a new object
pooling(model, term = "S:B", method = "pool", anova.tab = model.tab)      # pooling terms
pooling(model, term = "S:B", method = "eliminate", anova.tab = model.tab) # or eliminating terms

[Package GAD version 2.0]