plotAovDesign {EnvStats}  R Documentation 
Create plots involving sample size, power, scaled difference, and significance level for a one-way fixed-effects analysis of variance.
plotAovDesign(x.var = "n", y.var = "power", range.x.var = NULL,
n.vec = c(25, 25), mu.vec = c(0, 1), sigma = 1, alpha = 0.05, power = 0.95,
round.up = FALSE, n.max = 5000, tol = 1e-07, maxiter = 1000, plot.it = TRUE,
add = FALSE, n.points = 50, plot.col = 1, plot.lwd = 3 * par("cex"),
plot.lty = 1, digits = .Options$digits, main = NULL, xlab = NULL, ylab = NULL,
type = "l", ...)
x.var 
character string indicating what variable to use for the x-axis. Possible values are

y.var 
character string indicating what variable to use for the y-axis. Possible values are

range.x.var 
numeric vector of length 2 indicating the range of the x-variable to use for the plot.
The default value depends on the value of 
n.vec 
numeric vector indicating the sample size for each group. The default value is n.vec=c(25, 25).
mu.vec 
numeric vector indicating the population mean for each group. The default value is mu.vec=c(0, 1).
sigma 
numeric scalar indicating the population standard deviation for all groups. The default
value is sigma=1.
alpha 
numeric scalar between 0 and 1 indicating the Type I error level associated with the
hypothesis test. The default value is alpha=0.05.
power 
numeric scalar between 0 and 1 indicating the power associated with the hypothesis
test. The default value is power=0.95.
round.up 
logical scalar indicating whether to round up the values of the computed sample
size(s) to the next largest integer. The default value is FALSE. This argument
is ignored unless 
n.max 
for the case when 
tol 
for the case when 
maxiter 
for the case when 
plot.it 
a logical scalar indicating whether to create a plot or add to the existing plot
(see 
add 
a logical scalar indicating whether to add the design plot to the existing plot
( 
n.points 
a numeric scalar specifying how many (x,y) pairs to use to produce the plot. There are

plot.col 
a numeric scalar or character string determining the color of the plotted line or points. The default value
is plot.col=1.
plot.lwd 
a numeric scalar determining the width of the plotted line. The default value is plot.lwd=3*par("cex").
plot.lty 
a numeric scalar determining the line type of the plotted line. The default value is plot.lty=1.
digits 
a scalar indicating how many significant digits to print out on the plot. The default
value is the current setting of options("digits").
main , xlab , ylab , type , ... 
additional graphical parameters (see 
See the help files for aovPower and aovN for information on how to compute the power and sample size for a one-way fixed-effects analysis of variance.
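For orientation, the power of the one-way fixed-effects F-test can be sketched in base R with the noncentral F distribution. This is a minimal sketch of the standard textbook formulation, not the code used by aovPower. The helper name aov_power_sketch is hypothetical, and the argument names simply mirror those of plotAovDesign.

```r
# Hypothetical helper (not part of EnvStats): power of the one-way
# fixed-effects ANOVA F-test via the noncentral F distribution.
aov_power_sketch <- function(n.vec, mu.vec, sigma = 1, alpha = 0.05) {
  stopifnot(length(n.vec) == length(mu.vec), sigma > 0)
  k <- length(mu.vec)                 # number of groups
  N <- sum(n.vec)                     # total sample size
  mu.bar <- sum(n.vec * mu.vec) / N   # sample-size-weighted grand mean
  # Noncentrality parameter of the F statistic under the alternative
  ncp <- sum(n.vec * (mu.vec - mu.bar)^2) / sigma^2
  # Critical value of the central F distribution at level alpha
  f.crit <- qf(1 - alpha, df1 = k - 1, df2 = N - k)
  # Probability that the noncentral F statistic exceeds the critical value
  pf(f.crit, df1 = k - 1, df2 = N - k, ncp = ncp, lower.tail = FALSE)
}

# Same settings as the plotAovDesign defaults: two groups of 25,
# means 0 and 1, sigma = 1, alpha = 0.05
power.default <- aov_power_sketch(n.vec = c(25, 25), mu.vec = c(0, 1))
power.default
```

With two groups, this quantity should agree with the two-sided two-sample t-test power returned by power.t.test(n = 25, delta = 1, sd = 1, strict = TRUE), which gives a quick sanity check on the sketch.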
plotAovDesign
invisibly returns a list with components:
x.var 
x-coordinates of the points that have been or would have been plotted 
y.var 
y-coordinates of the points that have been or would have been plotted 
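Because the result is returned invisibly, it has to be assigned in order to be inspected. The following is a small sketch, assuming the EnvStats package is installed and that plot.it = FALSE suppresses the plot while still returning the coordinates, as the phrase "would have been plotted" suggests. The object name design is arbitrary.

```r
# Assumes the EnvStats package is installed; the block does nothing otherwise.
if (requireNamespace("EnvStats", quietly = TRUE)) {
  # Compute the default power-vs-sample-size curve without drawing it
  design <- EnvStats::plotAovDesign(plot.it = FALSE)
  # Show the structure of the returned list (components as documented above)
  str(design)
}
```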
The normal and lognormal distributions are probably the two most frequently used distributions to model environmental data. Sometimes it is necessary to compare several means to determine whether any are significantly different from each other (e.g., USEPA, 2009, p.6-38). In this case, assuming normally distributed data, you perform a one-way parametric analysis of variance.
In the course of designing a sampling program, an environmental scientist may wish to determine the relationship between sample size, Type I error level, power, and differences in means if one of the objectives of the sampling program is to determine whether a particular mean differs from a group of means. The functions aovPower, aovN, and plotAovDesign can be used to investigate these relationships for the case of normally distributed observations.
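As a cross-check on such design calculations, base R's power.anova.test (in the stats package) solves the same kind of problem for balanced designs. The sketch below uses illustrative values that are assumptions, not taken from any data set: three groups with means 0, 0.5, and 1, a common standard deviation of 1, a 5% significance level, and a target power of 95%. It is not a substitute for aovN, which is documented separately.

```r
mu.vec <- c(0, 0.5, 1)   # assumed group means (illustrative only)
sigma  <- 1              # assumed common standard deviation

# power.anova.test() assumes equal group sizes. It takes the variance of the
# group means as between.var and sigma^2 as within.var, and solves for the
# one argument left unspecified (here n, the per-group sample size).
fit <- power.anova.test(groups = length(mu.vec),
                        between.var = var(mu.vec),
                        within.var = sigma^2,
                        sig.level = 0.05,
                        power = 0.95)

# Round up to a whole number of observations per group
n.per.group <- ceiling(fit$n)
n.per.group

# Power actually achieved at the rounded-up sample size
achieved <- power.anova.test(groups = length(mu.vec),
                             n = n.per.group,
                             between.var = var(mu.vec),
                             within.var = sigma^2,
                             sig.level = 0.05)$power
achieved
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the target, which is the same concern the round.up argument addresses.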
Steven P. Millard (EnvStats@ProbStatInfo.com)
Berthouex, P.M., and L.C. Brown. (1994). Statistics for Environmental Engineers. Lewis Publishers, Boca Raton, FL, Chapter 17.
Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources Research. Elsevier, New York, NY, Chapter 7.
Johnson, N. L., S. Kotz, and N. Balakrishnan. (1995). Continuous Univariate Distributions, Volume 2. Second Edition. John Wiley and Sons, New York, Chapters 27, 29, 30.
Scheffe, H. (1959). The Analysis of Variance. John Wiley and Sons, New York, 477pp.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery, Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, Chapter 10.
# Look at the relationship between power and sample size
# for a one-way ANOVA, assuming k=2 groups, group means of
# 0 and 1, a population standard deviation of 1, and a
# 5% significance level:
dev.new()
plotAovDesign()
#
# Plot power vs. sample size for various levels of significance:
dev.new()
plotAovDesign(mu.vec = c(0, 0.5, 1), ylim=c(0, 1), main="")
plotAovDesign(mu.vec = c(0, 0.5, 1), alpha=0.1, add=TRUE, plot.col=2)
plotAovDesign(mu.vec = c(0, 0.5, 1), alpha=0.2, add=TRUE, plot.col=3)
legend(35, 0.6, c("20%", "10%", " 5%"), lty=1, lwd = 3, col=3:1,
  bty = "n")
mtext("Power vs. Sample Size for One-Way ANOVA", line = 3, cex = 1.25)
mtext(expression(paste("with ", mu, "=(0, 0.5, 1), ", sigma,
  "=1, and Various Significance Levels", sep="")),
  line = 1.5, cex = 1.25)
#
# The example on pages 5-11 to 5-14 of USEPA (1989b) shows
# log-transformed concentrations of lead (mg/L) at two
# background wells and four compliance wells, where
# observations were taken once per month over four months
# (the data are stored in EPA.89b.loglead.df).
# Assume the true mean levels at each well are
# 3.9, 3.9, 4.5, 4.5, 4.5, and 5, respectively. Plot the
# power vs. sample size of a one-way ANOVA to test for mean
# differences between wells. Use alpha=0.05, and assume the
# true standard deviation is equal to the one estimated
# from the data in this example.
names(EPA.89b.loglead.df)
#[1] "LogLead" "Month" "Well" "Well.type"
# Perform the ANOVA and get the estimated sd
aov.list <- aov(LogLead ~ Well, data=EPA.89b.loglead.df)
summary(aov.list)
#            Df Sum Sq Mean Sq F value  Pr(>F)
#Well         5 5.7447 1.14895  3.3469 0.02599 *
#Residuals   18 6.1791 0.34328
#---
#Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Now create the plot
dev.new()
plotAovDesign(range.x.var = c(2, 20),
  mu.vec = c(3.9, 3.9, 4.5, 4.5, 4.5, 5),
  sigma = sqrt(0.34),
  ylim = c(0, 1), digits = 2)
# Clean up
#
rm(aov.list)
graphics.off()