gofGroupTest {EnvStats}  R Documentation 
Goodness-of-Fit Test for a Specified Probability Distribution for Groups
Description
Perform a goodness-of-fit test to determine whether data in a set of groups appear to all come from the same probability distribution (with possibly different parameters for each group).
Usage
gofGroupTest(object, ...)
## S3 method for class 'formula'
gofGroupTest(object, data = NULL, subset,
na.action = na.pass, ...)
## Default S3 method:
gofGroupTest(object, group, test = "sw",
distribution = "norm", est.arg.list = NULL, n.classes = NULL,
cut.points = NULL, param.list = NULL,
estimate.params = ifelse(is.null(param.list), TRUE, FALSE),
n.param.est = NULL, correct = NULL, digits = .Options$digits,
exact = NULL, ws.method = "normal scores",
data.name = NULL, group.name = NULL, parent.of.data = NULL,
subset.expression = NULL, ...)
## S3 method for class 'data.frame'
gofGroupTest(object, ...)
## S3 method for class 'matrix'
gofGroupTest(object, ...)
## S3 method for class 'list'
gofGroupTest(object, ...)
Arguments
object 
an object containing data for 2 or more groups to be compared to the
hypothesized distribution specified by the distribution argument.
data 
when 
subset 
when 
na.action 
when 
group 
when 
test 
character string defining which goodness-of-fit test to perform on each group.
Possible values are:

distribution 
a character string denoting the distribution abbreviation. See the help file for
est.arg.list 
a list of arguments to be passed to the function estimating the distribution parameters
for each group of observations.
n.classes 
for the case when 
cut.points 
for the case when 
param.list 
for the case when 
estimate.params 
for the case when 
n.param.est 
for the case when 
correct 
for the case when 
digits 
a scalar indicating how many significant digits to print out for the parameters
associated with the hypothesized distribution. The default value is .Options$digits.
exact 
for the case when 
ws.method 
character string indicating which method to use when performing the
Wilk-Shapiro test for a Uniform [0,1] distribution
on the p-values from the goodness-of-fit tests on each group. The default
value is ws.method = "normal scores".
NOTE: In the case where you are testing whether each group comes from a
Uniform [0,1] distribution (i.e., when you set

data.name 
character string indicating the name of the data used for the goodness-of-fit tests.
The default value is 
group.name 
character string indicating the name of the data used to create the groups.
The default value is 
parent.of.data 
character string indicating the source of the data used for the goodness-of-fit tests.
subset.expression 
character string indicating the expression used to subset the data. 
... 
additional arguments affecting the goodness-of-fit test.
Details
The function gofGroupTest performs a goodness-of-fit test for each group of
data by calling the function gofTest. Using the p-values from these
goodness-of-fit tests, it then calls the function gofTest with the
argument test="ws" to test whether the p-values appear to come from a
Uniform [0,1] distribution.
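The two steps described above can be sketched in base R. This is an illustration under stated assumptions, not the EnvStats implementation: the helper combine_normal_scores is a hypothetical name, it assumes the "normal scores" variant computes G as the sum of the normal quantiles of the p-values divided by the square root of the number of groups and refers G to the lower tail of a standard normal distribution, and shapiro.test stands in for gofTest, so its p-values may differ slightly from those reported by gofGroupTest.

```r
# Step 2 of the procedure: combine per-group p-values using normal scores.
# Assumption: G = sum(qnorm(p_i)) / sqrt(k), with p-value pnorm(G).
combine_normal_scores <- function(p) {
  stopifnot(is.numeric(p), all(p > 0 & p < 1))
  G <- sum(qnorm(p)) / sqrt(length(p))
  list(statistic = G, p.value = pnorm(G))
}

# Per-well p-values copied from the normal-distribution output
# shown in the Examples section of this help file.
p.wells <- c(Well.1 = 0.03510747, Well.2 = 0.02385344,
             Well.3 = 0.01120775, Well.4 = 0.10681461)
combined <- combine_normal_scores(p.wells)
combined$p.value  # compare with the reported group p-value, 0.0001267509

# Step 1, approximated with base R: one Shapiro-Wilk test per group.
# The data are retyped from the listing in the Examples section.
nickel <- data.frame(
  Well = rep(paste0("Well.", 1:4), each = 5),
  Nickel.ppb = c(58.8, 1.0, 262.0, 56.0, 8.7,
                 19.0, 81.5, 331.0, 14.0, 64.4,
                 39.0, 151.0, 27.0, 21.4, 578.0,
                 3.1, 942.0, 85.6, 10.0, 637.0)
)
p.base <- sapply(split(nickel$Nickel.ppb, nickel$Well),
                 function(x) shapiro.test(x)$p.value)
combine_normal_scores(p.base)
```

Small individual p-values give large negative normal scores, so evidence against the hypothesized distribution in several groups accumulates into a small group p-value even when no single group is decisive on its own.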
Value
a list of class "gofGroup" containing the results of the group goodness-of-fit test.
Objects of class "gofGroup" have special printing and plotting methods.
See the help file for gofGroup.object for details.
Note
The Wilk-Shapiro (1968) tests for a Uniform [0, 1] distribution were introduced in the context
of testing whether several independent samples all come from normal distributions, with
possibly different means and variances. The function gofGroupTest extends
this idea to allow you to test whether several independent samples come from the same
distribution (e.g., gamma, extreme value, etc.), with possibly different parameters.
Examples of simultaneously assessing whether several groups come from the same distribution are given in USEPA (2009) and Gibbons et al. (2009).
In practice, almost any goodness-of-fit test will not reject the null hypothesis
if the number of observations is relatively small. Conversely, almost any goodness-of-fit
test will reject the null hypothesis if the number of observations is very large,
since “real” data are never distributed according to any theoretical distribution
(Conover, 1980, p.367). For most cases, however, the distribution of “real” data
is close enough to some theoretical distribution that fairly accurate results may be
provided by assuming that particular theoretical distribution. One way to assess the
goodness of the fit is to use goodness-of-fit tests. Another way is to look at
quantile-quantile (Q-Q) plots (see qqPlot).
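The effect of sample size described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not part of this help file: it uses shapiro.test from base R rather than gofGroupTest, draws data from a t distribution with 5 degrees of freedom as a stand-in for "real" data that are close to, but not exactly, normal, and uses a hypothetical helper named rejection_rate with an arbitrary seed and number of simulations.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Proportion of simulated samples for which the Shapiro-Wilk test rejects
# normality at level alpha, when the data actually come from a
# t distribution with 5 degrees of freedom (mildly heavy-tailed).
rejection_rate <- function(n, n.sim = 200, alpha = 0.05) {
  mean(replicate(n.sim, shapiro.test(rt(n, df = 5))$p.value < alpha))
}

rate.small <- rejection_rate(n = 5)     # few observations per sample
rate.large <- rejection_rate(n = 5000)  # largest n accepted by shapiro.test
c(n.5 = rate.small, n.5000 = rate.large)
```

With only 5 observations the test rarely detects the departure from normality, while with 5000 observations it detects it in most simulated samples, even though the underlying distribution is the same in both cases.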
Author(s)
Steven P. Millard (EnvStats@ProbStatInfo.com)
References
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C. p.17-17.
USEPA. (2010). Errata Sheet - March 2009 Unified Guidance. EPA 530/R-09-007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.
Wilk, M.B., and S.S. Shapiro. (1968). The Joint Assessment of Normality of Several Independent Samples. Technometrics, 10(4), 825-839.
See Also
gofTest, gofGroup.object, print.gofGroup, plot.gofGroup, qqPlot.
Examples
# Example 10-4 of USEPA (2009, page 10-20) gives an example of
# simultaneously testing the assumption of normality for nickel
# concentrations (ppb) in groundwater collected at 4 monitoring
# wells over 5 months. The data for this example are stored in
# EPA.09.Ex.10.1.nickel.df.
EPA.09.Ex.10.1.nickel.df
#   Month   Well Nickel.ppb
#1      1 Well.1       58.8
#2      3 Well.1        1.0
#3      6 Well.1      262.0
#4      8 Well.1       56.0
#5     10 Well.1        8.7
#6      1 Well.2       19.0
#7      3 Well.2       81.5
#8      6 Well.2      331.0
#9      8 Well.2       14.0
#10    10 Well.2       64.4
#11     1 Well.3       39.0
#12     3 Well.3      151.0
#13     6 Well.3       27.0
#14     8 Well.3       21.4
#15    10 Well.3      578.0
#16     1 Well.4        3.1
#17     3 Well.4      942.0
#18     6 Well.4       85.6
#19     8 Well.4       10.0
#20    10 Well.4      637.0
# Test for a normal distribution at each well:
#
gofGroup.list <- gofGroupTest(Nickel.ppb ~ Well,
  data = EPA.09.Ex.10.1.nickel.df)
gofGroup.list
#Results of Group Goodness-of-Fit Test
#
#
#Test Method: Wilk-Shapiro GOF (Normal Scores)
#
#Hypothesized Distribution: Normal
#
#Data: Nickel.ppb
#
#Grouping Variable: Well
#
#Data Source: EPA.09.Ex.10.1.nickel.df
#
#Number of Groups: 4
#
#Sample Sizes: Well.1 = 5
# Well.2 = 5
# Well.3 = 5
# Well.4 = 5
#
#Test Statistic: z (G) = -3.658696
#
#P-values for
#Individual Tests: Well.1 = 0.03510747
# Well.2 = 0.02385344
# Well.3 = 0.01120775
# Well.4 = 0.10681461
#
#P-value for
#Group Test: 0.0001267509
#
#Alternative Hypothesis: At least one group
# does not come from a
# Normal Distribution.
dev.new()
plot(gofGroup.list)
#
# Test for a lognormal distribution at each well:
#
gofGroupTest(Nickel.ppb ~ Well, data = EPA.09.Ex.10.1.nickel.df,
dist = "lnorm")
#Results of Group Goodness-of-Fit Test
#
#
#Test Method: Wilk-Shapiro GOF (Normal Scores)
#
#Hypothesized Distribution: Lognormal
#
#Data: Nickel.ppb
#
#Grouping Variable: Well
#
#Data Source: EPA.09.Ex.10.1.nickel.df
#
#Number of Groups: 4
#
#Sample Sizes: Well.1 = 5
# Well.2 = 5
# Well.3 = 5
# Well.4 = 5
#
#Test Statistic: z (G) = 0.2401720
#
#P-values for
#Individual Tests: Well.1 = 0.6898164
# Well.2 = 0.6700394
# Well.3 = 0.3208299
# Well.4 = 0.5041375
#
#P-value for
#Group Test: 0.5949015
#
#Alternative Hypothesis: At least one group
# does not come from a
# Lognormal Distribution.
#
# Clean up
rm(gofGroup.list)
graphics.off()