anderson_darling {cmstatr} | R Documentation
Anderson–Darling test for goodness of fit
Description
Calculates the Anderson–Darling test statistic for a sample given a particular distribution, and determines whether to reject the hypothesis that a sample is drawn from that distribution.
Usage
anderson_darling_normal(data = NULL, x, alpha = 0.05)
anderson_darling_lognormal(data = NULL, x, alpha = 0.05)
anderson_darling_weibull(data = NULL, x, alpha = 0.05)
Arguments
data | a data.frame-like object (optional)
x | a numeric vector or a variable in the data.frame
alpha | the required significance level of the test. Defaults to 0.05.
Details
The Anderson–Darling test statistic is calculated for the distribution given by the user.
The observed significance level (OSL), or p-value, is calculated assuming that the parameters of the distribution are unknown; these parameters are estimated from the data.
The function anderson_darling_normal computes the Anderson–Darling test statistic given a normal distribution with mean and standard deviation equal to the sample mean and standard deviation.
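The calculation described above can be sketched in base R. This is a minimal illustration, assuming the standard Anderson–Darling formula (often written A² in the literature) and the n - 1 sample standard deviation returned by sd(); the helper name ad_statistic_normal is hypothetical and is not part of cmstatr, whose own implementation may differ in detail.

```r
# Illustrative helper (not part of cmstatr): Anderson-Darling statistic
# for a normal distribution fitted with the sample mean and sd.
ad_statistic_normal <- function(x) {
  x <- sort(x)                       # order statistics x_(1) <= ... <= x_(n)
  n <- length(x)
  # Fitted normal CDF evaluated at each ordered observation
  u <- pnorm(x, mean = mean(x), sd = sd(x))
  i <- seq_len(n)
  # A = -n - (1/n) * sum((2i - 1) * [log F(x_(i)) + log(1 - F(x_(n+1-i)))])
  -n - sum((2 * i - 1) * (log(u) + log(1 - rev(u)))) / n
}

set.seed(1)
x <- rnorm(20, mean = 100, sd = 5)   # simulated data, for illustration only
ad_statistic_normal(x)
```

Because the data are standardized by their own mean and standard deviation, the statistic does not change when the sample is shifted or rescaled.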
The function anderson_darling_lognormal is the same as anderson_darling_normal except that the data is log-transformed first.
The function anderson_darling_weibull computes the Anderson–Darling test statistic given a Weibull distribution with shape and scale parameters estimated from the data by maximum likelihood.
The test statistic, A, is modified to account for the fact that the parameters of the population are not known but are instead estimated from the sample. This modification is a function of the sample size only and is different for each distribution (normal/lognormal or Weibull). Several such modifications have been proposed. This function uses the modification published in Stephens (1974), Lawless (1982) and CMH-17-1G. Some other implementations of the Anderson–Darling test, such as the one in the nortest package, use other modifications, such as the one published in D'Agostino and Stephens (1986). As such, the p-value reported by this function may differ from the p-value reported by implementations that use different modifiers. Only the unmodified test statistic is reported in the result of this function, but the modified test statistic is used to compute the OSL (p-value).
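For orientation, the sample-size modifiers for the normal case can be written out as follows. The coefficients are quoted from memory of the cited sources, not taken from this page, and should be checked against the references; A* denotes the modified statistic, a symbol introduced here for illustration.

```latex
\begin{align*}
  % Stephens (1974) form, assumed to be the one used for the normal/lognormal case
  A^{*} &= A\left(1 + \frac{4}{n} - \frac{25}{n^{2}}\right) \\
  % D'Agostino and Stephens (1986) form, as used by e.g. nortest
  A^{*} &= A\left(1 + \frac{0.75}{n} + \frac{2.25}{n^{2}}\right)
\end{align*}
```

The two modifiers converge as n grows, so differences in the reported p-value are most visible for small samples.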
This function uses the formulae for observed significance level (OSL) published in CMH-17-1G. These formulae depend on the particular distribution used.
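As a rough illustration of the normal case, the sketch below recomputes the OSL from the values shown in the example output on this page (n = 18, A = 0.9224776). The helper name osl_normal is hypothetical, and the coefficients are quoted from memory of CMH-17-1G rather than from this page, so they should be verified against the handbook before being relied on.

```r
# Illustrative helper (not part of cmstatr): OSL for the normal case.
# Coefficients are assumptions recalled from CMH-17-1G; verify before use.
osl_normal <- function(A, n) {
  # Sample-size modification attributed to Stephens (1974)
  A_star <- A * (1 + 4 / n - 25 / n^2)
  # Observed significance level as a function of the modified statistic
  1 / (1 + exp(-0.48 + 0.78 * log(A_star) + 4.58 * A_star))
}

# Values taken from the example output in this page
osl <- osl_normal(A = 0.9224776, n = 18)
osl          # approximately 0.0121, in line with the reported OSL
osl < 0.05   # TRUE, so the normal distribution is rejected at alpha = 0.05
```

The Weibull case uses a different modifier and different coefficients, which are not reproduced here.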
The results of this function have been validated against published values in Lawless (1982).
Value
An object of class anderson_darling. This object has the following fields:
- call: the expression used to call this function
- dist: the distribution used
- data: a copy of the data analyzed
- n: the number of observations in the sample
- A: the Anderson–Darling test statistic
- osl: the observed significance level (p-value), assuming the parameters of the distribution are estimated from the data
- alpha: the required significance level for the test. This value is given by the user.
- reject_distribution: a logical value indicating whether the hypothesis that the data is drawn from the specified distribution should be rejected
References
J. F. Lawless, Statistical models and methods for lifetime data. New York: Wiley, 1982.
"Composite Materials Handbook, Volume 1. Polymer Matrix Composites Guideline for Characterization of Structural Materials," SAE International, CMH-17-1G, Mar. 2012.
M. A. Stephens, “EDF Statistics for Goodness of Fit and Some Comparisons,” Journal of the American Statistical Association, vol. 69, no. 347, pp. 730–737, 1974.
R. D’Agostino and M. Stephens, Goodness-of-Fit Techniques. New York: Marcel Dekker, 1986.
Examples
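library(cmstatr)  # provides anderson_darling_normal() and the carbon.fabric data
library(dplyr)    # provides %>% and filter()

carbon.fabric %>%
  filter(test == "FC") %>%
  filter(condition == "RTD") %>%
  anderson_darling_normal(strength)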
## Call:
## anderson_darling_normal(data = ., x = strength)
##
## Distribution: Normal ( n = 18 )
## Test statistic: A = 0.9224776
## OSL (p-value): 0.01212193 (assuming unknown parameters)
## Conclusion: Sample is not drawn from a Normal distribution (alpha = 0.05)