R: Compositional Goodness of fit test

gof {compositions}

R Documentation

Compositional Goodness of fit test

Description

Goodness of fit tests for compositional data.

Usage

acompGOF.test(x,...)
acompNormalGOF.test(x,...,method="etest")
## S3 method for class 'formula'
acompGOF.test(formula, data,...,method="etest")
## S3 method for class 'list'
acompGOF.test(x,...,method="etest")
gsi.acompUniformityGOF.test(x,samplesize=nrow(x)*20,R=999)
acompTwoSampleGOF.test(x,y,...,method="etest",data=NULL)

Arguments

`x`	a dataset of compositions (acomp)
`y`	a dataset of compositions (acomp)
`samplesize`	number of observations in a reference sample specifying the distribution to compare with. Typically substantially larger than the sample under investigation.
`R`	The number of replicates to compute the distribution of the test statistic
`method`	Selects the method to be used. Currently only "etest", which uses an energy test, is supported.
`...`	further arguments to the methods
`formula`	an anova model formula defining groups in the dataset
`data`	unused

Details

The compositional goodness of fit testing problem is essentially a multivariate goodness of fit problem. However, there is a lack of standardized multivariate goodness of fit tests in R. Some can be found in the energy package.

In principle there is only one test behind the goodness of fit tests provided here: a two-sample test with the test statistic

\frac{\sum_{ij} k(x_i,y_j)}{\sqrt{\sum_{ij} k(x_i,x_j)\sum_{ij} k(y_i,y_j)}}

The idea behind this statistic is to measure the cosine of the angle between the two distributions with respect to the scalar product given by

(X,Y)=E[k(X,Y)]=E[\int K(x-X)K(x-Y) dx]

where k and K are Gaussian kernels with different spread. The bandwidth is actually the standard deviation of k.
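The relation between the two kernels can be made explicit under one assumption that is not stated above: that K is an isotropic Gaussian density with standard deviation \sigma in D dimensions (\sigma and D are notation introduced here for illustration). The integral is then a convolution of two Gaussian densities, so k is again Gaussian with standard deviation \sqrt{2}\sigma:

```latex
K_\sigma(u) = (2\pi\sigma^2)^{-D/2} \exp\left(-\frac{\|u\|^2}{2\sigma^2}\right),
\qquad
k(X,Y) = \int K_\sigma(x-X)\,K_\sigma(x-Y)\,dx = K_{\sqrt{2}\sigma}(X-Y)
```

Under this reading, the bandwidth mentioned above would correspond to \sqrt{2}\sigma.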
The other goodness of fit tests, which compare against a specific distribution, are based on estimating the parameters of the distribution, simulating a large dataset from that distribution, and applying the two-sample goodness of fit test.
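The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used by the package: the helper names (gauss_kernel, kernel_cosine, two_sample_test, normal_gof_test) are hypothetical, the data are plain real vectors (for compositions one would presumably work in log-ratio coordinates first), the null distribution is obtained by randomly relabelling the pooled sample rather than by the package's bootstrap, and the estimated parameters are not re-estimated in each replicate.

```r
# Gaussian kernel matrix between the rows of a and b (hypothetical helper).
gauss_kernel <- function(a, b, bandwidth = 1) {
  # squared Euclidean distances between every row of a and every row of b
  d2 <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * a %*% t(b)
  exp(-pmax(d2, 0) / (2 * bandwidth^2))
}

# Cosine-type statistic: equals 1 for identical samples, smaller otherwise.
kernel_cosine <- function(x, y, bandwidth = 1) {
  sum(gauss_kernel(x, y, bandwidth)) /
    sqrt(sum(gauss_kernel(x, x, bandwidth)) * sum(gauss_kernel(y, y, bandwidth)))
}

# Two-sample test; the null distribution comes from random relabelling
# of the pooled sample (an assumption made for this sketch).
two_sample_test <- function(x, y, R = 199, bandwidth = 1) {
  observed <- kernel_cosine(x, y, bandwidth)
  pooled <- rbind(x, y)
  n <- nrow(x)
  replicates <- replicate(R, {
    idx <- sample(nrow(pooled))
    kernel_cosine(pooled[idx[1:n], , drop = FALSE],
                  pooled[idx[-(1:n)], , drop = FALSE], bandwidth)
  })
  # small values of the statistic speak against equal distributions
  p <- (1 + sum(replicates <= observed)) / (R + 1)
  list(statistic = observed, p.value = p, replicates = replicates)
}

# Test against a parametric family (here: multivariate normal):
# estimate the parameters, simulate a large reference sample, compare.
normal_gof_test <- function(x, samplesize = nrow(x) * 20, R = 199, bandwidth = 1) {
  mu <- colMeans(x)
  U <- chol(cov(x))                       # t(U) %*% U equals cov(x)
  z <- matrix(rnorm(samplesize * ncol(x)), samplesize)
  reference <- sweep(z %*% U, 2, mu, "+") # rows have mean mu, covariance cov(x)
  two_sample_test(x, reference, R, bandwidth)
}

set.seed(1)
x <- matrix(rnorm(60), 30, 2)
same <- two_sample_test(x, matrix(rnorm(60), 30, 2))
shifted <- two_sample_test(x, matrix(rnorm(60, mean = 3), 30, 2))
```

In this sketch a clearly shifted second sample yields a smaller statistic and a small p-value, while two samples from the same distribution give a statistic close to the relabelled replicates.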

For the moment, these functions cover two-sample tests, uniformity tests, and additive logistic normality tests. Dirichlet distribution tests will be included soon.

Value

A classical "htest" object

`data.name`	The name of the dataset as specified
`method`	a name for the test used
`alternative`	an empty string
`replicates`	a dataset of p-value distributions under the null hypothesis, obtained by nonparametric bootstrap
`p.value`	The p-value computed for this test

Missing Policy

Up to now, the tests cannot handle missing values.

Author(s)

K. Gerald v.d. Boogaart http://www.stat.boogaart.de

References

Aitchison, J. (1986) The Statistical Analysis of Compositional Data. Monographs on Statistics and Applied Probability. Chapman & Hall Ltd., London (UK). 416p.

Examples

## Not run: 
x <- runif.acomp(100,4)
y <- runif.acomp(100,4)

erg <- acompTwoSampleGOF.test(x,y)
#continue
erg
unclass(erg)
erg <- acompGOF.test(x,y)


x <- runif.acomp(100,4)
y <- runif.acomp(100,4)
dd <- replicate(1000,acompGOF.test(runif.acomp(100,4),runif.acomp(100,4))$p.value)
hist(dd)

dd <- replicate(1000,acompGOF.test(runif.acomp(20,4),runif.acomp(100,4))$p.value)
hist(dd)
dd <- replicate(1000,acompGOF.test(runif.acomp(10,4),runif.acomp(100,4))$p.value)

hist(dd)
dd <- replicate(1000,acompGOF.test(runif.acomp(10,4),runif.acomp(400,4))$p.value)
hist(dd)
dd <- replicate(1000,acompGOF.test(runif.acomp(400,4),runif.acomp(10,4),bandwidth=4)$p.value)
hist(dd)


dd <- replicate(1000,acompGOF.test(runif.acomp(20,4),runif.acomp(100,4)+acomp(c(1,2,3,1)))$p.value)

hist(dd)

# test uniformity

attach("gsi") # the uniformity test is only available as an internal function
x <- runif.acomp(100,4)
gsi.acompUniformityGOF.test(x)

dd <- replicate(1000,gsi.acompUniformityGOF.test(runif.acomp(10,4))$p.value)
hist(dd)
detach("gsi")


## End(Not run)

[Package compositions version 2.0-8 Index]