ADtest {MSCsimtester} | R Documentation |
Anderson-Darling test comparing sample and theoretical pairwise distance distributions.
Description
Takes as input theoretical pairwise distance densities under the multispecies coalescent (MSC) and
empirical pairwise distances from gene trees in a sample, as returned by
the function pairwiseDist. Uses the package kSamples to perform
either one test on the entire dataset or multiple tests on subsamples.
Usage
ADtest(distanceDensities, subsampleSize = FALSE)
Arguments
distanceDensities |
A list containing values needed for performing Anderson-Darling
test(s) on a gene tree sample and species tree, as output by |
subsampleSize |
A positive integer to perform multiple tests on subsamples,
or |
Details
The Anderson-Darling test compares the empirical distance distribution for a supplied gene tree
sample to a sample drawn from the theoretical distribution. The output, passed from the kSamples
package, therefore reports that two samples are being compared, testing the null hypothesis that
they come from the same distribution.
See the kSamples documentation for the function ad.test for more details.
Repeated runs of this function will give different results, since the sample from the theoretical distribution will vary. Under the null hypothesis, p-values from different runs should be approximately uniformly distributed.
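The run-to-run variation described above can be illustrated with a minimal sketch that calls ad.test from kSamples directly. It does not use ADtest or real MSC densities: the exponential "distances", the rate value, and the helper one_run are all assumptions made only for illustration.

```r
library(kSamples)  # provides ad.test()

set.seed(1)  # fixed only so that this illustration is reproducible

# Stand-in data (assumption): exponential "distances", not real MSC output.
rate <- 1 / 20000
empirical <- rexp(500, rate = rate)

# Hypothetical helper: draw a fresh "theoretical" sample of the same size
# and run a two-sample Anderson-Darling test against the empirical sample.
one_run <- function(emp) {
  theoretical <- rexp(length(emp), rate = rate)
  res <- ad.test(emp, theoretical)
  res$ad[1, 3]  # asymptotic p-value for version 1 of the statistic
}

# Repeated runs on the same empirical sample give different p-values,
# because the theoretical sample is redrawn each time.
p_values <- replicate(5, one_run(empirical))
print(p_values)
```

Because both samples here come from the same distribution, the null hypothesis holds by construction, so no particular p-value should be read as evidence of poor fit.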
Numerical issues may result in poor performance of Anderson-Darling tests when the sample size
is very large, so an optional parameter subsampleSize can be set to create subsamples of smaller size.
If subsampleSize is a positive integer, an Anderson-Darling test is performed on each subsample, comparing it to
a random sample of the same size from the theoretical distribution. Good fit is indicated by an approximately uniform
distribution of the subsample p-values.
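The subsampling idea can be sketched as follows. This is one plausible scheme under stated assumptions, not necessarily how ADtest forms its subsamples: the data are stand-in exponential draws, the empirical sample is shuffled and cut into disjoint blocks, and subsample_size is a hypothetical variable playing the role of subsampleSize.

```r
library(kSamples)  # provides ad.test()

set.seed(2)  # fixed only so that this illustration is reproducible

# Stand-in data (assumption): 5000 exponential "distances".
rate <- 1 / 20000
empirical <- rexp(5000, rate = rate)
subsample_size <- 250  # plays the role of subsampleSize

# Shuffle, drop any leftover values, and split into disjoint subsamples.
n_sub <- length(empirical) %/% subsample_size
shuffled <- sample(empirical)[seq_len(n_sub * subsample_size)]
groups <- split(shuffled, rep(seq_len(n_sub), each = subsample_size))

# One Anderson-Darling test per subsample, each against a fresh
# "theoretical" sample of the same size.
p_values <- vapply(groups, function(sub) {
  theoretical <- rexp(length(sub), rate = rate)
  ad.test(sub, theoretical)$ad[1, 3]  # asymptotic p-value, version 1
}, numeric(1))

print(summary(p_values))
```

With enough subsamples, a histogram of p_values or a call such as ks.test(p_values, "punif") gives a rough check of whether the p-values look uniform.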
Value
An object of type ADtestOutput, including $Sample, a sample from the theoretical distance distribution of
the same size as the empirical one, and $ADtest, which is of type kSamples and
has all output from the Anderson-Darling test if only
one test was performed, or the number of tests if tests were performed on subsamples.
See Also
pairwiseDist, kSamples-package
Examples
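library(ape)           # provides read.tree
library(MSCsimtester)
# Species tree, population sizes, and the gene tree sample shipped with the package
stree <- read.tree(text="((((a:10000,b:10000):10000,c:20000):10000,d:30000):10000,e:40000);")
pops <- c(15000,25000,10000,1,1,1,1,1,12000)
gts <- read.tree(file=system.file("extdata","genetreeSample",package="MSCsimtester"))
# Theoretical density and empirical distances between taxa a and b
distDen <- pairwiseDist(stree,pops,gts,"a","b")
ADtest(distDen)        # one test on the entire sample
ADtest(distDen,1000)   # multiple tests on subsamples of size 1000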