R: Goodness of fit

gfit {abc}

R Documentation

Goodness of fit

Description

Perform a test for goodness-of-fit.

Usage

gfit(target, sumstat, nb.replicate, tol=.01, statistic=mean, subset=NULL, 
trace=FALSE)

Arguments

`target`	a data frame or vector of the observed summary statistic.
`sumstat`	a vector, matrix or data frame of the simulated summary statistics.
`nb.replicate`	number of replicates used to estimate the null distribution of the goodness-of-fit statistic.
`tol`	a tolerance rate. Defaults to 0.01
`statistic`	define the goodness-of-fit statistic. Typical values are `median` (default), `mean`, or `max`. When using `median` fo instance, the g.o.f. statistic is the median of the distance between observed and accepted summary statistics.
`subset`	optional. A logical expression indicating elements or rows to keep. Missing values in `sumstat` are taken as `FALSE`.
`trace`	a boolean indicating if a trace should be displayed when calling the function. Default to `FALSE`.

Details

The null distribution is estimated using already performed simulations contained in sumstat as pseudo-observed datasets. For each pseudo-observed dataset, the rejection algorithm is performed to obtain a value of the goodness-of-fit statistic. A better estimate of the P-value is obtained for larger nb.replicate but the running time of the function is increased.

Value

An object of class "gfit", which is a list with the following elements

`dist.obs`	the value of the goodness-of-fit statistic for the data.
`dist.sim`	a vector of size `nb.replicate`. It is a sample of the null distribution of the test statistic.

Author(s)

Louisiane Lemaire and Michael Blum.

Examples

## human demographic history
require(abc.data)
data(human)

##  Perform a test of goodness-of-fit.
##  The data are the European data and we test the fit of the bottleneck 
## model (good fit) and of the constant-size population model (poor fit) 
## Use larger values of \code{nb.replicate} (e.g. 1000)
## for real applications
res.gfit.bott=gfit(target=stat.voight["italian",], 
sumstat=stat.3pops.sim[models=="bott",], statistic=mean, nb.replicate=10)
res.gfit.const=gfit(target=stat.voight["italian",], 
sumstat=stat.3pops.sim[models=="const",], statistic=mean, nb.replicate=10)


## Plot the distribution of the null statistic and indicate where is the 
## observed value.
plot(res.gfit.bott, main="Histogram under H0")
## Call the function \code{summary}
## It computes the P-value, call \code{summary} on the vector 
## \code{dist.sim} and returns the value of the observed statistic
summary(res.gfit.bott)

[Package abc version 2.2.1 Index]