R: Cross validation for Approximate Bayesian Computation (ABC)

cv4abc {abc}

R Documentation

Cross validation for Approximate Bayesian Computation (ABC)

Description

This function performs a leave-one-out cross validation for ABC via subsequent calls to the function abc. A potential use of this function is to evaluate the effect of the choice of the tolerance rate on the quality of the estimation with ABC.

Usage

cv4abc(param, sumstat, abc.out = NULL, nval, tols, statistic = "median",
prior.range = NULL, method, hcorr = TRUE, transf = "none", logit.bounds
= c(0,0), subset = NULL, kernel = "epanechnikov", numnet = 10, sizenet =
5, lambda = c(0.0001,0.001,0.01), trace = FALSE, maxit = 500, ...)

Arguments

`param`	a vector, matrix or data frame of the simulated parameter values.
`sumstat`	a vector, matrix or data frame of the simulated summary statistics.
`abc.out`	an object of class `"abc"`, optional. If supplied, all arguments passed to `abc` are extracted from this object, except for `sumstat`, `param`, and `tol`, which always have to be supplied as arguments.
`nval`	size of the cross-validation sample.
`tols`	a single tolerance rate or a vector of tolerance rates.
`statistic`	a character string specifying the statistic to calculate a point estimate from the posterior distribution of the parameter(s). Possible values are `"median"` (default), `"mean"`, or `"mode"`.
`prior.range`	a range to truncate the prior range.
`method`	a character string indicating the type of ABC algorithm to be applied. Possible values are `"rejection"`, `"loclinear"`, and `"neuralnet"`. See also `abc`.
`hcorr`	logical, if `TRUE` (default) the conditional heteroscedastic model is applied.
`transf`	a vector of character strings indicating the kind of transformation to be applied to the parameter values. The possible values are `"log"`, `"logit"`, and `"none"` (default), when no is transformation applied. See also `abc`.
`logit.bounds`	a vector of bounds if `transf` is `"logit"`. These bounds are applied to all parameters that are to be logit transformed.
`subset`	a logical expression indicating elements or rows to keep. Missing values in `param` and/or `sumstat` are taken as `FALSE`.
`kernel`	a character string specifying the kernel to be used when `method` is `"loclinear"` or `"neuralnet"`. Defaults to `"epanechnikov"`. See `density` for details.
`numnet`	the number of neural networks when `method` is `"neuralnet"`. Defaults to 10. It indicates the number of times the function `nnet` is called.
`sizenet`	the number of units in the hidden layer. Defaults to 5. Can be zero if there are no skip-layer units. See `nnet` for more details.
`lambda`	a numeric vector or a single value indicating the weight decay when `method` is `"neuralnet"`. See `nnet` for more details. By default, 0.0001, 0.001, or 0.01 is randomly chosen for each of the networks.
`trace`	logical, `TRUE` switches on tracing the optimization of `nnet`. Applies only when `method` is `"neuralnet"`.
`maxit`	numeric, the maximum number of iterations. Defaults to 500. Applies only when `method` is `"neuralnet"`. See also `nnet`.
`...`	other arguments passed to `nnet`.

Details

A simulation is selected repeatedly to be a validation simulation, while the other simulations are used as training simulations. Each time the function abc is called to estimate the parameter(s). A total of nval validation simulations are selected.

The arguments of the function abc can be supplied in two ways. First, simply give them as arguments when calling this function, in which case abc.out can be NULL. Second, via an existing object of class "abc", here abc.out. WARNING: when abc.out is supplied, the same sumstat and param objects have to be used as in the original call to abc. Column names of sumstat and param are checked for match.

See summary.cv4abc for calculating the prediction error from an object of class "cv4abc".

Value

An object of class "cv4abc", which is a list with the following elements

`call`	The original calls to `abc` for each tolerance rates.
`cvsamples`	Numeric vector of length `nval`, indicating which rows of the `param` and `sumstat` matrices were used as validation values.
`tols`	The tolerance rates.
`true`	The parameter values that served as validation values.
`estim`	The estimated parameter values.
`names`	A list with two elements: `parameter.names` and `statistics.names`. Both contain a vector of character strings with the parameter and statistics names, respectively.
`seed`	The value of `.Random.seed` when `cv4abc` is called.

Examples

require(abc.data)
data(musigma2)
## this data set contains five R objects, see ?musigma2 for
## details 

## cv4abc() calls abc(). Here we show two ways for the supplying
## arguments of abc(). 1st way: passing arguments directly. In this
## example only 'param', 'sumstat', 'tol', and 'method', while default
## values are used for the other arguments.
## Number of eval. should be much more greater in realistic settings
cv.rej <- cv4abc(param=par.sim, sumstat=stat.sim, nval=5,
tols=c(.1,.2,.3), method="rejection")

## 2nd way: first creating an object of class 'abc', and then using it
## to pass its arguments to abc().
##
lin <- abc(target=stat.obs, param=par.sim, sumstat=stat.sim, tol=.2,
method="loclinear", transf=c("none","log"))
cv.lin <- cv4abc(param=par.sim, sumstat=stat.sim, abc.out=lin, nval=5,
tols=c(.1,.2,.3))

## using the plot method. Different tolerance levels are plotted with
## different heat.colors. Smaller the tolerance levels correspond to
## "more red" points.
## !!! consider using the argument 'exclude' (plot.cv4abc) to supress
## the plotting of any outliers that mask readibility !!!
plot(cv.lin, log=c("xy", "xy"), caption=c(expression(mu),
expression(sigma^2)))

## comparing with the rejection sampling
plot(cv.rej, log=c("", "xy"), caption=c(expression(mu), expression(sigma^2)))

## or printing results directly to a postscript file...
## Not run: 
plot(cv.lin, log=c("xy", "xy"), caption=c(expression(mu),
expression(sigma^2)), file="CVrej", postscript=TRUE)

## End(Not run)

## using the summary method to calculate the prediction error
summary(cv.lin)
## compare with rejection sampling
summary(cv.rej)

[Package abc version 2.2.1 Index]