R: Monte Carlo simulation of dissimilarities

mcarlo {analogue}

R Documentation

Monte Carlo simulation of dissimilarities

Description

Permutations and Monte Carlo simulations to define critical values for dissimilarity coefficients for use in MAT reconstructions.

Usage

mcarlo(object, ...)

## Default S3 method:
mcarlo(object, nsamp = 10000,
       type = c("paired", "complete", "bootstrap", "permuted"),
       replace = FALSE, 
       method = c("euclidean", "SQeuclidean", "chord", "SQchord",
                  "bray", "chi.square", "SQchi.square",
                  "information", "chi.distance", "manhattan",
                  "kendall", "gower", "alt.gower", "mixed"),
       is.dcmat = FALSE, diag = FALSE, ...)

## S3 method for class 'mat'
mcarlo(object, nsamp = 10000,
       type = c("paired", "complete", "bootstrap", "permuted"),
       replace = FALSE, diag = FALSE, ...)

## S3 method for class 'analog'
mcarlo(object, nsamp = 10000,
       type = c("paired", "complete", "bootstrap", "permuted"),
       replace = FALSE, diag = FALSE, ...)

Arguments

`object`	an R object. Currently only objects of class `"mat"` or `"analog"`, or a matrix-like object of species data, are allowed.
`nsamp`	numeric; number of permutations or simulations to draw.
`type`	character; the type of permutation or simulation to perform. See Details, below.
`replace`	logical; should sampling be done with replacement?
`method`	character; for raw species matrices, the dissimilarity coefficient to use. This is predefined when fitting a MAT model with `mat` or performing analogue matching with `analog`, and is ignored in the `mcarlo` methods for classes `"mat"` and `"analog"`.
`is.dcmat`	logical; is `object` a dissimilarity matrix? Not meant for general use; used internally by the `"mat"` and `"analog"` methods to tell the default method that `object` is already a dissimilarity matrix, so it need not be recalculated.
`diag`	logical; should the dissimilarities include the diagonal (zero) values of the dissimilarity matrix? See Details.
`...`	arguments passed to or from other methods.

Details

Only the "paired" and "bootstrap" options of "type" are currently implemented.

distance produces square, symmetric dissimilarity matrices for training sets. The upper triangle of these matrices is a duplicate of the lower triangle, and as such is redundant. mcarlo works on the lower triangle of these dissimilarity matrices, representing all pairwise dissimilarity values for training set samples. The default is not to include the diagonal (zero) values of the dissimilarity matrix. If you feel that these diagonal (zero) values are part of the population of dissimilarities then use "diag = TRUE" to include them in the permutations.
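
The effect of "diag" on the pool of dissimilarities can be illustrated with a minimal base R sketch. It is not the package's internal code: the toy matrix `spp` is made up, and `dist` (Euclidean) stands in for `distance` so that the example runs without the package.

```r
## Toy data: 4 samples x 3 taxa (made-up proportions, for illustration only)
spp <- matrix(c(0.5, 0.3, 0.2,
                0.1, 0.6, 0.3,
                0.4, 0.4, 0.2,
                0.2, 0.2, 0.6),
              nrow = 4, byrow = TRUE)

## Square, symmetric dissimilarity matrix (base R stand-in for distance())
d <- as.matrix(dist(spp))

## Lower triangle only: n * (n - 1) / 2 pairwise values
lower <- d[lower.tri(d, diag = FALSE)]

## Lower triangle plus the zero diagonal: n * (n + 1) / 2 values
lower_diag <- d[lower.tri(d, diag = TRUE)]

length(lower)        # 6
length(lower_diag)   # 10
sum(lower_diag == 0) # 4, one zero per sample
```

With the diagonal included, every training set sample contributes one exact zero to the pool from which values are drawn.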

Value

A vector of simulated dissimilarities of length "nsamp". The "method" used is stored in attribute "method".
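
As a rough sketch of how such a vector might be turned into critical values, the code below takes low quantiles of a stand-in vector. The vector `sims` is filled with random numbers rather than real `mcarlo` output, and the choice of the 1%, 2.5% and 5% quantiles is an assumption made for illustration.

```r
set.seed(1)

## Stand-in for a vector of simulated dissimilarities
## (random numbers, NOT real mcarlo output)
sims <- runif(1000, min = 0.1, max = 1.5)
attr(sims, "method") <- "chord"

## The coefficient used travels with the vector as an attribute
attr(sims, "method")

## Candidate critical values: low quantiles of the simulated distribution
## (the probabilities chosen here are illustrative assumptions)
crit <- quantile(sims, probs = c(0.01, 0.025, 0.05))
crit
```

Dissimilarities below a chosen quantile would then be treated as unusually small relative to the simulated distribution.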

Note

The performance of these permutation and simulation techniques still needs to be studied. This function is provided for pedagogic reasons. Although recommended by Sawada et al. (2004), sampling with replacement ("replace = TRUE") and including diagonal (zero) values ("diag = TRUE") simulates too many zero distances. This is because the same training set sample can, on occasion, be drawn twice, leading to a zero distance. It is impossible to find two samples in nature that are perfectly similar, and as such sampling with replacement and "diag = TRUE" seems undesirable at best.
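
The surplus of zero distances can be illustrated with a small base R simulation. This is a sketch of the general idea only, assuming the two members of each pair are drawn independently; it does not reproduce how `mcarlo` samples internally, and `n` is a hypothetical training set size.

```r
set.seed(42)
n <- 50        # hypothetical number of training set samples
nsamp <- 10000 # number of simulated pairs

## With replacement: the two members of each pair are drawn independently
i <- sample(n, nsamp, replace = TRUE)
j <- sample(n, nsamp, replace = TRUE)

## Proportion of pairs that match a sample with itself (a zero distance);
## this should be close to 1 / n
mean(i == j)
1 / n

## Without replacement: the two members of a pair can never coincide
pairs <- replicate(1000, sample(n, 2, replace = FALSE))
any(pairs[1, ] == pairs[2, ])  # FALSE
```

Under this assumption, roughly one pair in every `n` is a self-match when sampling with replacement, whereas none are when the pair is drawn without replacement.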

Author(s)

Gavin L. Simpson

References

Sawada, M., Viau, A.E., Vettoretti, G., Peltier, W.R. and Gajewski, K. (2004) Comparison of North-American pollen-based temperature and global lake-status with CCCma AGCM2 output at 6 ka. Quaternary Science Reviews 23, 87–108.

Examples

## Imbrie and Kipp example
## load the example data
data(ImbrieKipp)
data(SumSST)
data(V12.122)

## merge training and test set on columns
dat <- join(ImbrieKipp, V12.122, verbose = TRUE)

## extract the merged data sets and convert to proportions
ImbrieKipp <- dat[[1]] / 100
V12.122 <- dat[[2]] / 100

## perform the modified method of Sawada et al. (2004) - paired sampling,
## without replacement
ik.mcarlo <- mcarlo(ImbrieKipp, method = "chord", nsamp = 1000,
                    type = "paired", replace = FALSE)
ik.mcarlo

## plot the simulated distribution
layout(matrix(1:2, ncol = 1))
plot(ik.mcarlo)
layout(1)

[Package analogue version 0.17-6 Index]