GET.composite {GET} | R Documentation |
Adjusted global envelope tests
Description
Adjusted global envelope tests for composite null hypothesis.
Usage
GET.composite(
X,
X.ls = NULL,
nsim = 499,
nsimsub = nsim,
simfun = NULL,
fitfun = NULL,
calcfun = function(X) {
X
},
testfuns = NULL,
...,
type = "erl",
alpha = 0.05,
alternative = c("two.sided", "less", "greater"),
probs = c(0.025, 0.975),
r_min = NULL,
r_max = NULL,
take_residual = FALSE,
save.cons.envelope = savefuns,
savefuns = FALSE,
verbose = TRUE,
MrkvickaEtal2017 = FALSE,
mc.cores = 1L
)
Arguments
X |
An object containing the data in some form.
A |
X.ls |
A list of objects as |
nsim |
The number of simulations to be generated in the primary test.
Ignored if |
nsimsub |
Number of simulations in each basic test. There will be |
simfun |
A function for generating simulations from the null model. If given, this function
is called by |
fitfun |
A function for estimating the parameters of the null model.
The function should return the fitted model in the form that it can be directly
passed to |
calcfun |
A function for calculating a summary function from a simulation of the model.
The default is the identity function, i.e. the simulations from the model are functions themselves.
The use of |
testfuns |
A list of lists of parameters to be passed to the |
... |
Additional parameters to the |
type |
The type of the global envelope with current options for 'rank', 'erl', 'cont', 'area', 'qdir', 'st' and 'unscaled'. See details. |
alpha |
The significance level. The 100(1-alpha)% global envelope will be calculated under the 'fwer' or 'fdr' control. If a vector of values is provided, the global envelopes are calculated for each value. |
alternative |
A character string specifying the alternative hypothesis.
Must be one of the following: "two.sided" (default), "less" or "greater".
The last two options only available for types |
probs |
A two-element vector containing the lower and upper quantiles for the measure 'q' or 'qdir', in that order and on the interval [0, 1]. The default values are 0.025 and 0.975, suggested by Myllymäki et al. (2015, 2017). |
r_min |
The minimum argument value to include in the test. |
r_max |
The maximum argument value to include in the test.
in calculating functions by the |
take_residual |
Logical. If TRUE (needed for visual reasons only) the mean of the simulated functions is reduced from the functions in each first and second stage test. |
save.cons.envelope |
Logical flag indicating whether to save the unadjusted envelope test results. |
savefuns |
Logical flag indicating whether to save all the simulated function values.
Similar to the same argument of the |
verbose |
Logical flag indicating whether to print progress reports during the simulations.
Similar to the same argument of |
MrkvickaEtal2017 |
Logical. If TRUE, type is "st" or "qdir" and several test functions are used,
then the combined scaled MAD envelope presented in Mrkvička et al. (2017) is calculated. Otherwise,
the two-step procedure described in |
mc.cores |
The number of cores to use, i.e. at most how many child processes will be run simultaneously.
Must be at least one, and parallelization requires at least two cores. On a Windows computer mc.cores must be 1
(no parallelization). For details, see |
Details
The specification of X, X.ls, fitfun, simfun is important:
If
X.ls
is provided, then the global envelope test is calculated based on functions in these objects.X
should be acurve_set
object, or anenvelope
object of spatstat, including the observed function and simulations from the tested model.X.ls
should be a list ofcurve_set
or envelope (of R package spatstat) objects, where each component contains an "observed" function f that has been simulated from the model fitted to the data and the simulations that have been obtained from the same model that has been fitted to the "observed" f. The user has the responsibility that the functions have been generated correctly, the test is done based on these provided simulations. See the examples.Otherwise, if
simfun
andfitfun
are provided,X
can be general. The functionfitfun
is used for fitting the desired model M and the functionsimfun
for simulating from a fitted model M. These functions should be coupled with each other such that the object returned byfitfun
is directly accepted as the (single) argument insimfun
. In the case, that the global envelope should not be calculated directly forX
(X
is not a function),calcfun
can be used to specify how to calculate the function fromX
and from simulations generated bysimfun
. Special attention is needed in the correct specification of the functions, see examples.Otherwise,
X
should be either a fitted (point process) model object or appp
object of the R package spatstat.If
X
is a fitted (point process) model object of the R package spatstat, then the simulations from this model are generated and summary functions for testing calculated by theenvelope
function of spatstat. Which summary function to use and how to calculate it, can be passed toenvelope
either in...
ortestfuns
. Unless otherwise specified the default function ofenvelope
, i.g. the K-function, is used. The argumenttestfuns
should be used to specify the test functions in the case where one wants to base the test on several test functions.If
X
is appp
object of spatstat, then theenvelope
function is used for simulations and model fitting and the complete spatial randomness (CSR) is tested (with intensity parameter).
For the rank envelope test, the global envelope test is the test described in Myllymäki et al. (2017) with the adjustment of Baddeley et al. (2017). For other test types, the test (also) uses the two-stage procedure of Dao and Genton (2014) with the adjustment of Baddeley et al. (2017) as descripbed in Myllymäki and Mrkvička (2020).
See examples also in saplings
.
Value
An object of class global_envelope
or combined_global_envelope
, which can be
printed and plotted directly. See global_envelope_test
.
References
Baddeley, A., Hardegen, A., Lawrence, T., Milne, R. K., Nair, G. and Rakshit, S. (2017). On two-stage Monte Carlo tests of composite hypotheses. Computational Statistics and Data Analysis 114: 75-87. doi: http://dx.doi.org/10.1016/j.csda.2017.04.003
Dao, N.A. and Genton, M. (2014). A Monte Carlo adjusted goodness-of-fit test for parametric models describing spatial point patterns. Journal of Graphical and Computational Statistics 23, 497-517.
Mrkvička, T., Myllymäki, M. and Hahn, U. (2017) Multiple Monte Carlo testing, with applications in spatial point processes. Statistics & Computing 27(5): 1239-1255. doi: 10.1007/s11222-016-9683-9
Myllymäki, M., Mrkvička, T., Grabarnik, P., Seijo, H. and Hahn, U. (2017). Global envelope tests for spatial point patterns. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79: 381-404. doi: 10.1111/rssb.12172
Myllymäki, M. and Mrkvička, T. (2020). GET: Global envelopes in R. arXiv:1911.06583 [stat.ME]. https://doi.org/10.48550/arXiv.1911.06583
See Also
global_envelope_test
, plot.global_envelope
, saplings
Examples
# Graphical normality test (Myllymaki and Mrkvicka, 2020, Section 3.3.)
#=========================
if(require("fda.usc", quietly=TRUE)) {
data("poblenou")
dat <- poblenou[['nox']][['data']][,'H10']
n <- length(dat)
# The number of simulations
nsim <- nsimsub <- 199
set.seed(200127)
# General setup
#==============
# 1. Fit the model
mu <- mean(dat)
sigma <- sd(dat)
# 2. Simulate a sample from the fitted null model and
# compute the test vectors for data (obs) and each simulation (sim)
r <- seq(min(dat), max(dat), length=100)
obs <- stats::ecdf(dat)(r)
sim <- sapply(1:nsimsub, function(i) {
x <- rnorm(n, mean = mu, sd = sigma)
stats::ecdf(x)(r)
})
cset <- create_curve_set(list(r = r, obs = obs, sim_m = sim))
# 3. Simulate another sample from the fitted null model.
# 4. Fit the null model to each of the patterns,
# simulate a sample from the null model,
# and compute the test vectors for all.
cset.ls <- vector("list", nsim)
for(rep in 1:nsim) {
x <- rnorm(n, mean = mu, sd = sigma)
mu2 <- mean(x)
sigma2 <- sd(x)
obs2 <- stats::ecdf(x)(r)
sim2 <- sapply(1:nsimsub, function(i) {
x2 <- rnorm(n, mean = mu2, sd = sigma2)
stats::ecdf(x2)(r)
})
cset.ls[[rep]] <- create_curve_set(list(r = r, obs = obs2, sim_m = sim2))
}
# Perform the adjusted test
res <- GET.composite(X = cset, X.ls = cset.ls, type = 'erl')
plot(res) + ggplot2::labs(x = "NOx", y = "Ecdf")
}
# Example of a point pattern data
#================================
# Test the fit of a Matern cluster process.
if(require("spatstat.model", quietly=TRUE)) {
data("saplings")
saplings <- as.ppp(saplings, W = square(75))
# First choose the r-distances
rmin <- 0.3; rmax <- 10; rstep <- (rmax-rmin)/500
r <- seq(0, rmax, by = rstep)
# Number of simulations
nsim <- 19 # Increase nsim for serious analysis!
# Option 1: Give the fitted model object to GET.composite
#--------------------------------------------------------
# This can be done and is preferable when the model is
# a point process model of spatstat.
# 1. Fit the Matern cluster process to the pattern
# (using minimum contrast estimation with the K-function)
M1 <- kppm(saplings~1, clusters = "MatClust", statistic = "K")
summary(M1)
# 2. Make the adjusted global area rank envelope test using the L(r)-r function
adjenvL <- GET.composite(X = M1, nsim = nsim,
testfuns = list(L =list(fun="Lest", correction="translate",
transform=expression(.-r), r=r)), # passed to envelope
type = "area", r_min = rmin, r_max = rmax)
# Plot the test result
plot(adjenvL)
# Option 2: Generate the simulations "by yourself"
#-------------------------------------------------
# and provide them as curve_set or envelope objects
# Preferable when you want to have a control
# on the simulations yourself.
# 1. Fit the model
M1 <- kppm(saplings~1, clusters = "MatClust", statistic = "K")
# 2. Generate nsim simulations by the given function using the fitted model
X <- envelope(M1, nsim = nsim, savefuns = TRUE,
fun = "Lest", correction = "translate",
transform = expression(.-r), r = r)
plot(X)
# 3. Create another set of simulations to be used to estimate
# the second-state p-value (as proposed by Baddeley et al., 2017).
simpatterns2 <- simulate(M1, nsim = nsim)
# 4. Calculate the functions for each pattern
simf <- function(rep) {
# Fit the model to the simulated pattern Xsims[[rep]]
sim_fit <- kppm(simpatterns2[[rep]], clusters = "MatClust", statistic = "K")
# Generate nsim simulations from the fitted model
envelope(sim_fit, nsim = nsim, savefuns = TRUE,
fun = "Lest", correction = "translate",
transform = expression(.-r), r = r)
}
X.ls <- parallel::mclapply(X = 1:nsim, FUN = simf, mc.cores = 1) # list of envelope objects
# 5. Perform the adjusted test
res <- GET.composite(X = X, X.ls = X.ls, type = "area",
r_min = rmin, r_max = rmax)
plot(res)
}