central_region {GET}R Documentation

Central region / Global envelope

Description

Provides central regions or global envelopes or confidence bands

Usage

central_region(
  curve_sets,
  type = "erl",
  coverage = 0.5,
  alternative = c("two.sided", "less", "greater"),
  probs = c(0.25, 0.75),
  quantile.type = 7,
  central = "median",
  nstep = 2,
  ...
)

Arguments

curve_sets

A curve_set object or a list of curve_set objects. Also envelope objects of spatstat and fdata of fda.usc are accepted instead of curve_set objects.

type

The type of the global envelope with current options for 'rank', 'erl', 'cont', 'area', 'qdir', 'st' and 'unscaled'. See details.

coverage

A number between 0 and 1. The 100*coverage% central region will be calculated. A vector of values can also be provided, leading to the corresponding number of central regions.

alternative

A character string specifying the alternative hypothesis. Must be one of the following: "two.sided" (default), "less" or "greater". The last two options only available for types 'rank', 'erl', 'cont' and 'area'.

probs

A two-element vector containing the lower and upper quantiles for the measure 'q' or 'qdir', in that order and on the interval [0, 1]. The default values are 0.025 and 0.975, suggested by Myllymäki et al. (2015, 2017).

quantile.type

As type argument of quantile, how to calculate quantiles for 'q' or 'qdir'.

central

Either "mean" or "median". If the curve sets do not contain the component theo for the theoretical central function, then the central function (used for plotting only) is calculated either as the mean or median of functions provided in the curve sets. For 'qdir', 'st' and 'unscaled' only the mean is allowed as an option, due to their definition.

nstep

1 or 2 for how to contruct a combined global envelope if list of curve sets is provided. 2 (default) for a two-step combining procedure, 1 for one-step.

...

Ignored.

Details

Given a curve_set object, or an envelope object of spatstat or fdata object of fda.usc, the function central_region constructs a central region, i.e. a global envelope, from the given set of functions (or vectors).

Generally an envelope is a band bounded by the vectors (or functions) T_{low} and T_{hi}. A 100(1-\alpha)% or 100*coverage% global envelope is a set (T_{low}, T_{hi}) of envelope vectors such that the probability that T_i falls outside this envelope in any of the d points of the vector T_i is less or equal to \alpha. The global envelopes can be constructed based on different measures that order the functions from the most extreme one to the least extreme one. We use such orderings of the functions for which we are able to construct global envelopes with intrinsic graphical interpretation.

The type of the global envelope can be chosen with the argument type and the options are given in the following. Further information about the measures, on which the global envelopes are based, can be found in Myllymäki and Mrkvička (2020, Section 2.).

The values of the chosen measure M are determined for each curve in the curve_set, and based on the chosen measure, the central region, i.e. the global envelope, is constructed for the given curves.

If a list of (suitable) objects are provided in the argument curve_sets, then by default (nstep = 2) the two-step combining procedure is used to construct the combined global envelope as described in Myllymäki and Mrkvička (2020, Section 2.2.). If nstep = 1 and the lengths of the multivariate vectors in each component of the list are equal, then the one-step combining procedure is used where the functions are concatenated together into a one long vector (see again Myllymäki and Mrkvička, 2020, Section 2.2.).

Value

Either an object of class global_envelope and or an combined_global_envelope object. The former class is obtained when a set of curves is provided, while the latter in the case that curve_sets is a list of objects. The print and plot function are defined for the returned objects (see examples).

The global_envelope object is essentially a data frame containing columns

and potentially additionally

(Most often central_region is directly applied to functional data where all curves are observed.) Additionally, the returned object has some attributes, where

Further the object has some attributes for printing and plotting purposes, where alternative, type, ties, alpha correspond to those in the function call and method gives a name for the method. Attributes of an object res can be obtained using the function attr, e.g. attr(res, "M") for the values of the ordering measure.

If the given set of curves had the class envelope of spatstat, then the returned global_envelope object has also the class fv of spatstat, whereby one can utilize also the plotting functions of spatstat, see example in plot.global_envelope. However, the envelope objects are most often used with global_envelope_test and not with central_region. For an fv object, also some further attributes exists as required by fv of spatstat.

The combined_global_envelope is a list of global_envelope objects, where the components correspond to the components of curve_sets. The combined_global_envelope object constructed with nstep = 2 contains, in addition to some conventional ones (method, alternative, type, alpha, M, M_alpha, see above), the second level envelope information as the attributes

In the case that the given curve sets are two-dimensional, i.e., their arguments values are two-dimensional, then the returned objects have in addition to the class global_envelope or combined_global_envelope, the class global_envelope2d or combined_global_envelope2d, respectively. This class is assigned for plotting purposes: For the 2d envelopes, also the default plots are 2d. Otherwise the 1d and 2d objects are similar.

References

Mrkvička, T., Myllymäki, M., Jilek, M. and Hahn, U. (2020) A one-way ANOVA test for functional data with graphical interpretation. Kybernetika 56(3), 432-458. doi: 10.14736/kyb-2020-3-0432

Mrkvička, T., Myllymäki, M., Kuronen, M. and Narisetty, N. N. (2022) New methods for multiple testing in permutation inference for the general linear model. Statistics in Medicine 41(2), 276-297. doi: 10.1002/sim.9236

Myllymäki, M., Grabarnik, P., Seijo, H. and Stoyan. D. (2015). Deviation test construction and power comparison for marked spatial point patterns. Spatial Statistics 11, 19-34. doi: 10.1016/j.spasta.2014.11.004

Myllymäki, M., Mrkvička, T., Grabarnik, P., Seijo, H. and Hahn, U. (2017). Global envelope tests for spatial point patterns. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79, 381-404. doi: 10.1111/rssb.12172

Myllymäki, M. and Mrkvička, T. (2020). GET: Global envelopes in R. arXiv:1911.06583 [stat.ME]. https://doi.org/10.48550/arXiv.1911.06583

Ripley, B.D. (1981). Spatial statistics. Wiley, New Jersey.

See Also

forder, global_envelope_test

Examples

## A central region of a set of functions
#----------------------------------------
if(requireNamespace("fda", quietly=TRUE)) {
  cset <- curve_set(r=as.numeric(row.names(fda::growth$hgtf)),
                    obs=fda::growth$hgtf)
  plot(cset) + ggplot2::ylab("height")
  cr <- central_region(cset, coverage=0.50, type="erl")
  plot(cr)
}

## Confidence bands for linear or polynomial regression
#------------------------------------------------------
# Simulate regression data according to the cubic model
# f(x) = 0.8x - 1.8x^2 + 1.05x^3 for x in [0,1]
par <- c(0,0.8,-1.8,1.05) # Parameters of the true polynomial model
res <- 100 # Resolution
x <- seq(0, 1, by=1/res); x2=x^2; x3=x^3;
f <- par[1] + par[2]*x + par[3]*x^2 + par[4]*x^3 # The true function
d <- f + rnorm(length(x), 0, 0.04) # Data
# Plot the true function and data
plot(f, type="l", ylim=range(d))
points(d)

# Estimate polynomial regression model
reg <- lm(d ~ x + x2 + x3)
ftheta <- reg$fitted.values
resid0 <- reg$residuals
s0 <- sd(resid0)

# Bootstrap regression
B <- 2000 # Number of bootstrap samples

ftheta1 <- array(0, c(B,length(x)))
s1 <- array(0,B)
for(i in 1:B) {
  u <- sample(resid0, size=length(resid0), replace=TRUE)
  reg1 <- lm((ftheta+u) ~ x + x2 + x3)
  ftheta1[i,] <- reg1$fitted.values
  s1[i] <- sd(reg1$residuals)
}

# Centering and scaling
meanftheta <- apply(ftheta1, 2, mean)
m <- array(0, c(B,length(x)))
for(i in 1:B) { m[i,] <- (ftheta1[i,]-meanftheta)/s1[i] }

# Central region computation
boot.cset <- curve_set(r=1:length(x), obs=ftheta+s0*t(m))
cr <- central_region(boot.cset, coverage=c(0.50, 0.80, 0.95), type="erl")

# Plotting the result
plot(cr) + ggplot2::labs(x=expression(italic(x)), y=expression(italic(f(x)))) +
  ggplot2::geom_point(data=data.frame(id=1:length(d), points=d),
                      ggplot2::aes(x=id, y=points)) + # data points
  ggplot2::geom_line(data=data.frame(id=1:length(d), points=f),
                     ggplot2::aes(x=id, y=points)) # true function

[Package GET version 1.0-2 Index]