qqplot_maha {nvmix}R Documentation

QQ Plot for Multivariate Normal Variance Mixtures

Description

Visual goodness-of-fit test for multivariate normal variance mixtures: Plotting squared Mahalanobis distances against their theoretical quantiles.

Usage

qqplot_maha(x, qmix, loc, scale, fitnvmix_object,
            trafo.to.normal = FALSE, test = c("KS.AD", "KS", "AD", "none"),
            boot.pars = list(B = 500, level = 0.95),
            plot = TRUE, verbose = TRUE, control = list(),
            digits = max(3, getOption("digits") - 4), plot.pars = list(), ...)

Arguments

x

(n, d)-data matrix.

qmix

see pnvmix().

loc

see pnvmix().

scale

see pnvmix().

fitnvmix_object

Optional. Object of class "fitnvmix" typically returned by fitnvmix(); if provided, x, qmix, loc and scale are ignored.

trafo.to.normal

logical. If TRUE, the underlying Mahalanobis distances are mapped to normals by a probability- quantile-transform so that the resulting QQ plot is essentially a normal QQ plot. Defaults to FALSE.

test

character specifying if (and which) GoF test shall be performed. "KS" performs a Kolmogorov-Smirnoff (see ks.test()), "AD" an Anderson-Darling test (see ad.test() from the package ADGofTest and "none" performs no test. By default, test = "KS.AD" in which case both tests are performed.

boot.pars

list with elements B (Bootstrap sample size for computing CIs; if B <= 1, no Bootstrap is performed) and level specifying the confidence level.

plot

logical specifying if the results should be plotted.

verbose

see pnvmix().

control

see get_set_param().

digits

integer specifying the number of digits of the test statistic and the p-value to be displayed.

plot.pars

list specifying plotting parameters such as logarithmic axes; see get_set_qqplot_param().

...

additional arguments (for example, parameters) passed to the underlying mixing distribution when qmix is a character string or function.

Details

If X follows a multivariate normal variance mixture, the distribution of the Mahalanobis distance D^2 = (X-\mu)^T \Sigma^{-1} (X-\mu) is a gamma mixture whose distribution function can be approximated.

The function qqplot_maha() first estimates the theoretical quantiles by calling qgammamix() and then plots those against the empirical squared Mahalanobis distances from the data in x (with \mu=loc and \Sigma=scale). Furthermore, the function computes asymptotic standard errors of the sample quantiles by using an asymptotic normality result for the order statistics which are used to plot the asymptotic CI; see Fox (2008, p. 35 – 36).+

If boot.pars$B > 1 (which is the default), the function additionally performs Bootstrap to construct a CI. Note that by default, the plot contains both the asymptotic and the Bootstrap CI.

Finally, depending on the parameter test, the function performs a univariate GoF test of the observed Mahalanobis distances as described above. The test result (i.e., the value of the statistic along with a p-value) are typically plotted on the second y-axis.

The return object of class "qqplot_maha" contains all computed values (such as p-value, test-statistics, Bootstrap CIs and more). We highlight that storing this return object makes the QQ plot quickly reproducible, as in this case, the theoretical quantiles do not need to be recomputed.

For changing plotting parameters (such as logarithmic axes or colors) via the argument plot.pars, see get_set_qqplot_param().

Value

qqplot_maha() (invisibly) returns an object of the class "qqplot_maha" for which the methods plot() and print() are defined. The return object contains, among others, the components

maha2

Sorted, squared Mahalanobis distances of the data from loc wrt to scale.

theo_quant

The theoretical quantile function evaluated at ppoints(length(maha2)).

boot_CI

(2,length(maha2)) matrix containing the Bootstrap CIs for the empirical quantiles.

asymptSE

vector of length length(maha2) with estimated, asympotic standard errors for the empirical quantiles.

Author(s)

Erik Hintz, Marius Hofert and Christiane Lemieux

References

Hintz, E., Hofert, M. and Lemieux, C. (2021), Normal variance mixtures: Distribution, density and parameter estimation. Computational Statistics and Data Analysis 157C, 107175.

Hintz, E., Hofert, M. and Lemieux, C. (2022), Multivariate Normal Variance Mixtures in R: The R Package nvmix. Journal of Statistical Software, doi:10.18637/jss.v102.i02.

McNeil, A. J., Frey, R. and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press.

See Also

fitnvmix(), get_set_qqplot_param(), rnvmix(), pnvmix(), dnvmix()

Examples

## Sample from a heavy tailed multivariate t and construct QQ plot
set.seed(1)
d <- 2
n <- 1000
df <- 3.1
rho <- 0.5
loc <- rep(0, d)
scale <- matrix(c(1, rho, rho, 1), ncol = 2)
qmix <- "inverse.gamma"
## Sample data
x <- rnvmix(n, qmix = qmix, loc = loc, scale = scale, df = df)
# Construct QQ Plot with 'true' parameters and store result object
qq1 <- qqplot_maha(x, qmix = qmix, df = df, loc = loc, scale = scale)
## ... which is an object of class "qqplot_maha" with two methods
stopifnot(class(qq1) == "qqplot_maha", "plot.qqplot_maha" %in% methods(plot),
          "print.qqplot_maha" %in% methods(print))
plot(qq1) # reproduce the plot
plot(qq1, plotpars = list(log = "xy")) # we can also plot on log-log scale

## In fact, with the 'plotpars' argument, we can change a lot of things
plot(qq1, plotpars = list(col = rep("black", 4), lty = 4:6, pch = "*",
                          plot_test = FALSE, main = "Same with smaller y limits",
                          sub = "MySub", xlab = "MyXlab", ylim = c(0, 1.5e3)))

## What about estimated parameters?
myfit <- fitStudent(x)
## We can conveniently pass 'myfit', rather than specifying 'x', 'loc', ...
set.seed(1)
qq2.1 <- qqplot_maha(fitnvmix_object = myfit, test = "AD", trafo_to_normal = TRUE)
set.seed(1)
qq2.2 <- qqplot_maha(x, qmix = "inverse.gamma", loc = myfit$loc,
                     scale = myfit$scale, df = myfit$df,
                     test = "AD", trafo_to_normal = TRUE)
stopifnot(all.equal(qq2.1$boot_CI, qq2.2$boot_CI)) # check
qq2.2 # it mentions here that the Maha distances were transformed to normal


## Another example where 'qmix' is a function, so quantiles are internally
## estimated via 'qgammamix()'
n <- 15 # small sample size to have examples run fast
## Define the quantile function of an IG(nu/2, nu/2) distribution
qmix <- function(u, df) 1 / qgamma(1 - u, shape = df/2, rate = df/2)
## Sample data
x <- rnvmix(n, qmix = qmix, df = df, loc = loc, scale = scale)
## QQ Plot of empirical quantiles vs true quantiles, all values estimated
## via RQMC:
set.seed(1)
qq3.1 <- qqplot_maha(x, qmix = qmix, loc = loc, scale = scale, df = df)
## Same could be obtained by specifying 'qmix' as string in which case
## qqplot_maha() calls qf()
set.seed(1)
qq3.2 <- qqplot_maha(x, qmix = "inverse.gamma", loc = loc, scale = scale, df = df)

[Package nvmix version 0.1-1 Index]