R: Vasicek-Song goodness-of-fit test for various distributions

vs.test {vsgoftest}

R Documentation

Vasicek-Song goodness-of-fit test for various distributions

Description

Performs the Vasicek-Song goodness-of-fit test of a numeric sample against the specified distribution or family of distributions.

Usage

vs.test(x, densfun, param = NULL, 
        simulate.p.value = NULL, B = 5000,
        delta = NULL, extend = FALSE, relax = FALSE)

Arguments

`x`	(`numeric, vector`) the numeric sample.
`densfun`	A character string specifying the fitted distribution. Possible values are `"dunif"`, `"dnorm"`, `"dlnorm"`, `"dexp"`, `"dgamma"`, `"dweibull"`, `"dpareto"`, `"df"`, `"dlaplace"` and `"dbeta"`.
`param`	(`numeric, vector`) specifies the parameter(s) of the fitted distribution. If `NULL` (the default), a goodness-of-fit test for the parametric family of distributions specified by `densfun` is performed.
`simulate.p.value`	(`logical, single value`) if `TRUE`, the p-value is estimated by Monte Carlo simulation; if `FALSE`, an asymptotic p-value is computed. If `NULL` (the default), the p-value is simulated if the sample size is smaller than 80; otherwise, an asymptotic p-value is computed.
`B`	(`numeric, single value`) the number of replicates used in the Monte Carlo estimation of the p-value.
`delta`	(`numeric, single value`) a numeric value smaller than `1/3` specifying the upper bound `n^{1/3-\delta}` for the window size, where `n` is the sample size. The default depends on `densfun`; see the package vignettes for details.
`extend`	(`logical, single value`) if `FALSE` (the default), the upper bound for the window size is `n^{1/3-\delta}`; if `TRUE`, the bound is `n/2`.
`relax`	(`logical, single value`) if `TRUE`, the constraint `V_{mn} \leq -\frac{1}{n} \sum_{i=1}^n \log p_0(X_i, \widehat{\theta}_n)` is not enforced when computing the optimal window; see Details. Default is `FALSE`.

Details

The test statistic is

I_{mn}=-V_{mn}-\frac{1}{n}\sum_{i=1}^{n}\log p_{0}(X_{i},\theta),

where V_{mn} is the Vasicek estimator of Shannon entropy computed from the numeric sample x with window size m, and p_{0}(x,\theta) is the density function of the distribution densfun to be tested. Here \theta is the parameter specified by the null for a simple hypothesis, or its maximum likelihood estimate for a composite null hypothesis (param = NULL); see Song (2002), Girardin and Lequesne (2017) and Lequesne and Regnault (2020).
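
As an illustration of the formula above, the following base-R sketch computes a Vasicek entropy estimate and the resulting statistic for a simple normal null. It is not the package's implementation: the helper names `vasicek_entropy` and `vs_statistic_norm` and the fixed window `m = 3` are assumptions made for the example. The estimator uses the usual spacing form from Vasicek (1976), with order statistics clamped at the sample minimum and maximum.

```r
# Spacing-based estimator of Shannon entropy with window m (Vasicek, 1976).
# Order statistics outside 1..n are clamped to the sample minimum and maximum.
vasicek_entropy <- function(x, m) {
  n <- length(x)
  xs <- sort(x)
  upper <- xs[pmin(seq_len(n) + m, n)]  # X_(i+m), clamped at X_(n)
  lower <- xs[pmax(seq_len(n) - m, 1)]  # X_(i-m), clamped at X_(1)
  mean(log(n / (2 * m) * (upper - lower)))
}

# I_mn for a simple normal null with known mean mu and standard deviation sigma
# (hypothetical helper, for illustration only).
vs_statistic_norm <- function(x, m, mu, sigma) {
  -vasicek_entropy(x, m) - mean(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}

set.seed(1)
samp <- rnorm(50, 2, 3)
stat <- vs_statistic_norm(samp, m = 3, mu = 2, sigma = 3)
stat
```

For a composite null, the same computation would use the maximum likelihood estimates of the parameters in place of `mu` and `sigma`.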

An optimal window size m is automatically computed; see Song (2002).
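
The window selection can be sketched as follows, under several assumptions that are not stated on this page: the window is taken to be the smallest `m` maximising V_{mn} among candidates up to the bound, the bound is read as `n^{1/3-\delta}` and rounded down, and `delta = 0.1` is an arbitrary illustrative value rather than a package default. The names `select_window` and `neg_loglik` are hypothetical.

```r
# Spacing-based Vasicek entropy estimator (same form as in Vasicek, 1976).
vasicek_entropy <- function(x, m) {
  n <- length(x)
  xs <- sort(x)
  upper <- xs[pmin(seq_len(n) + m, n)]
  lower <- xs[pmax(seq_len(n) - m, 1)]
  mean(log(n / (2 * m) * (upper - lower)))
}

# Hypothetical window selection: smallest m that maximises V_mn, optionally
# subject to V_mn <= neg_loglik, where neg_loglik = -(1/n) * sum(log p0(X_i)).
select_window <- function(x, neg_loglik, delta = 0.1,
                          extend = FALSE, relax = FALSE) {
  n <- length(x)
  # Assumption: integer candidates below the bound, with at least m = 1.
  m_max <- if (extend) (n - 1) %/% 2 else max(1, floor(n^(1 / 3 - delta)))
  candidates <- seq_len(m_max)
  v <- vapply(candidates, function(m) vasicek_entropy(x, m), numeric(1))
  ok <- if (relax) rep(TRUE, length(v)) else v <= neg_loglik
  if (!any(ok)) return(NA_integer_)      # no window satisfies the constraint
  candidates[ok][which.max(v[ok])]       # which.max returns the first maximum
}

set.seed(1)
samp <- rnorm(50, 2, 3)
neg_loglik <- -mean(dnorm(samp, mean = 2, sd = 3, log = TRUE))
m_hat <- select_window(samp, neg_loglik)
m_relaxed <- select_window(samp, neg_loglik, relax = TRUE)
```

With the constraint enforced, no candidate may qualify, in which case this sketch returns `NA`; setting `relax = TRUE` always yields a window.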

By default, the p-value is estimated by Monte Carlo simulation if the sample size is smaller than 80. Otherwise, the asymptotic distribution of the test statistic is used; this approximation may be inaccurate for small samples; see Lequesne and Regnault (2020).
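
A Monte Carlo p-value of this kind can be sketched in base R as below. This is a simplified illustration, not the package's procedure: it assumes a simple normal null, keeps the window `m` fixed across replicates, and estimates the p-value as the proportion of simulated statistics at least as large as the observed one. The helper names are hypothetical.

```r
# Spacing-based Vasicek entropy estimator (Vasicek, 1976).
vasicek_entropy <- function(x, m) {
  n <- length(x)
  xs <- sort(x)
  upper <- xs[pmin(seq_len(n) + m, n)]
  lower <- xs[pmax(seq_len(n) - m, 1)]
  mean(log(n / (2 * m) * (upper - lower)))
}

# Test statistic I_mn for a simple normal null.
vs_stat <- function(x, m, mu, sigma) {
  -vasicek_entropy(x, m) - mean(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}

# Hypothetical Monte Carlo p-value: simulate B samples of the same size
# under the null and compare their statistics with the observed one.
mc_p_value <- function(x, m, mu, sigma, B = 500) {
  observed <- vs_stat(x, m, mu, sigma)
  simulated <- replicate(B, vs_stat(rnorm(length(x), mu, sigma), m, mu, sigma))
  mean(simulated >= observed)  # large values of I_mn count against the null
}

set.seed(1)
samp <- rnorm(50, 2, 3)
p_mc <- mc_p_value(samp, m = 2, mu = 2, sigma = 3, B = 500)
p_mc
```

A larger `B` reduces the simulation error of the estimate at the cost of longer run time.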

Value

A list with class "htest" containing the following components:

`observed`	The sample under study.
`data.name`	The name (as an R object) of the sample.
`null.value`	A character string specifying the name of the fitted distribution.
`method`	The character string `"Vasicek GOF test to"` followed by the name of the fitted distribution.
`statistic`	Vasicek test statistic; see Details.
`parameter`	The optimal window size for the Vasicek test statistic.
`estimate`	Parameter(s) of the fitted distribution. If `param` is `NULL`, the parameters are estimated. If `param` is supplied by the user, it is returned as is.
`p.value`	The p-value of the test.
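
As a usage sketch, the components listed above can be read from the returned object with `$`. The example assumes the vsgoftest package is installed and skips the call otherwise; the variable name `res` is arbitrary.

```r
# Runs only if the vsgoftest package is available.
if (requireNamespace("vsgoftest", quietly = TRUE)) {
  set.seed(1)
  samp <- rnorm(50, 2, 3)
  res <- vsgoftest::vs.test(x = samp, densfun = "dnorm", B = 500)

  print(res$statistic)  # test statistic
  print(res$parameter)  # optimal window size
  print(res$estimate)   # estimated parameters (param was left as NULL)
  print(res$p.value)    # p-value of the test
}
```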

Author(s)

J. Lequesne justine.lequesne@unicaen.fr

References

Vasicek, O. A test for normality based on sample entropy. Journal of the Royal Statistical Society, Series B, 38(1), 54-59 (1976).

Song, K. S. Goodness-of-fit tests based on Kullback-Leibler discrimination information. IEEE Transactions on Information Theory, 48(5), 1103-1117 (2002).

Girardin, V., Lequesne, J. Entropy-based goodness-of-fit tests - a unifying framework. Application to DNA replication. Communications in Statistics: Theory and Methods (2017). https://doi.org/10.1080/03610926.2017.1401084

Lequesne, J., Regnault, P. vsgoftest: An R Package for Goodness-of-Fit Testing Based on Kullback-Leibler Divergence. Journal of Statistical Software, 96 (2020). doi:10.18637/jss.v096.c01

Examples

set.seed(1)
samp <- rnorm(50, 2, 3)
vs.test(x = samp, densfun = 'dnorm', param = c(2, 3), B = 500) # Simple null hypothesis
vs.test(x = samp, densfun = 'dnorm', B = 500) # Composite null hypothesis
## Using asymptotic distribution to compute the p-value
vs.test(x = samp, densfun = 'dnorm', simulate.p.value = FALSE) # Composite null hypothesis

[Package vsgoftest version 1.0-1 Index]