CDplot {LPsmooth} | R Documentation |
CD-plot and Deviance test
Description
Constructs the CD-plot and computes the deviance test for exhaustive goodness-of-fit.
Usage
CDplot(data,m=4,g,par0=NULL,range=NULL,lattice=NULL,selection=TRUE,criterion="BIC",
B=1000,samplerG=NULL,h=NULL,samplerH=NULL,R=500,ylim=c(0,2),CD.plot=TRUE)
Arguments
data |
A data vector. See details. |
m |
If |
g |
Function corresponding to the parametric start. See details. |
par0 |
A vector of starting values for the parameters of |
range |
Interval corresponding to the support of the continuous data distribution. |
lattice |
Support of the discrete data distribution. |
selection |
A logical argument indicating if model selection should be performed. See details. |
criterion |
If |
B |
A positive integer corresponding to the number of bootstrap replicates. |
samplerG |
A function corresponding to the random sampler for the parametric start |
h |
Instrumental probability function. If |
samplerH |
A function corresponding to the random sampler for the instrumental probability function |
R |
A positive integer corresponding to the size of the grid of equidistant points at which the comparison densities are evaluated. The default is |
ylim |
If |
CD.plot |
A logical argument indicating if the comparison density plot should be displayed or not. The default is |
Details
The argument data contains the data whose distribution is to be tested against the postulated model specified in the argument g.
If the parametric start is fully known, it must be specified so that it takes x as its only argument. If the parametric start is not fully known, it must be specified so that it takes the arguments x and par, with par corresponding to the vector of unknown parameters. The latter are estimated numerically via maximum likelihood, and par0 specifies the initial values of the parameters to be used in the optimization.
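The two ways of specifying the parametric start can be sketched as follows. This is only an illustration: the truncated normal model, the interval c(0, 30) and the names support, g_known and g_unknown are assumptions made here, as is the choice to normalize the density over the support.

```r
# Hypothetical support of the continuous data (would be passed as `range`)
support <- c(0, 30)

# Fully known parametric start: x is the only argument.
# Assumed model: a Normal(15, 5) truncated to the support.
g_known <- function(x) {
  dnorm(x, mean = 15, sd = 5) /
    (pnorm(support[2], 15, 5) - pnorm(support[1], 15, 5))
}

# Parametric start with unknown parameters: arguments x and par,
# where par = c(mean, sd) is to be estimated by maximum likelihood.
g_unknown <- function(x, par) {
  dnorm(x, mean = par[1], sd = par[2]) /
    (pnorm(support[2], par[1], par[2]) - pnorm(support[1], par[1], par[2]))
}

# Starting values for the optimization (would be passed as `par0`)
par0 <- c(14, 4)
```

In the first case par0 would be left as NULL; in the second, g_unknown and par0 would be supplied together.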
The value m determines the smoothness of the estimated comparison density, with smaller values of m leading to smoother estimates.
If selection=TRUE, the largest coefficient estimates are selected according to either the AIC or the BIC criterion, as described in Algeri and Zhang (2020); see also Ledwina (1994) and Mukhopadhyay (2017). The resulting estimator is the one in Gajek's formulation, with the orthonormal basis given by the LP score functions (see Algeri and Zhang, 2020, and Gajek, 1986).
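To give a feel for this kind of selection, the sketch below implements one common formulation: the squared coefficient estimates are sorted in decreasing order, and the first k are kept, where k maximizes their cumulative sum minus a penalty of k*log(n)/n (BIC) or 2k/n (AIC). The function select_lp, its arguments and the rule of keeping nothing when no score is positive are assumptions made for illustration; the package's internal implementation may differ.

```r
# Illustrative selection of the largest coefficients (not the package's code).
# LP: vector of estimated coefficients; n: sample size.
select_lp <- function(LP, n, criterion = c("BIC", "AIC")) {
  criterion <- match.arg(criterion)
  penalty <- if (criterion == "BIC") log(n) / n else 2 / n

  # Order coefficients by decreasing squared magnitude
  ord <- order(LP^2, decreasing = TRUE)

  # Penalized score for keeping the k largest coefficients, k = 1, ..., m
  score <- cumsum(LP[ord]^2) - seq_along(LP) * penalty

  # Coefficients that are not selected are set to zero
  selected <- numeric(length(LP))
  if (max(score) > 0) {
    keep <- ord[seq_len(which.max(score))]
    selected[keep] <- LP[keep]
  }
  selected
}

# Hypothetical coefficient estimates for m = 4 and n = 100 observations
select_lp(c(0.5, 0.01, 0.3, 0.02), n = 100, criterion = "BIC")
```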
Value
Deviance |
Value of the deviance test statistic. |
p_value |
P-value of the deviance test. |
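A minimal sketch of how the returned values might be read is given below. It assumes that the result is a list whose components can be accessed with `$`, and it uses a hypothetical discrete setting (binomial data tested against a Poisson(10) truncated to 0, ..., 20) purely for illustration. The call is only run if the LPsmooth package is installed.

```r
# Hypothetical data and parametric start on the lattice 0, 1, ..., 20
x <- rbinom(50, size = 20, prob = 0.5)
g <- function(x) dpois(x, 10) / ppois(20, 10)

# Sampler for g: draw extra Poisson values and keep n within the support
samplerG <- function(n) {
  xx <- rpois(n * 3, 10)
  sample(xx[xx <= 20], n)
}

if (requireNamespace("LPsmooth", quietly = TRUE)) {
  out <- LPsmooth::CDplot(x, m = 4, g = g, lattice = seq(0, 20),
                          selection = FALSE, B = 10, samplerG = samplerG,
                          R = 300, CD.plot = FALSE)
  out$Deviance  # value of the deviance test statistic
  out$p_value   # p-value of the deviance test
}
```

With B = 10 bootstrap replicates the p-value is very coarse; the small value is used here only to keep the run short.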
Author(s)
Sara Algeri and Xiangyu Zhang
References
Algeri, S. and Zhang, X. (2020). Exhaustive goodness-of-fit via smoothed inference and graphics. arXiv:2005.13011.
Gajek, L. (1986). On improving density estimators which are not bona fide functions. The Annals of Statistics, 14(4):1612–1618.
Ledwina, T. (1994). Data-driven version of Neyman's smooth test of fit. Journal of the American Statistical Association, 89(427):1000–1005.
Mukhopadhyay, S. (2017). Large-scale mode identification and data-driven sciences. Electronic Journal of Statistics, 11(1):215–240.
See Also
d_hat, find_h_disc, find_h_cont.
Examples
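# Data: 50 draws from a Binomial(20, 0.5), supported on 0, 1, ..., 20
data <- rbinom(50, size = 20, prob = 0.5)
# Parametric start: Poisson(10) truncated to 0, ..., 20 (normalized over the lattice)
g <- function(x) dpois(x, 10) / ppois(20, 10)
# Sampler for g: draw extra Poisson values and keep n of those within the support
samplerG <- function(n) {
  xx <- rpois(n * 3, 10)
  sample(xx[xx <= 20], n)
}
CDplot(data, m = 4, g, par0 = NULL, range = NULL, lattice = seq(0, 20),
       selection = FALSE, criterion = "BIC", B = 10, samplerG = samplerG,
       R = 300, ylim = c(0, 2))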