R: Statistical Hypothesis Tests for Nested Models

GeoTests {GeoModels}

R Documentation

Statistical Hypothesis Tests for Nested Models

Description

The function performs statistical hypothesis tests for nested models based on composite or standard likelihood versions of Wald-type and Wilks-type (likelihood ratio) statistics.

Usage

GeoTests(object1, object2, ..., statistic)

Arguments

`object1`	An object of class `GeoFit`.
`object2`	An object of class `GeoFit` that is a nested model within `object1`.
`...`	Further successively nested objects.
`statistic`	String; the name of the statistic used within the hypothesis test (see Details).

Details

The implemented hypothesis tests for nested models are based on the following statistics:

Wald-type (Wald);
Likelihood ratio or Wilks-type (Wilks under standard likelihood); For composite likelihood available variants of the basic version are:
- Rotnitzky and Jewell adjustment (WilksRJ);
- Satterhwaite adjustment (WilksS);
- Chandler and Bate adjustment (WilksCB);
- Pace, Salvan and Sartori adjustment (WilksPSS);

More specifically, consider an p-dimensional random vector \mathbf{Y} with probability density function f(\mathbf{y};\mathbf{\theta}), where \mathbf{\theta} \in \Theta is a q-dimensional vector of parameters. Suppose that \mathbf{\theta}=(\mathbf{\psi},\mathbf{\tau}) can be partitioned in a q'-dimensional subvector \psi and q''-dimensional subvector \tau. Assume also to be interested in testing the specific values of the vector \psi. Then, one can use some statistical hypothesis tests for testing the null hypothesis H_0: \psi=\psi_0 against the alternative H_1: \psi \neq \psi_0. Composite likelihood versions of 'Wald' statistis have the usual asymptotic chi-square distribution with q' degree of freedom. The Wald-type statistic is

W=(\hat{\psi}-\psi_0)^T (G^{\psi \psi})^{-1}(\hat{\theta})(\hat{\psi}-\psi_0),

where G_{\psi \psi} is the q' \times q' submatrix of the Godambe or Fisher information pertaining to \psi and \hat{\theta} is the maximum likelihood estimator from the full model. This statistic can be called from the routine GeoTests assigning at the argument statistic the value: Wald.

Alternatively to the Wald-type statistic one can use the composite version of the Wilks-type or likelihood ratio statistic, given by

W=2[C \ell(\hat{\mathbf{\theta}};\mathbf{y}) - C \ell\{\mathbf{\psi}_0, \hat{\mathbf{\tau}}(\mathbf{\psi}_0);\mathbf{y}\}].

In the composite likelihood case, the asymptotic distribution of the composite likelihood ratio statistic is given by

W \dot{\sim} \sum_{i} \lambda_i \chi^2,

for i=1,\ldots,q', where \chi^2_i are q' iid copies of a chi-square one random variable and \lambda_1,\ldots,\lambda_{q'} are the eigenvalues of the matrix (H^{\psi \psi})^{-1} G^{\psi \psi}. There exist several adjustments to the composite likelihood ratio statistic in order to get an approximated \chi^2_{q'}. For example, Rotnitzky and Jewell (1990) proposed the adjustment W'= W / \bar{\lambda} where \bar{\lambda} is the average of the eigenvalues \lambda_i. This statistic can be called within the routine by the value: WilksRJ. A better solution is proposed by Satterhwaite (1946) defining W''=\nu W / (q' \bar{\lambda}), where \nu=(\sum_i \lambda)^2 / \sum_i \lambda^2_i for i=1\ldots,q', is the effective number of the degree of freedom. Note that in this case the distribution of the likelihood ratio statistic is a chi-square random variable with \nu degree of freedom. This statistic can be called from the routine assigning the value: WilksS. For the adjustments suggested by Chandler and Bate (2007) we refere to the article (see References). This versions can be called from the routine assigning respectively the values: WilksCB.

Value

An object of class c("data.frame"). The object contain a table with the results of the tested models. The rows represent the responses for each model and the columns the following results:

`Num.Par`	The number of the model's parameters.
`Diff.Par`	The difference between the number of parameters of the model in the previous row and those in the actual row.
`Df`	The effective number of degree of freedom of the chi-square distribution.
`Chisq`	The observed value of the statistic.
`Pr(>chisq)`	The p-value of the quantile `Chisq` computed using a chi-squared distribution with `Df` degrees of freedom.

Author(s)

Moreno Bevilacqua, moreno.bevilacqua89@gmail.com,https://sites.google.com/view/moreno-bevilacqua/home, Víctor Morales Oñate, victor.morales@uv.cl, https://sites.google.com/site/moralesonatevictor/, Christian", Caamaño-Carrillo, chcaaman@ubiobio.cl,https://www.researchgate.net/profile/Christian-Caamano

References

Chandler, R. E., and Bate, S. (2007). Inference for Clustered Data Using the Independence log-likelihood. Biometrika, 94, 167–183.

Rotnitzky, A. and Jewell, N. P. (1990). Hypothesis Testing of Regression Parameters in Semiparametric Generalized Linear Models for Cluster Correlated Data. Biometrika, 77, 485–497.

Satterthwaite, F. E. (1946). An Approximate Distribution of Estimates of Variance Components. Biometrics Bulletin, 2, 110–114.

Varin, C., Reid, N. and Firth, D. (2011). An Overview of Composite Likelihood Methods. Statistica Sinica, 21, 5–42.

Examples



library(GeoModels)

################################################################
###
### Example 1. Test on the parameter
### of a regression model using conditional composite likelihood
###
###############################################################
set.seed(342)
model="Gaussian" 
# Define the spatial-coordinates of the points:
NN=1500
x = runif(NN, 0, 1)
y = runif(NN, 0, 1)
coords = cbind(x,y)
# Parameters
mean=1; mean1=-1.25;  # regression parameters
nugget=0; sill=1

# matrix covariates
X=cbind(rep(1,nrow(coords)),runif(nrow(coords)))

# model correlation 
corrmodel="Wend0"
power2=4;c_supp=0.15

# simulation
param=list(power2=power2,mean=mean,mean1=mean1,
              sill=sill,scale=c_supp,nugget=nugget)
data = GeoSim(coordx=coords, corrmodel=corrmodel,model=model, param=param,X=X)$data

I=Inf
##### H1: (regressian mean)
fixed=list(nugget=nugget,power2=power2)
start=list(mean=mean,mean1=mean1,scale=c_supp,sill=sill)

lower=list(mean=-I,mean1=-I,scale=0,sill=0)
upper=list(mean=I,mean1=I,scale=I,sill=I)
# Maximum pairwise composite-likelihood fitting of the RF:
fitH1 = GeoFit(data=data,coordx=coords,corrmodel=corrmodel, model=model,
              likelihood="Conditional",type="Pairwise",sensitivity=TRUE,
                   lower=lower,upper=upper,neighb=3,
                   optimizer="nlminb",X=X,
                    start=start,fixed=fixed)

unlist(fitH1$param)

##### H0: (constant mean i.e beta1=0)
fixed=list(power2=power2,nugget=nugget,mean1=0)
start=list(mean=mean,scale=c_supp,sill=sill)
lower0=list(mean=-I,scale=0,sill=0)
upper0=list(mean=I,scale=I,sill=I)
# Maximum pairwise composite-likelihood fitting of the RF:
fitH0 = GeoFit(data=data,coordx=coords,corrmodel=corrmodel, model=model,
            likelihood="Conditional",type="Pairwise",sensitivity=TRUE,
                      lower=lower0,upper=upper0,neighb=3,
                   optimizer="nlminb",X=X,
                    start=start,fixed=fixed)
unlist(fitH0$param)

# not run
##fitH1=GeoVarestbootstrap(fitH1,K=100,optimizer="nlminb",
##                     lower=lower, upper=upper)
##fitH0=GeoVarestbootstrap(fitH0,K=100,optimizer="nlminb",
##                     lower=lower0, upper=upper0)

#  Composite likelihood Wald and ratio statistic statistic tests
#  rejecting null  hypothesis beta1=0
##GeoTests(fitH1, fitH0 ,statistic='Wald')
##GeoTests(fitH1, fitH0 , statistic='WilksS')
##GeoTests(fitH1, fitH0 , statistic='WilksCB')




################################################################
###
### Example 2. Parametric test of Gaussianity
### using SinhAsinh random fields using standard likelihood
###
###############################################################
set.seed(99)
model="SinhAsinh" 
# Define the spatial-coordinates of the points:
NN=200
x = runif(NN, 0, 1)
y = runif(NN, 0, 1)
coords = cbind(x,y)
# Parameters
mean=0; nugget=0; sill=1
### skew and tail parameters
skew=0;tail=1   ## H0 is Gaussianity
# underlying model correlation 
corrmodel="Wend0"
power2=4;c_supp=0.2

# simulation from Gaussian  model (H0)
param=list(power2=power2,skew=skew,tail=tail,
             mean=mean,sill=sill,scale=c_supp,nugget=nugget)
data = GeoSim(coordx=coords, corrmodel=corrmodel,model=model, param=param)$data


##### H1: SinhAsinh model
fixed=list(power2=power2,nugget=nugget,mean=mean)
start=list(scale=c_supp,skew=skew,tail=tail,sill=sill)

lower=list(scale=0,skew=-I, tail=0,sill=0)
upper=list(scale=I,skew= I,tail=I,sill=I)
# Maximum pairwise composite-likelihood fitting of the RF:
fitH1 = GeoFit2(data=data,coordx=coords,corrmodel=corrmodel, model=model,
                   likelihood="Full",type="Standard",varest=TRUE,
                   lower=lower,upper=upper,
                   optimizer="nlminb",
                    start=start,fixed=fixed)

unlist(fitH1$param)

##### H0: Gaussianity (i.e tail1=1, skew=0 fixed)
fixed=list(power2=power2,nugget=nugget,mean=mean,tail=1,skew=0)
start=list(scale=c_supp,sill=sill)
lower=list(scale=0,sill=0)
upper=list(scale=2,sill=5)
# Maximum pairwise composite-likelihood fitting of the RF:
fitH0 = GeoFit(data=data,coordx=coords,corrmodel=corrmodel, model=model,
                    likelihood="Full",type="Standard",varest=TRUE,
                      lower=lower,upper=upper,
                   optimizer="nlminb",
                    start=start,fixed=fixed)

unlist(fitH0$param)

#  Standard likelihood Wald and ratio statistic statistic tests
#  not rejecting null  hypothesis tail=1,skew=0 (Gaussianity)
GeoTests(fitH1, fitH0,statistic='Wald')
GeoTests(fitH1, fitH0,statistic='Wilks')

[Package GeoModels version 2.0.4 Index]