testYourModel {gofedf}

R Documentation

Apply the Goodness of Fit Test Based on Empirical Distribution Function to Any Likelihood Model.

Description

This function applies the goodness-of-fit test based on the empirical distribution function (EDF). The required inputs depend on whether the model involves parameter estimation. If the model is fully specified and no parameters are estimated, the function requires the sample as a numeric vector and the probability integral transform (PIT) values of the sample, also as a numeric vector. If parameters are estimated, the function additionally requires the score as a matrix with n rows and p columns, where n is the sample size and p is the number of estimated parameters. The function checks whether the score is zero at the estimated parameter, which is assumed to be the maximum likelihood estimate (MLE).
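As a minimal sketch of the no-estimation case, assume the hypothesized model is a standard normal distribution with fully known parameters, so the PIT values are simply the model CDF evaluated at the sample. The names `x_known` and `pit_known` are illustrative and not part of the package; the call to `testYourModel` is guarded so the snippet still runs when gofedf is not installed.

```r
# Assumption: the model is N(0, 1) with no estimated parameters.
set.seed(1)
x_known <- rnorm(30)

# PIT values: the hypothesized CDF evaluated at each observation.
pit_known <- pnorm(x_known, mean = 0, sd = 1)

# Basic sanity checks on the inputs expected by testYourModel().
stopifnot(length(pit_known) == length(x_known))
stopifnot(all(pit_known > 0 & pit_known < 1))

# With no parameter estimation, 'score' keeps its default value NULL.
if (requireNamespace("gofedf", quietly = TRUE)) {
  res_known <- gofedf::testYourModel(x = x_known, pit = pit_known)
  print(res_known)
}
```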

Usage

testYourModel(
  x,
  pit,
  score = NULL,
  ngrid = length(x),
  gridpit = TRUE,
  precision = 1e-09,
  method = "cvm"
)

Arguments

`x`	a non-empty numeric vector of sample data.
`pit`	the probability integral transform (PIT) values of the sample, given as a numeric vector of the same length as `x`.
`score`	the score matrix. The default value is `NULL`, which corresponds to the case with no parameter estimation. If parameters are estimated, `score` must be a matrix with n rows and p columns, where n is the sample size and p is the number of estimated parameters.
`ngrid`	the number of equally spaced points to discretize the (0,1) interval for computing the covariance function.
`gridpit`	logical. If `TRUE` (the default value), the parameter `ngrid` is ignored and the (0,1) interval is divided at the probability integral transform (PIT) values obtained from the sample. If `FALSE`, the interval is divided into `ngrid` equally spaced points for computing the covariance function.
`precision`	the tolerance used to check that the score evaluated at the MLE is zero. The theory behind the goodness-of-fit test based on the empirical distribution function (EDF) assumes that the MLE is a root of the derivative of the log-likelihood function. The default value is 1e-9. A warning is issued if the score evaluated at the MLE is not close enough to zero.
`method`	a character string indicating which goodness-of-fit statistic is to be computed. The default value is 'cvm' for the Cramér-von Mises statistic. Other options are 'ad' for the Anderson-Darling statistic and 'both' to compute both statistics.
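The shape of the `score` argument and the role of `precision` can be illustrated with a small sketch. It assumes a normal model with estimated mean and standard deviation, so p = 2; the names `x_est`, `score_mat` and `pit_est` are illustrative. Comparing the column sums of the score matrix against the tolerance is an assumption about what "score is zero at the MLE" means in practice, not a reproduction of the package's internal check.

```r
# Assumption: normal model with both mean and sd estimated by maximum likelihood.
set.seed(2)
x_est <- rnorm(40, mean = 5, sd = 2)

mu_hat    <- mean(x_est)
sigma_hat <- sqrt(mean((x_est - mu_hat)^2))  # MLE of sd uses divisor n, not n - 1

# Score matrix: one row per observation, one column per estimated parameter.
score_mat <- cbind(
  mu    = (x_est - mu_hat) / sigma_hat^2,
  sigma = -1 / sigma_hat + (x_est - mu_hat)^2 / sigma_hat^3
)

# PIT values under the fitted model.
pit_est <- pnorm(x_est, mean = mu_hat, sd = sigma_hat)

# At the MLE, the scores summed over observations should vanish numerically.
score_sums <- colSums(score_mat)
score_is_zero <- max(abs(score_sums)) < 1e-9

if (requireNamespace("gofedf", quietly = TRUE)) {
  res_est <- gofedf::testYourModel(x = x_est, pit = pit_est, score = score_mat)
  print(res_est)
}
```

Using `sd(x_est)` (divisor n - 1) instead of the MLE would leave the second column sum away from zero, which is the kind of situation the `precision` check is meant to flag.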

Value

A list with two components:

Statistic: the value of the goodness-of-fit statistic.
p-value: the approximate p-value for the goodness-of-fit test based on the empirical distribution function. If method = 'cvm' or method = 'ad', each component is a single numeric value. If method = 'both', each component is a numeric vector with two elements, one for each statistic.
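For orientation, the two statistics can be computed directly from PIT values with the standard textbook formulas. The helper names `cvm_stat` and `ad_stat` are hypothetical and are not functions of gofedf; this sketch covers only the statistics, not the p-values, which depend on the limiting distribution used by the package.

```r
# Cramer-von Mises statistic from PIT values u:
# W^2 = 1/(12 n) + sum_i (u_(i) - (2 i - 1) / (2 n))^2
cvm_stat <- function(u) {
  n <- length(u)
  u <- sort(u)
  i <- seq_len(n)
  1 / (12 * n) + sum((u - (2 * i - 1) / (2 * n))^2)
}

# Anderson-Darling statistic from PIT values u:
# A^2 = -n - (1/n) sum_i (2 i - 1) [log u_(i) + log(1 - u_(n + 1 - i))]
ad_stat <- function(u) {
  n <- length(u)
  u <- sort(u)
  i <- seq_len(n)
  -n - mean((2 * i - 1) * (log(u) + log(1 - rev(u))))
}

# Usage on PIT values from a correctly specified, fully known model.
set.seed(3)
u_demo <- pnorm(rnorm(25))
c(cvm = cvm_stat(u_demo), ad = ad_stat(u_demo))
```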

Examples

# Example: Inverse Gaussian (IG) distribution with weights

# Set the seed to reproduce example.
set.seed(123)

# Set the sample size
n <- 50

# Assign weights
weights <- rep(1.5,n)

# Set mean and shape parameters for IG distribution.
mio        <- 2
lambda     <- 2

# Generate a random sample from IG distribution with weighted shape.
sim_data <- statmod::rinvgauss(n, mean = mio, shape = lambda * weights)

# Compute MLE of parameters, score matrix, and pit values.
theta_hat   <- inversegaussianMLE(obs = sim_data, w = weights)
ScoreMatrix <- inversegaussianScore(obs = sim_data, w = weights, mle = theta_hat)
pitvalues   <- inversegaussianPIT(obs = sim_data, w = weights, mle = theta_hat)

# Apply the goodness-of-fit test.
testYourModel(x = sim_data, pit = pitvalues, score = ScoreMatrix)

[Package gofedf version 0.1.0 Index]