R: (Approximate) Probabilities of Non-Central Chi-squared...

pnchisqAppr {DPQ}

R Documentation

(Approximate) Probabilities of Non-Central Chi-squared Distribution

Description

Compute (approximate) probabilities for the non-central chi-squared distribution.

The non-central chi-squared distribution with df= n degrees of freedom and non-centrality parameter ncp = \lambda has density

f(x) = f_{n,\lambda}(x) = e^{-\lambda / 2} \sum_{r=0}^\infty \frac{(\lambda/2)^r}{r!}\, f_{n + 2r}(x)

for x \ge 0; for more, see R's help page for pchisq.

R's own historical and current versions, but with more tuning parameters;

Historical relatively simple approximations listed in Johnson, Kotz, and Balakrishnan (1995):

Patnaik(1949)'s approximation to the non-central via central chi-squared. Is also the formula 26.4.27 in Abramowitz & Stegun, p.942. Johnson et al mention that the approximation error is O(1/\sqrt(\lambda)) for \lambda \to \infty.
Pearson(1959) is using 3 moments instead of 2 as Patnaik (to approximate via a central chi-squared), and therefore better than Patnaik for the right tail; further (in Johnson et al.), the approximation error is O(1/\lambda) for \lambda \to \infty.
Abdel-Aty(1954)'s “first approximation” based on Wilson-Hilferty via Gaussian (pnorm) probabilities, is partly wrongly cited in Johnson et al., p.463, eq.(29.61a).
Bol'shev and Kuznetzov (1963) concentrate on the case of small ncp \lambda and provide an “approximation” via central chi-squared with the same degrees of freedom df, but a modified q (‘x’); the approximation has error O(\lambda^3) for \lambda \to 0 and is from Johnson et al., p.465, eq.(29.62) and (29.63).
Sankaran(1959, 1963) proposes several further approximations base on Gaussian probabilities, according to Johnson et al., p.463. pnchisqSankaran_d() implements its formula (29.61d).

pnchisq():: an R implementation of R's own C pnchisq_raw(), but almost only up to Feb.27, 2004, long before the log.p=TRUE addition there, including logspace arithmetic in April 2014, its finish on 2015-09-01. Currently for historical reference only.
pnchisqV():: a Vectorize()d pnchisq.
pnchisqRC():: R's C implementation as of Aug.2019; but with many more options. Currently extreme cases tend to hang on Winbuilder (?)
pnchisqIT:: ....
pnchisqTerms:: ....
pnchisqT93:: pure R implementations of approximations when both q and ncp are large, by Temme(1993), from Johnson et al., p.467, formulas (29.71 a), and (29.71 b), using auxiliary functions pnchisqT93a() and pnchisqT93b() respectively, with adapted formulas for the log.p=TRUE cases.
pnchisq_ss():: ....
ss:: ....
ss2:: ....
ss2.:: ....

Usage

pnchisq          (q, df, ncp = 0, lower.tail = TRUE, 
                  cutOffncp = 80, itSimple = 110, errmax = 1e-12, reltol = 1e-11,
                  maxit = 10* 10000, verbose = 0, xLrg.sigma = 5)
pnchisqV(x, ..., verbose = 0)

pnchisqRC        (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE,
                  no2nd.call = FALSE,
                  cutOffncp = 80, small.ncp.logspace = small.ncp.logspaceR2015,
                  itSimple = 110, errmax = 1e-12,
                  reltol = 8 * .Machine$double.eps, epsS = reltol/2, maxit = 1e6,
                  verbose = FALSE)
pnchisqAbdelAty  (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqBolKuz    (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPatnaik   (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPearson   (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqSankaran_d(q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisq_ss       (x, df, ncp = 0, lower.tail = TRUE, log.p = FALSE, i.max = 10000)
pnchisqTerms     (x, df, ncp,     lower.tail = TRUE, i.max = 1000)

pnchisqT93  (q, df, ncp, lower.tail = TRUE, log.p = FALSE, use.a = q > ncp)
pnchisqT93.a(q, df, ncp, lower.tail = TRUE, log.p = FALSE)
pnchisqT93.b(q, df, ncp, lower.tail = TRUE, log.p = FALSE)

ss   (x, df, ncp, i.max = 10000, useLv = !(expMin < -lambda && 1/lambda < expMax))
ss2  (x, df, ncp, i.max = 10000, eps = .Machine$double.eps)
ss2. (q, df, ncp = 0, errmax = 1e-12, reltol = 2 * .Machine$double.eps,
      maxit = 1e+05, eps = reltol, verbose = FALSE)

Arguments

`x`	numeric vector (of ‘quantiles’, i.e., abscissa values).
`q`	number ( ‘quantile’, i.e., abscissa value.)
`df`	degrees of freedom `> 0`, maybe non-integer.
`ncp`	non-centrality parameter `\delta`; ....
`lower.tail`, `log.p`	logical, see, e.g., `pchisq()`.
`i.max`	number of terms in evaluation ...
`use.a`	`logical` vector for Temme `pnchisqT93*()` formulas, indicating to use formula ‘a’ over ‘b’. The default is as recommended in the references, but they did not take into account `log.p = TRUE` situations.
`cutOffncp`	a positive number, the cutoff value for `ncp`...
`itSimple`	...
`errmax`	absolute error tolerance.
`reltol`	convergence tolerance for relative error.
`maxit`	maximal number of iterations.
`xLrg.sigma`	positive number ...
`no2nd.call`	logical indicating if a 2nd call is made to the internal function ....
`small.ncp.logspace`	logical vector or `function`, indicating if the logspace computations for “small” `ncp` (defined to fulfill `ncp < cutOffncp` !).
`epsS`	small positive number, the convergence tolerance of the ‘simple’ iterations...
`verbose`	logical or integer specifying if or how much the algorithm progress should be monitored.
`...`	further arguments passed from `pnchisqV()` to `pnchisq()`.
`useLv`	`logical` indicating if logarithmic scale should be used for `\lambda` computations.
`eps`	convergence tolerance, a positive number.

Details

pnchisq_ss(): uses si <- ss(x, df, ..) to get the series terms, and returns 2*dchisq(x, df = df +2) * sum(si$s).
ss(): computes the terms needed for the expansion used in pnchisq_ss().
ss2(): computes some simple “statistics” about ss(..).

Value

ss(): returns a list with 3 components

`s`	the series
`i1`	location (in `s[]`) of the first change from 0 to positive.
`max`	(first) location of the maximal value in the series (i.e., `which.max(s)`).

Author(s)

Martin Maechler, from May 1999; starting from a post to the S-news mailing list by Ranjan Maitra (@ math.umbc.edu) who showed a version of our pchisqAppr.0() thanking Jim Stapleton for providing it.

References

Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distributions Vol 2, 2nd ed.; Wiley; chapter 29 Noncentral \chi^2-Distributions; notably Section 8 Approximations, p.461 ff.

Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover. https://en.wikipedia.org/wiki/Abramowitz_and_Stegun

Examples

## set of quantiles to use :
qq <- c(.001, .005, .01, .05, (1:9)/10, 2^seq(0, 10, by= 0.5))
## Take "all interesting" pchisq-approximation from our pkg :
pkg <- "package:DPQ"
pnchNms <- c(paste0("pchisq", c("V", "W", "W.", "W.R")),
             ls(pkg, pattern = "^pnchisq"))
pnchNms <- pnchNms[!grepl("Terms$", pnchNms)]
pnchF <- sapply(pnchNms, get, envir = as.environment(pkg))
str(pnchF)
ncps <- c(0, 1/8, 1/2)
pnchR <- as.list(setNames(ncps, paste("ncp",ncps, sep="=")))
for(i.n in seq_along(ncps)) {
  ncp <- ncps[i.n]
  pnF <- if(ncp == 0) pnchF[!grepl("chisqT93", pnchNms)] else pnchF
  pnchR[[i.n]] <- sapply(pnF, function(F)
            Vectorize(F, names(formals(F))[[1]])(qq, df = 3, ncp=ncp))
}
str(pnchR, max=2)
		 

## A case where the non-central P[] should be improved :
## First, the central P[] which is close to exact -- choosing df=2 allows
## truly exact values: chi^2 = Exp(1) !
opal <- palette()
palette(c("black", "red", "green3", "blue", "cyan", "magenta", "gold3", "gray44"))
cR  <- curve(pchisq   (x, df=2,        lower.tail=FALSE, log.p=TRUE), 0, 4000, n=2001)
cRC <- curve(pnchisqRC(x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
             add=TRUE, col=adjustcolor(2,1/2), lwd=3, lty=2, n=2001)
cR0 <- curve(pchisq   (x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
             add=TRUE, col=adjustcolor(3,1/2), lwd=4,        n=2001)
## smart "named list" constructur :
list_ <- function(...)
   `names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
JKBfn <-list_(pnchisqPatnaik,
              pnchisqPearson,
              pnchisqAbdelAty,
              pnchisqBolKuz,
              pnchisqSankaran_d)
cl. <- setNames(adjustcolor(3+seq_along(JKBfn), 1/2), names(JKBfn))
lw. <- setNames(2+seq_along(JKBfn),                   names(JKBfn))
cR.JKB <- sapply(names(JKBfn), function(nmf) {
  curve(JKBfn[[nmf]](x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
        add=TRUE, col=cl.[[nmf]], lwd=lw.[[nmf]], lty=lw.[[nmf]], n=2001)
})
legend("bottomleft", c("pchisq", "pchisq.ncp=0", "pnchisqRC", names(JKBfn)),
       col=c(palette()[1], adjustcolor(2:3,1/2), cl.),
       lwd=c(1,3,4, lw.), lty=c(1,2,1, lw.))
palette(opal)# revert

all.equal(cRC, cR0, tol = 1e-15) # TRUE [for now]
## the problematic "jump" :
as.data.frame(cRC)[744:750,]
if(.Platform$OS.type == "unix")
  ## verbose=TRUE  may reveal which branches of the algorithm are taken:
  pnchisqRC(1500, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE, verbose=TRUE) #
  ## |-->  -Inf currently

## The *two*  principal cases (both lower.tail = {TRUE,FALSE} !), where
##  "2nd call"  happens *and* is currently beneficial :
dfs <- c(1:2, 5, 10, 20)
pL. <- pnchisqRC(.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE, verbose = TRUE)
pR. <- pnchisqRC(   100, df=dfs, ncp=0, log.p=TRUE,                   verbose = TRUE)
## R's own non-central version (specifying 'ncp'):
pL0 <- pchisq   (.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE)
pR0 <- pchisq   (   100, df=dfs, ncp=0, log.p=TRUE)
## R's *central* version, i.e., *not* specifying 'ncp' :
pL  <- pchisq   (.00001, df=dfs,        log.p=TRUE, lower.tail=FALSE)
pR  <- pchisq   (   100, df=dfs,        log.p=TRUE)
cbind(pL., pL, relEc = signif(1-pL./pL, 3), relE0 = signif(1-pL./pL0, 3))
cbind(pR., pR, relEc = signif(1-pR./pR, 3), relE0 = signif(1-pR./pR0, 3))

[Package DPQ version 0.5-8 Index]