get_LRboot {Infusion} | R Documentation |
Summary likelihood ratio tests
Description
get_LRboot
provides a fast approximation to bootstrap distribution of likelihood ratio statistic.
The bootstrap distribution of the likelihood ratio (LR) statistic may be used to correct the tests based on its asymptotic chi-square distribution. However, the standard bootstrap involves resimulating the data-generating process, given the ML estimates on the original data. This function implements a fast approximation avoiding such simulation, instead drawing from the inferred distribution of (projected, if relevant) summary statistics, again given the maximum (summary-)likelihood estimates.
SLRT
computes likelihood ratio tests based on the summary-likelihood surface and optionally on get_LRboot
results. Several correction of the basic likelihood ratio test may be reported, some more speculative than others.
Usage
SLRT(object, h0, nsim=0L, BGP=NULL, ...)
get_LRboot(object, h0_pars = NULL, nsim = 100L, reset = TRUE, BGP=object$MSL$MSLE, ...)
Arguments
object |
an |
h0 |
Numeric named vector of tested parameter values. |
h0_pars |
either |
nsim |
Integer: number of bootstrap replicates. Values lower than the default are not recommended. Note that this will be ignored if the distribution has previously been simulated and |
reset |
Boolean: Whether to use any previously computed distribution (see Details) or not. |
BGP |
Named numeric vector of “Bootstrap-Generating Parameters”. Ideally the distribution of the LR test statistic would be pivotal and thus the parameter values under which this distribution is simulated would not matter. In practice, simulating by default its distribution under the “best” available information (the MSLE for |
... |
For |
Details
The result of calling get_LRboot
(either directly or through SLRT
) with given h0_pars
is stored in the object
(until the next refine
), and this saved result is returned by a next call to get_LRboot
with the same h0_pars
if reset=FALSE
. The default is however to recompute the distribution (reset=TRUE
).
Parallelization is possible but maybe not useful because computations for each bootstrap replicate are fast relative to parallelization overhead. It will be called when the ... arguments include an nb_cores
>1. The ... may include further arguments passed to dopar
, but among the dopar
arguments, iseed
will be ignored, and fit_env
should not be used.
A raw bootstrap p-value can be computed from the simulated distribution as (1+sum(t >= t0))/(N+1)
where t0
is the original likelihood ratio, t
the vector of bootstrap replicates and N
its length. See Davison & Hinkley (1997, p. 141) for discussion of the adjustments in this formula. However, a sometimes more economical use of the bootstrap is to provide a Bartlett correction for the likelihood ratio test in small samples. According to this correction, the mean value m
of the likelihood ratio statistic under the null hypothesis is computed (here estimated by simulation) and the original LR statistic is multiplied by n/m
where n
is the number of degrees of freedom of the test. Unfortunately, the underlying assumption that the corrected LR statistic follows the chi-square distribution does not always work well.
Value
get_LRboot
returns a numeric vector representing the simulated distribution of the LR statistic, i.e. twice the log-likelihood difference, as directly used in pchisq()
to get the p-value.
SLRT
returns a list with the following element(s), each being a one-row data frame:
basicLRT |
A data frame including values of the likelihood ratio chi2 statistic, its degrees of freedom, and the p-value; |
and, if a bootstrap was performed:
BartBootLRT |
A data frame including values of the Bartlett-corrected likelihood ratio chi2 statistic, its degrees of freedom, and its p-value; |
rawBootLRT |
A data frame including values of the likelihood ratio chi2 statistic, its degrees of freedom, and the raw bootstrap p-value; |
safeBartBootLRT |
equal to |
References
Bartlett, M. S. (1937) Properties of sufficiency and statistical tests. Proceedings of the Royal Society (London) A 160: 268-282.
Davison A.C., Hinkley D.V. (1997) Bootstrap methods and their applications. Cambridge Univ. Press, Cambridge, UK.
Examples
## See help("example_reftable") for SLRT() examples;
## continuing from there, after refine() steps for good results:
# set.seed(123);mean(get_LRboot(slik_j, nsim=500, reset=TRUE)) # close to df=2
# mean(get_LRboot(slik_j, h0_pars = "s2", nsim=500, reset=TRUE)) # close to df=1
## Not run:
### Simulation study of performance of the corrected LRTs:
## Same toy example as in help("example_reftable"):
blurred <- function(mu,s2,sample.size) {
s <- rnorm(n=sample.size,mean=mu,sd=sqrt(s2))
s <- exp(s/4)
return(c(mean=mean(s),var=var(s)))
}
## First build a largish reference table and projections to be used in all replicates
# Only the 600 first rows will be used as initial reference table for each "data"
#
set.seed(123)
#
parsp_j <- data.frame(mu=runif(6000L,min=2.8,max=5.2),
s2=runif(6000L,min=0.4,max=2.4),sample.size=40)
dsimuls <- add_reftable(,Simulate="blurred",par.grid=parsp_j,verbose=FALSE)
#
mufit <- project("mu",stats=c("mean","var"),data=dsimuls,verbose=TRUE)
s2fit <- project("s2",stats=c("mean","var"),data=dsimuls,verbose=TRUE)
dprojectors <- list(MEAN=mufit,VAR=s2fit)
dprojSimuls <- project(dsimuls,projectors=dprojectors,verbose=FALSE)
## Function for single-data analysis:
#
foo <- function(y, refine_maxit=0L, verbose=FALSE) {
dSobs <- blurred(mu=4,s2=1,sample.size=40)
## ----Inference workflow-----------------------------------------------
dprojSobs <- project(dSobs,projectors=dprojectors)
dslik <- infer_SLik_joint(dprojSimuls[1:600,],stat.obs=dprojSobs,verbose=FALSE)
dslik <- MSL(dslik, verbose=verbose, eval_RMSEs=FALSE)
if (refine_maxit) dslik <- refine(dslik, maxit=refine_maxit)
## ---- LRT-----------------------------------------------
lrt <- SLRT(dslik, h0=c(s2=1), nsim=200)
c(basic=lrt$basicLRT$p_value,raw=lrt$rawBootLRT$p_value,
bart=lrt$BartBootLRT$p_value,safe=lrt$safeBartBootLRT$p_value)
}
## Simulations using convenient parallelization interface:
#
# library(doSNOW) # optional
#
bootreps <- spaMM::dopar(matrix(1,ncol=200,nrow=1), # 200 replicates of foo()
fn=foo, fit_env=list(blurred=blurred, dprojectors=dprojectors, dprojSimuls=dprojSimuls),
control=list(.errorhandling = "pass", .packages = "Infusion"),
refine_maxit=5L,
nb_cores=parallel::detectCores()-1L, iseed=123)
#
plot(ecdf(bootreps["basic",]))
abline(0,1)
plot(ecdf(bootreps["bart",]), add=TRUE, col="blue")
plot(ecdf(bootreps["safe",]), add=TRUE, col="red")
plot(ecdf(bootreps["raw",]), add=TRUE, col="green")
#
# Note that refine() iterations are important for good performance.
# Without them, even a larger reftable of 60000 lines
# may exhibit poor results for some of the CI types.
## End(Not run)