infer_logLs {Infusion} | R Documentation |
Infer log Likelihoods using simulated distributions of summary statistics
Description
For each simulated distribution of summary statistics, infer_logLs infers a probability density function, and the density of the observed values of the summary statistics is deduced. By default, inference of each density is performed by infer_logL_by_Rmixmod, which fits a distribution of summary statistics using procedures from the Rmixmod package.
Usage
infer_logLs(object, stat.obs,
logLname = Infusion.getOption("logLname"),
verbose = list(most=interactive(),
final=FALSE),
method = Infusion.getOption("mixturing"),
nb_cores = NULL, packages = NULL, cluster_args,
...)
infer_tailp(object, refDensity, stat.obs,
tailNames=Infusion.getOption("tailNames"),
verbose=interactive(), method=NULL, cluster_args, ...)
infer_logL_by_GLMM(EDF,stat.obs,logLname,verbose)
infer_logL_by_Rmixmod(EDF,stat.obs,logLname,verbose)
infer_logL_by_mclust(EDF,stat.obs,logLname,verbose)
infer_logL_by_Hlscv.diag(EDF,stat.obs,logLname,verbose)
Arguments
object |
A list of simulated distributions (the return object of |
EDF |
An empirical distribution, with a required |
stat.obs |
Named numeric vector of observed values of summary statistics. |
logLname |
The name to be given to the log Likelihood in the return object, or the root of the latter name in case of conflict with other names in this object. |
tailNames |
Names of “positives” and “negatives” in the binomial response for the inference of tail probabilities. |
refDensity |
An object representing a reference density (such as an |
verbose |
A list as shown by the default, or simply a vector of booleans, indicating respectively
whether to display (1) some information about progress; (2) a final summary of the results after all elements of |
method |
A function for density estimation. See Description for the default behaviour and Details for the constraints on input and output of the function. |
nb_cores |
Number of cores for parallel computation. The default is |
cluster_args |
A list of arguments, passed to |
packages |
For parallel evaluation: Names of additional libraries to be loaded on the cores, necessary for evaluation of a user-defined 'method'. |
... |
further arguments passed to or from other methods (currently not used). |
Details
By default, density estimation is based on Rmixmod methods. Other available methods are not routinely used and not all of Infusion features may work with them. The function Rmixmod::mixmodCluster is called, with arguments nbCluster=seq_nbCluster(nr=nrow(data)) and mixmodGaussianModel=Infusion.getOption("mixmodGaussianModel"). If Infusion.getOption("seq_nbCluster") specifies a sequence of values, then several clusterings are computed and AIC is used to select among them.
infer_logL_by_GLMM, infer_logL_by_Rmixmod, infer_logL_by_mclust, and infer_logL_by_Hlscv.diag are examples of the method that may be provided for density estimation. Other methods may be provided with the same arguments. Their return value must include the element logL, an estimate of the log-density of stat.obs, and the element isValid with values FALSE/TRUE (or 0/1). The standard format for the return value is unlist(c(attr(EDF,"par"),logL,isValid=isValid)).
isValid is primarily intended to indicate whether the log likelihood of stat.obs inferred by a given density estimation method is suitable input for inference of the likelihood surface. isValid has two effects: it distinguishes points for which isValid is FALSE in the plot produced by plot.SLik; and, more critically, it controls the sampling of new parameter points within refine, so that points for which isValid is FALSE are less likely to be sampled.
Invalid values may for example indicate a likelihood estimated as zero (since log(0) is not suitable input), or (for density estimation methods which may infer erroneously large values when extrapolating) that stat.obs is not within the convex hull of the EDF. In user-defined methods, an invalid inferred logL should be replaced by some alternative low estimate, as all methods included in the package do.
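The constraints on a user-defined method described above can be sketched as follows. This is only an illustration under stated assumptions, not one of the package's estimators: the function name infer_logL_by_gaussian, the single-Gaussian fit, and the fallback value -1e6 are hypothetical, and the sketch assumes that the returned logL element should carry the name given by logLname. It uses base R only.

```r
# Hypothetical user-defined method: fits a single multivariate Gaussian to the
# simulated statistics. Illustration only, not a recommended density estimator.
infer_logL_by_gaussian <- function(EDF, stat.obs, logLname, verbose) {
  # Keep only the columns holding the summary statistics
  stats_mat <- as.matrix(EDF)[, names(stat.obs), drop = FALSE]
  logL <- tryCatch({
    mu <- colMeans(stats_mat)
    Sigma <- stats::cov(stats_mat)
    d <- length(mu)
    # Gaussian log-density of the observed statistics
    maha <- stats::mahalanobis(stat.obs, center = mu, cov = Sigma)
    logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
    -0.5 * (d * log(2 * pi) + logdet + maha)
  }, error = function(e) NA_real_)
  isValid <- is.finite(logL)
  # Replace an invalid estimate by a low value (assumed constant, for illustration)
  if (!isValid) logL <- -1e6
  names(logL) <- logLname  # assumption: logL is named according to logLname
  if (isTRUE(verbose)) message("logL = ", signif(logL, 4), "; isValid = ", isValid)
  # Standard return format given in Details
  unlist(c(attr(EDF, "par"), logL, isValid = isValid))
}

# Toy empirical distribution with the "par" attribute used in the return value
set.seed(123)
toy_EDF <- matrix(rnorm(400), ncol = 2, dimnames = list(NULL, c("s1", "s2")))
attr(toy_EDF, "par") <- c(mu = 0.5)
infer_logL_by_gaussian(toy_EDF, stat.obs = c(s1 = 0, s2 = 0),
                       logLname = "logL", verbose = FALSE)
```

A function of this shape could presumably be supplied through the method argument of infer_logLs, since it takes the same arguments as the methods included in the package.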
The source code of infer_logL_by_Hlscv.diag illustrates how to test whether stat.obs is within the convex hull of the EDF, using functions resetCHull and isPointInCHull (exported from the blackbox package).
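For intuition about what such a convex hull test checks, here is a minimal base-R sketch restricted to two summary statistics. It does not use resetCHull or isPointInCHull, whose interfaces are not described here; instead it relies on grDevices::chull, and the helper name is_in_hull_2d is hypothetical. The idea is that a point lies outside the hull of the simulated points exactly when appending it makes it a vertex of the new hull; points on the boundary may be classified either way.

```r
# Hypothetical helper: TRUE if `pt` lies inside the convex hull of the rows of
# `pts`. Two-dimensional case only, since grDevices::chull handles planar points.
is_in_hull_2d <- function(pts, pt) {
  all_pts <- rbind(as.matrix(pts), pt)
  hull_idx <- grDevices::chull(all_pts)
  # `pt` is the last row; it is outside the hull iff it is one of the hull vertices
  !(nrow(all_pts) %in% hull_idx)
}

# Toy simulated statistics and two candidate observed points
square <- cbind(s1 = c(0, 1, 1, 0), s2 = c(0, 0, 1, 1))
is_in_hull_2d(square, c(s1 = 0.5, s2 = 0.5))
is_in_hull_2d(square, c(s1 = 2, s2 = 2))
```

In a user-defined method, the result of such a test could be used to set isValid to FALSE when stat.obs falls outside the simulated distribution.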
infer_logL_by_Rmixmod calls Rmixmod::mixmodCluster, infer_logL_by_mclust calls mclust::densityMclust, infer_logL_by_Hlscv.diag calls ks::kde, and infer_logL_by_GLMM fits a binned distribution of summary statistics using a Poisson GLMM with autocorrelated random effects, where the binning is based on a tessellation of a volume containing the whole simulated distribution. Limited experiments so far suggest that the mixture model methods are fast and appropriate (Rmixmod, being a bit faster, is the default method); that the kernel smoothing method is more erratic and moreover requires additional input from the user, hence is not really applicable, for distributions in dimension d=4 or above; and that the GLMM method is a very good density estimator for d=2 but will challenge one's patience for d=3 and further challenge the computer's memory for d=4.
Value
For infer_logLs, a data frame containing parameter values and their log likelihoods, and additional information such as attributes providing information about the parameter names and statistics names (not detailed here). These attributes are essential for further inferences.
See Details for the required value of the methods called by infer_logLs.
See Also
See step (3) of the workflow in the Example on the main Infusion documentation page.