lik {arf} | R Documentation |
Likelihood Estimation
Description
Compute the likelihood of input data, optionally conditioned on some event(s).
Usage
lik(
  params,
  query,
  evidence = NULL,
  arf = NULL,
  oob = FALSE,
  log = TRUE,
  batch = NULL,
  parallel = TRUE
)
Arguments
params |
Circuit parameters learned via forde. |
query |
Data frame of samples, optionally comprising just a subset of training features. Likelihoods will be computed for each sample. Missing features will be marginalized out. See Details. |
evidence |
Optional set of conditioning events. This can take one of three forms: (1) a partial sample, i.e. a single row of data with some but not all columns; (2) a data frame of conditioning events, which allows for inequalities; or (3) a posterior distribution over leaves. See Details. |
arf |
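Pre-trained adversarial_rf object. Optional: per the Examples, omitting it yields identical likelihoods but slower computation. |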
oob |
Only use out-of-bag leaves for likelihood estimation? If
|
log |
Return likelihoods on log scale? Recommended to prevent underflow. |
batch |
Batch size. The default is to compute densities for all queries in one round, which is always the fastest option if memory allows. However, with large samples or many trees, it can be more memory efficient to split the data into batches. This has no impact on results. |
parallel |
Compute in parallel? Must register backend beforehand, e.g.
via |
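The batch argument trades speed for memory without changing the output. A minimal base-R sketch of that idea follows. It does not call lik: score_rows is a hypothetical stand-in for any row-wise scoring function, and the chunking logic illustrates the general technique rather than the package's internal implementation.

```r
# Hypothetical stand-in for a row-wise scoring function (NOT arf's internals):
# returns one log-density per row under independent standard normals.
score_rows <- function(df) {
  rowSums(dnorm(as.matrix(df), log = TRUE))
}

set.seed(42)
query <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))

# Score all rows in one round
all_at_once <- unname(score_rows(query))

# Score the same rows in batches of 4: split the row indices into chunks,
# score each chunk separately, then recombine in the original order
batch <- 4
idx <- split(seq_len(nrow(query)), ceiling(seq_len(nrow(query)) / batch))
batched <- unlist(
  lapply(idx, function(i) score_rows(query[i, , drop = FALSE])),
  use.names = FALSE
)

# Batching changes peak memory use, not the values
all.equal(all_at_once, batched)
```

Because each row is scored independently, only the number of rows held in memory at once differs between the two runs.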
Details
This function computes the likelihood of input data, optionally conditioned on some event(s). Queries may be partial, i.e. covering some but not all features, in which case excluded variables will be marginalized out.
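To make "marginalized out" concrete, here is a toy sketch in base R. It assumes a joint density over two independent standard normal features, which is purely illustrative and unrelated to how the learned circuit represents densities. Integrating the joint over the excluded variable recovers the density of the included one.

```r
# Toy joint density over two independent standard normal features
# (assumption: an illustrative model, not an ARF circuit).
joint <- function(x, y) dnorm(x) * dnorm(y)

x0 <- 0.5

# Marginalize y out numerically: p(x0) = integral of p(x0, y) dy
marginal <- integrate(function(y) joint(x0, y), lower = -Inf, upper = Inf)$value

# For this toy model the exact marginal is simply dnorm(x0)
c(numeric = marginal, exact = dnorm(x0))
```

A partial query plays the role of x0 here: features absent from the query are integrated (or summed) away, so they do not affect the returned likelihood.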
There are three methods for (optionally) encoding conditioning events via the evidence argument. The first is to provide a partial sample, where some but not all columns from the training data are present. The second is to provide a data frame with three columns: variable, relation, and value. This supports inequalities via relation. Alternatively, users may directly input a pre-calculated posterior distribution over leaves, with columns f_idx and wt. This may be preferable for complex constraints. See Examples.
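The three evidence formats can be sketched as plain data frames. The first two mirror the Examples. The third uses made-up leaf indices and weights, and assumes the weights are normalized to sum to one, as the term "posterior distribution" suggests.

```r
# Form 1: partial sample, a single row with some but not all training columns
evi_partial <- data.frame(Species = "setosa")

# Form 2: conditioning events with columns variable, relation, and value.
# Note that c("setosa", 0.3) coerces both entries to character.
evi_events <- data.frame(
  variable = c("Species", "Petal.Width"),
  relation = c("==", ">"),
  value = c("setosa", 0.3)
)

# Form 3: posterior over leaves with columns f_idx and wt.
# These leaf indices and weights are hypothetical, chosen only to show the layout.
evi_posterior <- data.frame(f_idx = c(3L, 17L, 42L), wt = c(0.5, 0.3, 0.2))

# Assumption: posterior weights should sum to one
stopifnot(isTRUE(all.equal(sum(evi_posterior$wt), 1)))

str(evi_events)
```

Any of these objects would be passed as the evidence argument. In practice, the posterior form would be computed from a fitted model rather than typed by hand.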
Value
A vector of likelihoods, optionally on the log scale.
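The reason the log scale is recommended can be seen in a short base-R sketch that does not involve the package: multiplying many small densities underflows, while summing their logarithms does not. The standard normal sample below is an arbitrary stand-in for per-sample likelihoods.

```r
set.seed(1)
x <- rnorm(2000)

# Raw scale: the product of 2000 densities (each below 0.4) underflows to 0
raw_joint <- prod(dnorm(x))
raw_joint

# Log scale: the sum of log-densities stays finite
log_joint <- sum(dnorm(x, log = TRUE))
log_joint
```

The same logic applies to the returned vector: with log = TRUE, a joint log-likelihood over independent samples is obtained by summing the entries rather than multiplying raw likelihoods.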
References
Watson, D., Blesch, K., Kapar, J., & Wright, M. (2023). Adversarial random forests for density estimation and generative modeling. In Proceedings of the 26th International Conference on Artificial Intelligence and Statistics, pp. 5357-5375.
Examples
# Estimate average log-likelihood
arf <- adversarial_rf(iris)
psi <- forde(arf, iris)
ll <- lik(psi, iris, arf = arf, log = TRUE)
mean(ll)
# Identical but slower
ll <- lik(psi, iris, log = TRUE)
mean(ll)
# Partial evidence query
lik(psi, query = iris[1, 1:3])
# Condition on Species = "setosa"
evi <- data.frame(Species = "setosa")
lik(psi, query = iris[1, 1:3], evidence = evi)
# Condition on Species = "setosa" and Petal.Width > 0.3
evi <- data.frame(variable = c("Species", "Petal.Width"),
                  relation = c("==", ">"),
                  value = c("setosa", 0.3))
lik(psi, query = iris[1, 1:3], evidence = evi)