R: Log-likelihood for a model fitted with no latent variables

calc.logLik.lv0 {boral}

R Documentation

Log-likelihood for a model fitted with no latent variables

Description

Calculates the log-likelihood for a set of parameter estimates from a model with no latent variables. If the row effects are assumed to be random, they are integrated over using Monte Carlo integration. WARNING: As of version 1.9, this function is no longer being maintained (and probably does not work properly, if at all)!

Usage

calc.logLik.lv0(y, X = NULL, family, trial.size = 1, lv.coefs, 
	X.coefs = NULL, row.eff = "none", row.params = NULL, 
	row.ids = NULL, offset = NULL, cutoffs = NULL,
	powerparam = NULL)

Arguments

`y`	The response matrix the model was fitted to.
`X`	The covariate matrix used in the model. Defaults to `NULL`, in which case it is assumed no model matrix was used.
`family`	Either a single element, or a vector of length equal to the number of columns in the response matrix. The former assumes all columns of the response matrix come from this distribution. The latter option allows for different distributions for each column of the response matrix. Elements can be one of "binomial" (with probit link), "poisson" (with log link), "negative.binomial" (with log link), "normal" (with identity link), "lnormal" for lognormal (with log link), "tweedie" (with log link), "exponential" (with log link), "gamma" (with log link), "beta" (with logit link), "ordinal" (cumulative probit regression). Please see `about.distributions` for information on distributions available in boral overall.
`trial.size`	Either equal to a single element, or a vector of length equal to the number of columns in y. If a single element, then all columns assumed to be binomially distributed will have trial size set to this. If a vector, different trial sizes are allowed in each column of y. The argument is ignored for all columns not assumed to be binomially distributed. Defaults to 1, i.e. Bernoulli distribution.
`lv.coefs`	The response-specific intercept, coefficient estimates relating to the latent variables, and dispersion parameters from the fitted model.
`X.coefs`	The coefficient estimates relating to the covariate matrix from the fitted model. Defaults to `NULL`, in which case it is assumed there are no covariates in the model.
`row.eff`	Single element indicating whether row effects are included as fixed effects ("fixed"), random effects ("random") or not included ("none") in the fitted model. If fixed effects, then for parameter identifiability the first row effect is set to zero, which is analogous to acting as a reference level when dummy variables are used. If random effects, they are drawn from a normal distribution with mean zero and standard deviation given by `row.params`. Defaults to "none".
`row.params`	Parameters corresponding to the row effect from the fitted model. If `row.eff = "fixed"`, then these are the fixed effects and, with the default `row.ids` (one effect per row), should have length equal to the number of rows in the response matrix. If `row.eff = "random"`, then this is the standard deviation for the random effects normal distribution. If `row.eff = "none"`, then this argument is ignored.
`row.ids`	A matrix with the number of rows equal to the number of rows in the response matrix, and the number of columns equal to the number of row effects to be included in the model. Element `(i,j)` indicates the cluster ID of row `i` in the response matrix for random effect `j`; please see `boral` for details. Defaults to `NULL`, so that if `row.params = NULL` then the argument is ignored, otherwise if `row.params` is supplied then `row.ids = matrix(1:nrow(y), ncol = 1)`, i.e. a single row effect unique to each row. An internal check is done to see whether `row.params` and `row.ids` are consistent in terms of arguments supplied.
`offset`	A matrix with the same dimensions as the response matrix, specifying an a-priori known component to be included in the linear predictor during fitting. Defaults to `NULL`.
`cutoffs`	Common cutoff estimates from the fitted model when any of the columns of the response matrix are ordinal responses. Defaults to `NULL`.
`powerparam`	Common power parameter from the fitted model when any of the columns of the response matrix are tweedie responses. Defaults to `NULL`.
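
The layout of `row.ids` described above can be sketched in base R as follows. The sizes and the column names `ID1` and `ID2` are chosen here purely for illustration and are assumptions, not values required by the function.

```r
## Minimal sketch of two possible row.ids layouts (hypothetical sizes and names)
n_rows <- 6   # number of rows in a hypothetical response matrix

## Default structure: a single row effect unique to each row
row_ids_default <- matrix(1:n_rows, ncol = 1)

## Two row effects: one unique to each row, one shared within clusters of three rows.
## Element (i, j) is the cluster ID of row i for row effect j.
row_ids_two <- cbind(ID1 = 1:n_rows, ID2 = rep(1:2, each = 3))
```

In the second layout, rows 1 to 3 share cluster 1 and rows 4 to 6 share cluster 2 for the second row effect.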

Details

For an n x p response matrix \bm{y}, the log-likelihood for a model with no latent variables included is given by,

\log(f) = \sum_{i=1}^n \sum_{j=1}^p \log \{f(y_{ij} | \beta_{0j}, \alpha_i, \ldots)\},

where f(y_{ij}|\cdot) is the assumed distribution for column j, \beta_{0j} is the response-specific intercept, \alpha_i is the row effect, and \ldots generically denotes anything else included in the model, e.g. coefficients for the covariates, dispersion parameters, etc.
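
As a rough illustration of the double sum above, the following base R sketch evaluates it for Poisson responses with a log link, response-specific intercepts and fixed row effects. The simulated data and the values of `beta0` and `alpha` are hypothetical, and the code is an assumption-laden sketch rather than the internal code of `calc.logLik.lv0`.

```r
## Minimal sketch (assumed Poisson family, log link); not boral's internal code
set.seed(1)
n <- 5; p <- 3
beta0 <- c(0.5, 1.0, -0.2)          # hypothetical response-specific intercepts
alpha <- c(0, 0.3, -0.1, 0.2, 0.4)  # hypothetical fixed row effects (first set to zero)

## Linear predictor: eta[i, j] = alpha[i] + beta0[j]
eta <- outer(alpha, beta0, "+")
y_sim <- matrix(rpois(n * p, lambda = exp(eta)), nrow = n)

## log f(y_ij | .) for every cell, kept as an n x p matrix
log_dens <- matrix(dpois(y_sim, lambda = exp(eta), log = TRUE), nrow = n)

## Row-wise components (inner sum over j), then the total (outer sum over i)
loglik_comp <- rowSums(log_dens)
loglik <- sum(loglik_comp)
```

Here `loglik_comp` plays the role of `logLik.comp` and `loglik` the role of `logLik` in the returned list.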

Please note the function is written conditional on all regression coefficients. Therefore, if traits are included in the model, in which case the regression coefficients \beta_{0j}, \bm{\beta}_j become random effects instead (please see about.traits), then the calculation of the log-likelihood does NOT take this into account, i.e. does not marginalize over them!

Likewise if more than two columns are ordinal responses, then the regression coefficients \beta_{0j} corresponding to these columns become random effects, and the calculation of the log-likelihood also does NOT take this into account, i.e. does not marginalize over them!

When a single \alpha_i random row effect is included, then the log-likelihood is calculated by integrating over this,

\log(f) = \sum_{i=1}^n \log ( \int \prod_{j=1}^p \{f(y_{ij} | \beta_{0j}, \alpha_i, \ldots)\}f(\alpha_i) d\alpha_i ),

where f(\alpha_i) is the random effects distribution with mean zero and standard deviation given by the row.params. The integration is performed using standard Monte Carlo integration. This naturally extends to multiple random row effects structures.
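
The Monte Carlo step can be sketched in base R as follows, again assuming Poisson responses with a log link. The number of draws `B`, the standard deviation `sigma` and the simulated data are hypothetical, and for simplicity the same draws are reused for every row; this illustrates the general technique and is not necessarily how the function implements it.

```r
## Minimal sketch of Monte Carlo integration over a single random row effect
## (assumed Poisson family, log link); not boral's internal code
set.seed(1)
n <- 5; p <- 3
beta0 <- c(0.5, 1.0, -0.2)   # hypothetical response-specific intercepts
sigma <- 0.5                 # hypothetical random-effect standard deviation
y_sim <- matrix(rpois(n * p, lambda = exp(rep(beta0, each = n))), nrow = n)

B <- 2000                                      # number of Monte Carlo draws (arbitrary)
alpha_draws <- rnorm(B, mean = 0, sd = sigma)  # draws from f(alpha_i)

loglik_comp <- sapply(seq_len(n), function(i) {
    ## log of prod_j f(y_ij | beta0_j, alpha) for each draw of alpha
    log_joint <- sapply(alpha_draws, function(a)
        sum(dpois(y_sim[i, ], lambda = exp(beta0 + a), log = TRUE)))
    ## log of the Monte Carlo average, computed stably via log-sum-exp
    m <- max(log_joint)
    m + log(mean(exp(log_joint - m)))
})
loglik <- sum(loglik_comp)
```

Averaging the joint density over the draws approximates the integral for each row; taking the log and summing over rows gives the marginal log-likelihood.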

Value

A list with the following components:

`logLik`	Value of the log-likelihood
`logLik.comp`	A vector of the log-likelihood values for each row of the response matrix, such that `sum(logLik.comp) = logLik`.

Author(s)

Francis K.C. Hui [aut, cre], Wade Blanchard [aut]

Maintainer: Francis K.C. Hui <fhui28@gmail.com>

Examples

## Not run: 
## NOTE: The values below MUST NOT be used in a real application;
## they are only used here to make the examples run quickly!
example_mcmc_control <- list(n.burnin = 10, n.iteration = 100, 
     n.thin = 1)

testpath <- file.path(tempdir(), "jagsboralmodel.txt")


library(mvabund) ## Load a dataset from the mvabund package
data(spider)
y <- spider$abun
n <- nrow(y)
p <- ncol(y)

## Example 1 - NULL model with site effects only
spiderfit_nb <- boral(y, family = "negative.binomial", row.eff = "fixed", 
    save.model = TRUE, mcmc.control = example_mcmc_control,
    model.name = testpath)

## Extract all MCMC samples
fit_mcmc <- get.mcmcsamples(spiderfit_nb) 
mcmc_names <- colnames(fit_mcmc)

## Find the posterior medians
coef_mat <- matrix(apply(fit_mcmc[,grep("lv.coefs",mcmc_names)],
    2,median),nrow=p)
site_coef <- list(ID1 = apply(fit_mcmc[,grep("row.coefs.ID1", mcmc_names)],
    2,median))

## Calculate the log-likelihood at the posterior median
calc.logLik.lv0(y, family = "negative.binomial",
    lv.coefs =  coef_mat, row.eff = "fixed", row.params = site_coef)


## Example 2 - Model with environmental covariates and random row effects
X <- scale(spider$x)
spiderfit_nb2 <- boral(y, X = X, family = "negative.binomial", row.eff = "random",
    save.model = TRUE, mcmc.control = example_mcmc_control, 
    model.name = testpath)

## Extract all MCMC samples
fit_mcmc <- get.mcmcsamples(spiderfit_nb2) 
mcmc_names <- colnames(fit_mcmc)

## Find the posterior medians
coef_mat <- matrix(apply(fit_mcmc[,grep("lv.coefs",mcmc_names)],
    2,median),nrow=p)
X_coef_mat <- matrix(apply(fit_mcmc[,grep("X.coefs",mcmc_names)],
    2,median),nrow=p)
site.sigma <- list(ID1 = 
    median(fit_mcmc[,grep("row.sigma.ID1", mcmc_names)]))

	
## Calculate the log-likelihood at the posterior median
calc.logLik.lv0(y, X = X, family = "negative.binomial", 
    row.eff = "random",lv.coefs =  coef_mat, X.coefs = X_coef_mat, 
    row.params = site.sigma)

## End(Not run)

[Package boral version 2.0.2 Index]