eicm {eicm}    R Documentation

Fit and select an Explicit Interaction Community Model (EICM)

Description

Given species occurrence data and (optionally) measured environmental predictors, fits and selects an EICM that models species occurrence probability as a function of measured predictors, unmeasured predictors (latent variables) and direct species interactions.

Usage

eicm(
  occurrences,
  env = NULL,
  traits = NULL,
  intercept = TRUE,
  n.latent = 0,
  rotate.latents = FALSE,
  scale.latents = TRUE,
  forbidden = NULL,
  allowed = NULL,
  mask.sp = NULL,
  exclude.prevalence = 0,
  regularization = c(ifelse(n.latent > 0, 6, 0.5), 1),
  regularization.type = "hybrid",
  penalty = 4,
  theta.threshold = 0.5,
  latent.lambda = 1,
  fit.all.with.latents = TRUE,
  popsize.sel = 2,
  n.cores = parallel::detectCores(),
  parallel = FALSE,
  true.model = NULL,
  do.selection = TRUE,
  do.plots = TRUE,
  fast = FALSE,
  refit.selected = TRUE
)

Arguments

occurrences

a binary (0/1) sample x species matrix, possibly including NAs.

env

an optional sample x environmental variable matrix, for the known environmental predictors.

traits

an optional species x trait matrix. Currently, it is only used for excluding species interactions a priori.

intercept

logical specifying whether to add a column for the species-level intercepts.

n.latent

the number of latent variables to estimate.

rotate.latents

logical. Rotate the estimated latent variable values (the values of the latents at each sample) in the first step with PCA? Defaults to FALSE.

scale.latents

logical. Standardize the estimated latent variable values (the values of the latents at each sample) in the first step? Defaults to TRUE.

forbidden

a formula (or list of formulas) defining which species interactions are not to be estimated. See details. This constraint is cumulative with other constraints (mask.sp and exclude.prevalence).

allowed

a formula (or list of formulas) defining which species interactions are to be estimated. See details. This constraint is cumulative with other constraints (mask.sp and exclude.prevalence).

mask.sp

a scalar or a binary square species x species matrix defining which species interactions to exclude (0) or include (1) a priori. If a scalar (0 or 1), 0 excludes all interactions, 1 allows all interactions. If a matrix, species in the columns affect species in the rows, so, setting mask.sp[3, 8] <- 0 means that species #8 is assumed a priori to not affect species #3. This constraint is cumulative with other constraints (forbidden and exclude.prevalence).

exclude.prevalence

exclude species interactions caused by species with prevalence equal to or lower than this value. This constraint is cumulative with other constraints (forbidden and mask.sp).

regularization

a two-element numeric vector defining the regularization lambdas used for environmental coefficients and for species interactions respectively. See details.

regularization.type

one of "lasso", "ridge" or "hybrid", defining the type of penalty to apply. Type "hybrid" applies ridge penalty to environmental coefficients and LASSO to interaction coefficients.

penalty

the penalty applied to the number of species interactions to include, during variable selection.

theta.threshold

exclude species interactions (from network selection) whose preliminary coefficient (in absolute value) is lower than this value. This exclusion criterion is cumulative with the other user-defined exclusions.

latent.lambda

the regularization applied to latent variables and respective coefficients when estimating their values in samples.

fit.all.with.latents

logical. Whether to use the previously estimated latent variables when estimating the preliminary species interactions.

popsize.sel

the population size for the genetic algorithm, expressed as the factor to multiply by the recommended minimum. Ignored if do.selection=FALSE.

n.cores

the number of CPU cores to use in the variable selection stage and in the optimization.

parallel

logical. Whether to use optimParallel during optimizations instead of optim.

true.model

for validation purposes only: the true model that has generated the data, to which the estimated coefficients will be compared in each selection algorithm iteration.

do.selection

logical. Conduct the variable selection stage, over species interaction network topology?

do.plots

logical. Plot diagnostic and trace plots?

fast

logical. Whether to do a fast, but less accurate, estimation instead of a normal estimation.

refit.selected

logical. Refit with exact estimates the best model after network selection? Note that, for performance reasons, the models fit during the network selection stage use an approximate likelihood.
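As an illustration of the occurrences and mask.sp arguments described above, the following minimal sketch builds a toy binary occurrence matrix containing a missing value and a species x species mask that excludes one interaction a priori. The data, the species names (sp1 to sp4) and the object name mask are invented for illustration; only the argument names and the "columns affect rows" convention come from this page.

```r
# Toy data: 6 samples x 4 species, binary, with one missing value.
set.seed(1)
occurrences <- matrix(rbinom(24, 1, 0.5), nrow = 6, ncol = 4,
                      dimnames = list(NULL, paste0("sp", 1:4)))
occurrences[2, 3] <- NA  # NAs are allowed in the response matrix

# Start by allowing every interaction (1), then exclude specific ones (0).
n.sp <- ncol(occurrences)
mask <- matrix(1, nrow = n.sp, ncol = n.sp,
               dimnames = list(colnames(occurrences), colnames(occurrences)))

# Species in the columns affect species in the rows:
# assume a priori that sp4 does not affect sp2.
mask["sp2", "sp4"] <- 0

# The call is left commented because it requires the eicm package
# and can take some time to run:
# m <- eicm(occurrences, mask.sp = mask, n.cores = 2)
```

Because the constraints are cumulative, a mask built this way can be combined with forbidden and exclude.prevalence in the same call.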

Details

An Explicit Interaction Community Model (EICM) is a simultaneous equation linear model in which each species model integrates all the other species as predictors, along with measured and latent variables.

This is the main function for fitting EICM models, and is preferred over using eicm.fit directly.

This function conducts the fitting and network topology selection workflow, which includes three stages: 1) estimate latent variable values; 2) make preliminary estimates for species interactions; 3) conduct network topology selection over a reduced model (based on the preliminary estimates).
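The reduction that precedes stage 3 can be pictured with a small sketch. It assumes, following the description of theta.threshold, that an interaction stays a candidate for selection only if the absolute value of its preliminary coefficient is not lower than the threshold. The matrix prelim and its values are hypothetical, and the code illustrates the idea only; it is not the package's internal implementation.

```r
# Hypothetical preliminary interaction coefficients (species x species).
# Columns affect rows; values are invented for illustration.
prelim <- matrix(c( 0.0,  0.8, -0.1,
                   -0.6,  0.0,  0.3,
                    0.2, -1.2,  0.0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("sp", 1:3), paste0("sp", 1:3)))
theta.threshold <- 0.5

# Interactions whose absolute preliminary coefficient is lower than the
# threshold are dropped from the candidate set for network selection.
candidates <- abs(prelim) >= theta.threshold

# Row/column indices of the interactions that remain candidates.
which(candidates, arr.ind = TRUE)
```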

The selection stage is optional. If not conducted, the species interactions are estimated (all or a subset according to the user-provided constraints), but not selected. See vignette("eicm") for commented examples on a priori excluding interactions.
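A minimal sketch of skipping the selection stage is given below. The wrapper fit_without_selection is a hypothetical convenience function, not part of the eicm package; it only forwards its arguments to eicm with do.selection = FALSE and returns the fitted.model component described under Value.

```r
# Hypothetical wrapper (not part of eicm): fit without network selection.
# Any further arguments (e.g. n.latent, mask.sp, n.cores) are passed through.
fit_without_selection <- function(occurrences, ...) {
  m <- eicm::eicm(occurrences, do.selection = FALSE, ...)
  # Without selection, the estimated interactions are in fitted.model.
  m$fitted.model
}

# Usage (requires the eicm package; can take some time to run):
# data(truemodel, package = "eicm")
# occurrences <- predict(truemodel, nrepetitions = 1)
# fitted <- fit_without_selection(occurrences, n.latent = 2, n.cores = 2)
```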

Missing data in the response matrix is allowed.

Value

An eicm.list with the following components:

true.model:

a copy of the true.model argument.

latents.only:

the model with only the latent variables estimated.

fitted.model:

the model with only the species interactions estimated.

selected.model:

the final model with all coefficients estimated, after network topology selection. This is the "best" model given the selection criterion (which depends on regularization and penalty).

When accessing the results, remember to pick the model you want (usually, selected.model). plot automatically picks selected.model or, if NULL, fitted.model.
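The fallback rule that plot is described as using can be reproduced by hand when extracting a model from the result. The helper pick_model below is hypothetical (not part of the package), and the two mock lists merely stand in for a real eicm.list, assuming its components can be accessed with $.

```r
# Hypothetical helper: prefer selected.model, fall back to fitted.model.
pick_model <- function(m) {
  if (!is.null(m$selected.model)) m$selected.model else m$fitted.model
}

# Mock objects standing in for real results (strings instead of models).
with.selection    <- list(fitted.model = "fitted", selected.model = "selected")
without.selection <- list(fitted.model = "fitted", selected.model = NULL)

pick_model(with.selection)     # the selected model
pick_model(without.selection)  # falls back to the fitted model
```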

See Also

eicm-package, eicm.fit, plot.eicm

Examples

# refer to the vignette for a more detailed explanation

# This can take some time to run

# Load the included parameterized model
data(truemodel)

# make one realization of the model
occurrences <- predict(truemodel, nrepetitions=1)

# Fit and select a model with 2 latent variables to be estimated and all
# interactions possible
m <- eicm(occurrences, n.latent=2, penalty=4, theta.threshold=0.5, n.cores=2)

plot(m)


[Package eicm version 1.0.3 Index]