eicm {eicm} | R Documentation |
Fit and select an Explicit Interaction Community Model (EICM)
Description
Given species occurrence data and (optionally) measured environmental predictors, fits and selects an EICM that models species occurrence probability as a function of measured predictors, unmeasured predictors (latent variables) and direct species interactions.
Usage
eicm(
occurrences,
env = NULL,
traits = NULL,
intercept = TRUE,
n.latent = 0,
rotate.latents = FALSE,
scale.latents = TRUE,
forbidden = NULL,
allowed = NULL,
mask.sp = NULL,
exclude.prevalence = 0,
regularization = c(ifelse(n.latent > 0, 6, 0.5), 1),
regularization.type = "hybrid",
penalty = 4,
theta.threshold = 0.5,
latent.lambda = 1,
fit.all.with.latents = TRUE,
popsize.sel = 2,
n.cores = parallel::detectCores(),
parallel = FALSE,
true.model = NULL,
do.selection = TRUE,
do.plots = TRUE,
fast = FALSE,
refit.selected = TRUE
)
Arguments
occurrences |
a binary (0/1) sample x species matrix, possibly including NAs. |
env |
an optional sample x environmental variable matrix, for the known environmental predictors. |
traits |
an optional species x trait matrix. Currently, it is only used for excluding species interactions a priori. |
intercept |
logical specifying whether to add a column for the species-level intercepts. |
n.latent |
the number of latent variables to estimate. |
rotate.latents |
logical. Rotate the estimated latent variable values (the values of the latents at each sample) in the first step with PCA? Defaults to FALSE. |
scale.latents |
logical. Standardize the estimated latent variable values (the values of the latents at each sample) in the first step? Defaults to TRUE. |
forbidden |
a formula (or list of) defining which species interactions are not to be estimated. See details.
This constraint is cumulative with other constraints ( |
allowed |
a formula (or list of) defining which species interactions are to be estimated. See details.
This constraint is cumulative with other constraints ( |
mask.sp |
a scalar or a binary square species x species matrix defining which species interactions to exclude
(0) or include (1) a priori. If a scalar (0 or 1), 0 excludes all interactions, 1 allows all interactions.
If a matrix, species in the columns affect species in the rows, so, setting |
exclude.prevalence |
exclude species interactions that are caused by species
with prevalence equal to or lower than this value. This constraint is cumulative with
other constraints ( |
regularization |
a two-element numeric vector defining the regularization lambdas used for environmental coefficients and for species interactions respectively. See details. |
regularization.type |
one of "lasso", "ridge" or "hybrid", defining the type of penalty to apply. Type "hybrid" applies ridge penalty to environmental coefficients and LASSO to interaction coefficients. |
penalty |
the penalty applied to the number of species interactions to include, during variable selection. |
theta.threshold |
exclude species interactions (from network selection) whose preliminary coefficient (in absolute value) is lower than this value. This exclusion criterion is cumulative with the other user-defined exclusions. |
latent.lambda |
the regularization applied to latent variables and respective coefficients when estimating their values in samples. |
fit.all.with.latents |
logical. Whether to use the previously estimated latent variables when estimating the preliminary species interactions. |
popsize.sel |
the population size for the genetic algorithm, expressed as the factor to multiply
by the recommended minimum. Ignored if |
n.cores |
the number of CPU cores to use in the variable selection stage and in the optimization. |
parallel |
logical. Whether to use |
true.model |
for validation purposes only: the true model that has generated the data, to which the estimated coefficients will be compared in each selection algorithm iteration. |
do.selection |
logical. Conduct the variable selection stage, over species interaction network topology? |
do.plots |
logical. Plot diagnostic and trace plots? |
fast |
a logical defining whether to do a fast - but less accurate - estimation, or a normal estimation. |
refit.selected |
logical. Refit the best model with exact estimates after network selection? Note that, for performance reasons, the models fit during the network selection stage use an approximate likelihood. |
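The input shapes described above can be sketched in base R. This is a minimal illustration, not code from the package: the species names, the simulated data and the object names (`occ`, `mask`, `prevalence`) are made up, and prevalence is assumed here to be the proportion of non-missing samples in which a species occurs.

```r
set.seed(1)
species <- c("sp1", "sp2", "sp3")

# Binary sample x species matrix (20 samples); NAs are allowed
occ <- matrix(rbinom(20 * 3, size = 1, prob = 0.4), nrow = 20,
              dimnames = list(NULL, species))
occ[c(2, 7), "sp2"] <- NA

# Square species x species mask: 1 = interaction may be estimated, 0 = excluded.
# Columns affect rows, so mask["sp1", "sp3"] refers to the effect of sp3 on sp1.
mask <- matrix(1, nrow = 3, ncol = 3, dimnames = list(species, species))
mask["sp1", "sp3"] <- 0   # exclude the effect of sp3 on sp1 a priori

# Assumed definition of prevalence: fraction of non-missing samples occupied
prevalence <- colMeans(occ, na.rm = TRUE)

# Sketch of a call using these inputs (commented out; requires the eicm package):
# m <- eicm::eicm(occ, mask.sp = mask, exclude.prevalence = 0.1, do.selection = FALSE)
```

Keeping the mask's row and column names aligned with the column names of the occurrence matrix makes the direction of each excluded interaction easy to check by eye.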
Details
An Explicit Interaction Community Model (EICM) is a simultaneous equation linear model in which each species model integrates all the other species as predictors, along with measured and latent variables.
This is the main function for fitting EICM models, and is preferred over using eicm.fit directly.
This function conducts the fitting and network topology selection workflow, which includes three stages: 1) estimate latent variable values; 2) make preliminary estimates for species interactions; 3) conduct network topology selection over a reduced model (based on the preliminary estimates).
The selection stage is optional. If not conducted, the species interactions are estimated (all or a subset according to the user-provided constraints), but not selected.
See vignette("eicm") for commented examples of excluding interactions a priori.
Missing data in the response matrix is allowed.
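To make the idea that each species model integrates all the other species as predictors more concrete, the sketch below assembles the predictor matrix for a single species from simulated data. It only illustrates the structure of one equation; it is not how `eicm` estimates coefficients, and all object names and coefficient values are invented for the example.

```r
set.seed(42)
n <- 10
env <- cbind(temp = rnorm(n))      # one measured predictor
latent <- cbind(lv1 = rnorm(n))    # one latent variable (unmeasured in practice)
occ <- matrix(rbinom(n * 3, 1, 0.5), nrow = n,
              dimnames = list(NULL, c("sp1", "sp2", "sp3")))

# Predictors in the equation for sp1: intercept, environment, latent, other species
target <- "sp1"
X <- cbind(intercept = 1, env, latent,
           occ[, colnames(occ) != target, drop = FALSE])

# Made-up coefficients, in the same order as the columns of X;
# a zero for sp2 stands for "no effect of sp2 on sp1"
coefs <- c(intercept = -0.5, temp = 1.2, lv1 = 0.8, sp2 = 0, sp3 = -1.5)

# Linear predictor for sp1 at each sample
eta <- drop(X %*% coefs)
```

Repeating this construction for every species gives one equation per species, which is what makes the topology of the nonzero interaction coefficients a network to be selected.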
Value
An eicm.list with the following components:
- true.model: a copy of the true.model argument.
- latents.only: the model with only the latent variables estimated.
- fitted.model: the model with only the species interactions estimated.
- selected.model: the final model with all coefficients estimated, after network topology selection. This is the "best" model given the selection criterion (which depends on regularization and penalty).
When accessing the results, remember to pick the model you want (usually, selected.model). plot automatically picks selected.model or, if NULL, fitted.model.
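The rule that `plot` applies can be mirrored when extracting a model by hand. The helper below, `pick_model`, is hypothetical (not part of the package) and is demonstrated on plain lists standing in for an `eicm.list`; it assumes the components are accessible with `$`.

```r
# Hypothetical helper: prefer selected.model, fall back to fitted.model
pick_model <- function(x) {
  if (!is.null(x$selected.model)) x$selected.model else x$fitted.model
}

# Stand-in objects using the component names of the returned list
with_selection <- list(fitted.model = "fitted", selected.model = "selected")
without_selection <- list(fitted.model = "fitted", selected.model = NULL)

pick_model(with_selection)
pick_model(without_selection)
```

With a real fit, the same helper would be applied to the object returned by `eicm`, for example after a run with `do.selection = FALSE`, where no selected model is expected.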
See Also
eicm-package, eicm.fit, plot.eicm
Examples
# refer to the vignette for a more detailed explanation
# This can take some time to run
# Load the included parameterized model
data(truemodel)
# make one realization of the model
occurrences <- predict(truemodel, nrepetitions=1)
# Fit and select a model with 2 latent variables to be estimated and all
# interactions possible
m <- eicm(occurrences, n.latent=2, penalty=4, theta.threshold=0.5, n.cores=2)
plot(m)