R: Control Function for QRR-VGLMs (CQO)

qrrvglm.control {VGAM}

R Documentation

Control Function for QRR-VGLMs (CQO)

Description

Algorithmic constants and parameters for a constrained quadratic ordination (CQO), by fitting a quadratic reduced-rank vector generalized linear model (QRR-VGLM), are set using this function. It is the control function for cqo.

Usage

qrrvglm.control(Rank = 1, Bestof = if (length(Cinit)) 1 else 10,
    checkwz = TRUE, Cinit = NULL, Crow1positive = TRUE,
    epsilon = 1.0e-06, EqualTolerances = NULL, eq.tolerances = TRUE,
    Etamat.colmax = 10, FastAlgorithm = TRUE, GradientFunction = TRUE,
    Hstep = 0.001, isd.latvar = rep_len(c(2, 1, rep_len(0.5, Rank)),
    Rank), iKvector = 0.1, iShape = 0.1, ITolerances = NULL,
    I.tolerances = FALSE, maxitl = 40, imethod = 1,
    Maxit.optim = 250, MUXfactor = rep_len(7, Rank),
    noRRR = ~ 1, Norrr = NA, optim.maxit = 20,
    Parscale = if (I.tolerances) 0.001 else 1.0,
    sd.Cinit = 0.02, SmallNo = 5.0e-13, trace = TRUE,
    Use.Init.Poisson.QO = TRUE,
    wzepsilon = .Machine$double.eps^0.75, ...)

Arguments

In the following, R is the Rank, M is the number of linear predictors, and S is the number of responses (species). Thus M=S for binomial and Poisson responses, and M=2S for the negative binomial and 2-parameter gamma distributions.

`Rank`	The numerical rank `R` of the model, i.e., the number of ordination axes. Must be an element from the set {1,2,...,min(`M`,`p_2`)} where the vector of explanatory variables `x` is partitioned into (`x_1`,`x_2`), which is of dimension `p_1+p_2`. The variables making up `x_1` are given by the terms in the `noRRR` argument, and the rest of the terms comprise `x_2`.
`Bestof`	Integer. The best of `Bestof` models fitted is returned. This argument helps guard against local solutions by (hopefully) finding the global solution from many fits. The argument has value 1 if an initial value for `C` is inputted using `Cinit`.
`checkwz`	logical indicating whether the diagonal elements of the working weight matrices should be checked whether they are sufficiently positive, i.e., greater than `wzepsilon`. If not, any values less than `wzepsilon` are replaced with this value.
`Cinit`	Optional initial `C` matrix, which must be a `p_2` by `R` matrix. The default is to apply `.Init.Poisson.QO()` to obtain initial values.
`Crow1positive`	Logical vector of length `Rank` (recycled if necessary): are the elements of the first row of `C` positive? For example, if `Rank` is 4, then specifying `Crow1positive = c(FALSE, TRUE)` will force `C[1,1]` and `C[1,3]` to be negative, and `C[1,2]` and `C[1,4]` to be positive. This argument allows for a reflection in the ordination axes because the coefficients of the latent variables are unique up to a sign.
`epsilon`	Positive numeric. Used to test for convergence for GLMs fitted in C. Larger values mean a loosening of the convergence criterion. If an error code of 3 is reported, try increasing this value.
`eq.tolerances`	Logical indicating whether each (quadratic) predictor will have equal tolerances. Having `eq.tolerances = TRUE` can help avoid numerical problems, especially with binary data. Note that the estimated (common) tolerance matrix may or may not be positive-definite. If it is then it can be scaled to the `R` by `R` identity matrix, i.e., made equivalent to `I.tolerances = TRUE`. Setting `I.tolerances = TRUE` will force a common `R` by `R` identity matrix as the tolerance matrix to the data even if it is not appropriate. In general, setting `I.tolerances = TRUE` is preferred over `eq.tolerances = TRUE` because, if it works, it is much faster and uses less memory. However, `I.tolerances = TRUE` requires the environmental variables to be scaled appropriately. See Details for more details.
`EqualTolerances`	Defunct argument. Use `eq.tolerances` instead.
`Etamat.colmax`	Positive integer, no smaller than `Rank`. Controls the amount of memory used by `.Init.Poisson.QO()`. It is the maximum number of columns allowed for the pseudo-response and its weights. In general, the larger the value, the better the initial value. Used only if `Use.Init.Poisson.QO = TRUE`.
`FastAlgorithm`	Logical. Whether a new fast algorithm is to be used. The fast algorithm results in a large speed increases compared to Yee (2004). Some details of the fast algorithm are found in Appendix A of Yee (2006). Setting `FastAlgorithm = FALSE` will give an error.
`GradientFunction`	Logical. Whether `optim`'s argument `gr` is used or not, i.e., to compute gradient values. Used only if `FastAlgorithm` is `TRUE`. The default value is usually faster on most problems.
`Hstep`	Positive value. Used as the step size in the finite difference approximation to the derivatives by `optim`.
`isd.latvar`	Initial standard deviations for the latent variables (site scores). Numeric, positive and of length `R` (recycled if necessary). This argument is used only if `I.tolerances = TRUE`. Used by `.Init.Poisson.QO()` to obtain initial values for the constrained coefficients `C` adjusted to a reasonable value. It adjusts the spread of the site scores relative to a common species tolerance of 1 for each ordination axis. A value between 0.5 and 10 is recommended; a value such as 10 means that the range of the environmental space is very large relative to the niche width of the species. The successive values should decrease because the first ordination axis should have the most spread of site scores, followed by the second ordination axis, etc.
`iKvector`, `iShape`	Numeric, recycled to length `S` if necessary. Initial values used for estimating the positive `k` and `\lambda` parameters of the negative binomial and 2-parameter gamma distributions respectively. For further information see `negbinomial` and `gamma2`. These arguments override the `ik` and `ishape` arguments in `negbinomial` and `gamma2`.
`I.tolerances`	Logical. If `TRUE` then the (common) tolerance matrix is the `R` by `R` identity matrix by definition. Note that having `I.tolerances = TRUE` implies `eq.tolerances = TRUE`, but not vice versa. Internally, the quadratic terms will be treated as offsets (in GLM jargon) and so the models can potentially be fitted very efficiently. However, it is a very good idea to center and scale all numerical variables in the `x_2` vector. See Details for more details. The success of `I.tolerances = TRUE` often depends on suitable values for `isd.latvar` and/or `MUXfactor`.
`ITolerances`	Defunct argument. Use `I.tolerances` instead.
`maxitl`	Maximum number of times the optimizer is called or restarted. Most users should ignore this argument.
`imethod`	Method of initialization. A positive integer 1 or 2 or 3 etc. depending on the VGAM family function. Currently it is used for `negbinomial` and `gamma2` only, and used within the C.
`Maxit.optim`	Positive integer. Number of iterations given to the function `optim` at each of the `optim.maxit` iterations.
`MUXfactor`	Multiplication factor for detecting large offset values. Numeric, positive and of length `R` (recycled if necessary). This argument is used only if `I.tolerances = TRUE`. Offsets are `-0.5` multiplied by the sum of the squares of all `R` latent variable values. If the latent variable values are too large then this will result in numerical problems. By too large, it is meant that the standard deviation of the latent variable values are greater than `MUXfactor[r] * isd.latvar[r]` for `r=1:Rank` (this is why centering and scaling all the numerical predictor variables in `x_2` is recommended). A value about 3 or 4 is recommended. If failure to converge occurs, try a slightly lower value.
`optim.maxit`	Positive integer. Number of times `optim` is invoked. At iteration `i`, the `i`th value of `Maxit.optim` is fed into `optim`.
`noRRR`	Formula giving terms that are not to be included in the reduced-rank regression (or formation of the latent variables), i.e., those belong to `x_1`. Those variables which do not make up the latent variable (reduced-rank regression) correspond to the `B_1` matrix. The default is to omit the intercept term from the latent variables.
`Norrr`	Defunct. Please use `noRRR`. Use of `Norrr` will become an error soon.
`Parscale`	Numerical and positive-valued vector of length `C` (recycled if necessary). Passed into `optim(..., control = list(parscale = Parscale))`; the elements of `C` become `C` / `Parscale`. Setting `I.tolerances = TRUE` results in line searches that are very large, therefore `C` has to be scaled accordingly to avoid large step sizes. See Details for more information. It's probably best to leave this argument alone.
`sd.Cinit`	Standard deviation of the initial values for the elements of `C`. These are normally distributed with mean zero. This argument is used only if `Use.Init.Poisson.QO = FALSE` and `C` is not inputted using `Cinit`.
`trace`	Logical indicating if output should be produced for each iteration. The default is `TRUE` because the calculations are numerically intensive, meaning it may take a long time, so that the user might think the computer has locked up if `trace = FALSE`.
`SmallNo`	Positive numeric between `.Machine$double.eps` and `0.0001`. Used to avoid under- or over-flow in the IRLS algorithm. Used only if `FastAlgorithm` is `TRUE`.
`Use.Init.Poisson.QO`	Logical. If `TRUE` then the function `.Init.Poisson.QO()` is used to obtain initial values for the canonical coefficients `C`. If `FALSE` then random numbers are used instead.
`wzepsilon`	Small positive number used to test whether the diagonals of the working weight matrices are sufficiently positive.
`...`	Ignored at present.

Details

Recall that the central formula for CQO is

\eta = B_1^T x_1 + A \nu + \sum_{m=1}^M (\nu^T D_m \nu) e_m

where x_1 is a vector (usually just a 1 for an intercept), x_2 is a vector of environmental variables, \nu=C^T x_2 is a R-vector of latent variables, e_m is a vector of 0s but with a 1 in the mth position. QRR-VGLMs are an extension of RR-VGLMs and allow for maximum likelihood solutions to constrained quadratic ordination (CQO) models.

Having I.tolerances = TRUE means all the tolerance matrices are the order-R identity matrix, i.e., it forces bell-shaped curves/surfaces on all species. This results in a more difficult optimization problem (especially for 2-parameter models such as the negative binomial and gamma) because of overflow errors and it appears there are more local solutions. To help avoid the overflow errors, scaling C by the factor Parscale can help enormously. Even better, scaling C by specifying isd.latvar is more understandable to humans. If failure to converge occurs, try adjusting Parscale, or better, setting eq.tolerances = TRUE (and hope that the estimated tolerance matrix is positive-definite). To fit an equal-tolerances model, it is firstly best to try setting I.tolerances = TRUE and varying isd.latvar and/or MUXfactor if it fails to converge. If it still fails to converge after many attempts, try setting eq.tolerances = TRUE, however this will usually be a lot slower because it requires a lot more memory.

With a R > 1 model, the latent variables are always uncorrelated, i.e., the variance-covariance matrix of the site scores is a diagonal matrix.

If setting eq.tolerances = TRUE is used and the common estimated tolerance matrix is positive-definite then that model is effectively the same as the I.tolerances = TRUE model (the two are transformations of each other). In general, I.tolerances = TRUE is numerically more unstable and presents a more difficult problem to optimize; the arguments isd.latvar and/or MUXfactor often must be assigned some good value(s) (possibly found by trial and error) in order for convergence to occur. Setting I.tolerances = TRUE forces a bell-shaped curve or surface onto all the species data, therefore this option should be used with deliberation. If unsuitable, the resulting fit may be very misleading. Usually it is a good idea for the user to set eq.tolerances = FALSE to see which species appear to have a bell-shaped curve or surface. Improvements to the fit can often be achieved using transformations, e.g., nitrogen concentration to log nitrogen concentration.

Fitting a CAO model (see cao) first is a good idea for pre-examining the data and checking whether it is appropriate to fit a CQO model.

Value

A list with components matching the input names.

Warning

The default value of Bestof is a bare minimum for many datasets, therefore it will be necessary to increase its value to increase the chances of obtaining the global solution.

Note

When I.tolerances = TRUE it is a good idea to apply scale to all the numerical variables that make up the latent variable, i.e., those of x_2. This is to make them have mean 0, and hence avoid large offset values which cause numerical problems.

This function has many arguments that are common with rrvglm.control and vglm.control.

It is usually a good idea to try fitting a model with I.tolerances = TRUE first, and if convergence is unsuccessful, then try eq.tolerances = TRUE and I.tolerances = FALSE. Ordination diagrams with eq.tolerances = TRUE have a natural interpretation, but with eq.tolerances = FALSE they are more complicated and requires, e.g., contours to be overlaid on the ordination diagram (see lvplot.qrrvglm).

In the example below, an equal-tolerances CQO model is fitted to the hunting spiders data. Because I.tolerances = TRUE, it is a good idea to center all the x_2 variables first. Upon fitting the model, the actual standard deviation of the site scores are computed. Ideally, the isd.latvar argument should have had this value for the best chances of getting good initial values. For comparison, the model is refitted with that value and it should run more faster and reliably.

Author(s)

Thomas W. Yee

References

Yee, T. W. (2004). A new technique for maximum-likelihood canonical Gaussian ordination. Ecological Monographs, 74, 685–701.

Yee, T. W. (2006). Constrained additive ordination. Ecology, 87, 203–213.

Examples

## Not run:  # Poisson CQO with equal tolerances
set.seed(111)  # This leads to the global solution
hspider[,1:6] <- scale(hspider[,1:6])  # Good when I.tolerances = TRUE
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr,
                Arctlute, Arctperi, Auloalbi,
                Pardlugu, Pardmont, Pardnigr,
                Pardpull, Trocterr, Zoraspin) ~
          WaterCon + BareSand + FallTwig +
          CoveMoss + CoveHerb + ReflLux,
          poissonff, data = hspider, eq.tolerances = TRUE)
sort(deviance(p1, history = TRUE))  # Iteration history

(isd.latvar <- apply(latvar(p1), 2, sd))  # Approx isd.latvar

# Refit the model with better initial values
set.seed(111)  # This leads to the global solution
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr,
                Arctlute, Arctperi, Auloalbi,
                Pardlugu, Pardmont, Pardnigr,
                Pardpull, Trocterr, Zoraspin) ~
          WaterCon + BareSand + FallTwig +
          CoveMoss + CoveHerb + ReflLux,
          I.tolerances = TRUE, poissonff, data = hspider,
          isd.latvar = isd.latvar)  # Note this
sort(deviance(p1, history = TRUE))  # Iteration history

## End(Not run)

[Package VGAM version 1.1-11 Index]