R: Generalized least squares rate model

rate_gls {evolvability}

R Documentation

Generalized least squares rate model

Description

rate_gls fits a generalized least squares model to estimate parameters of an evolutionary model of two traits x and y, where the evolutionary rate of y depends on the value of x. Three models are implemented. In the two first, 'predictor_BM' and 'predictor_gBM', the evolution of y follows a Brownian motion with variance linear in x, while the evolution of x either follows a Brownian motion or a geometric Brownian motion, respectively. In the third model, 'recent_evol', the residuals of the macroevolutionary predictions of y have variance linear in x. It is highly recommended to read Hansen et al. (in review) and vignette("Analyzing_rates_of_evolution") before fitting these models.

Usage

rate_gls(
  x,
  y,
  species,
  tree,
  model = "predictor_BM",
  startv = list(a = NULL, b = NULL),
  maxiter = 100,
  silent = FALSE,
  useLFO = TRUE,
  tol = 0.001
)

Arguments

`x`	The explanatory variable, which must be equal to the length of `y` and tips on the tree.
`y`	The trait values of response variable. Note that the algorithm mean centers y.
`species`	A vector with the names of the species, must be equal in length and in the same order as `x` and `y`.
`tree`	An object of class `phylo`, needs to be ultrametric and with total length of unit, tips must have the same names as in `species`.
`model`	The acronym of the evolutionary model to be fitted. There are three options: 'predictor_BM', 'predictor_gBM' or 'recent_evol' (see details).
`startv`	A vector of optional starting values for the a and b parameters.
`maxiter`	The maximum number of iterations for updating the GLS.
`silent`	logical: if the function should not print the generalized sum of squares for each iteration.
`useLFO`	logical: whether the focal species should be left out when calculating the corresponding species' means. Note that this is only relevant for the 'recent_evol' model. The most correct is to use `TRUE`, but in practice it has little effect and `FALSE` will speed up the model fit (particularly useful when bootstrapping). LFO is an acronym for 'Leave the Focal species Out'.
`tol`	tolerance for convergence. If the change in 'a' and 'b' is below this limit between the two last iteration, convergence is reached. The change is measured in proportion to the standard deviation of the response for 'a' and the ratio of the standard deviation of the response to the standard deviation of the predictor for 'b'.

Details

rate_gls is an iterative generalized least squares (GLS) model fitting a regression where the response variable is a vector of squared mean-centered y-values for the 'predictor_BM' and 'predictor_gBM' models and squared deviation from the evolutionary predictions (see macro_pred) for the 'recent_evol' model. Note that the algorithm mean centers x in the 'predictor_BM' and 'recent_evol' analyses, while it mean standardized x (i.e. divided x by its mean) in the 'predictor_gBM'. The evolutionary parameters a and b are inferred from the intercept and the slope of the GLS fit. Again, it is highly recommended to read Hansen et al. (in review) and vignette("Analyzing_rates_of_evolution") before fitting these models. In Hansen et al. (2021) the three models 'predictor_BM', 'predictor_gBM' and 'recent_evol' are referred to as 'Model 1', 'Model 2' and 'Model 3', respectively.

Value

An object of class 'rate_gls', which is a list with the following components:

`model`				The name of the model ('predictor_BM', 'predictor_gBM' or 'recent_evolution').
`param`				The focal parameter estimates and their standard errors, where 'a' and 'b' are parameters of the evolutionary models, and 'sigma(x)^2' is the BM-rate parameter of x for the 'predictor_BM' model, the BM-rate parameter for log x for the 'predictor_gBM' model, and the variance of x for the 'recent_evolution' model.
`Rsquared`				The generalized R squared of the GLS model fit.
`a_all_iterations`				The values for the parameter a in all iterations.
`b_all_iterations`				The values for the parameter b in all iterations.
`R`				The residual variance matrix.
`Beta`				The intercept and slope of GLS regression (response is y2 and explanatory variable is x).
`Beta_vcov`				The error variance matrix of `Beta`.
`tree`				The phylogenetic tree.
`data`				The data used in the GLS regression.
`convergence`				Whether the algorithm converged or not.
`additional_param`				Some additional parameter estimates.
`call`				The function call.

Author(s)

Geir H. Bolstad

References

Hansen TF, Bolstad GH, Tsuboi M. 2021. Analyzing disparity and rates of morphological evolution with model-based phylogenetic comparative methods. *Systematic Biology*. syab079. doi:10.1093/sysbio/syab079

Examples

# Also see vignette("Analyzing_rates_of_evolution").
## Not run: 
# Generating a tree with 500 species
set.seed(102)
tree <- ape::rtree(n = 500)
tree <- ape::chronopl(tree, lambda = 1, age.min = 1)

### model = 'predictor_BM' ###
sim_data <- simulate_rate(tree,
  startv_x = 0, sigma_x = 0.25, a = 1, b = 1, model =
    "predictor_BM"
)
head(sim_data$tips)
gls_mod <- rate_gls(
  x = sim_data$tips$x, y = sim_data$tips$y,
  species = sim_data$tips$species, tree, model = "predictor_BM"
)
gls_mod$param
par(mfrow = c(1, 2))
# Response shown on the standard deviation scale (default):
plot(gls_mod, scale = "SD", cex.legend = 0.8)
# Response shown on the variance scale, where the regression is linear:
plot(gls_mod, scale = "VAR", cex.legend = 0.8)
par(mfrow = c(1, 1))
# Parametric bootstrapping to get the uncertainty of the parameter estimates
# taking the complete process into account.
# (this takes some minutes)
gls_mod_boot <- rate_gls_boot(gls_mod, n = 1000)
gls_mod_boot$summary

### model = 'predictor_gBM' ###
sim_data <- simulate_rate(tree,
  startv_x = 1, sigma_x = 1, a = 1, b = 1,
  model = "predictor_gBM"
)
head(sim_data$tips)
gls_mod <- rate_gls(
  x = sim_data$tips$x, y = sim_data$tips$y, species = sim_data$tips$species,
  tree, model = "predictor_gBM"
)
gls_mod$param
plot(gls_mod)
par(mfrow = c(1, 2))
# Response shown on the standard deviation scale (default):
plot(gls_mod, scale = "SD", cex.legend = 0.8)
# Response shown on the variance scale, where the regression is linear:
plot(gls_mod, scale = "VAR", cex.legend = 0.8)
# is linear.
par(mfrow = c(1, 1))

# Parametric bootstrapping to get the uncertainty of the parameter estimates
# taking the complete process into account. (This takes some minutes.)
gls_mod_boot <- rate_gls_boot(gls_mod, n = 1000)
gls_mod_boot$summary

### model = 'recent_evol' ###
sim_data <- simulate_rate(tree,
  startv_x = 0, sigma_x = 1, a = 1, b = 1, sigma_y = 1,
  model = "recent_evol"
)
head(sim_data$tips)
gls_mod <- rate_gls(
  x = sim_data$tips$x, y = sim_data$tips$y, species = sim_data$tips$species,
  tree, model = "recent_evol", useLFO = FALSE
)
# useLFO = TRUE is somewhat slower, and although more correct it should give
# very similar estimates in most situations.
gls_mod$param
par(mfrow = c(1, 2))
# Response shown on the standard deviation scale (default):
plot(gls_mod, scale = "SD", cex.legend = 0.8)
# Response shown on the variance scale, where the regression is linear:
plot(gls_mod, scale = "VAR", cex.legend = 0.8)
# linear.
par(mfrow = c(1, 1))

# Parametric bootstrapping to get the uncertainty of the parameter estimates
# taking the complete process into account. Note that x is considered as
# fixed effect. (This takes a long time.)
gls_mod_boot <- rate_gls_boot(gls_mod, n = 1000, useLFO = FALSE)
gls_mod_boot$summary

## End(Not run)

[Package evolvability version 2.0.0 Index]