twoterm.plmm.stan {phylopairs} | R Documentation |
twoterm.plmm.stan
Description
Fits the two-term plmm model of Castillo (2007) in a Bayesian framework. Bayesian sampling is conducted in the Stan software via the 'rstan' package. Users supply vectors of observations and an ultrametric phylogenetic tree. Users can alter parameters for model-parameter prior distributions and Bayesian sampling settings. See details.
Usage
twoterm.plmm.stan(des, y, sp1s, sp2s, tree,
iter=2000, chains=4, coef.u=0, coef.sd=10, physig2.u=-1,
physig2.sd=1, randsig2.loc=0, randsig2.sc=2.5, cores=4, ...)
Arguments
des |
A vector of predictor variable observations OR, in the case of multiple predictors, a matrix in which each column is a vector of observations of a given predictor. Function will add a column of 1s to make this a design matrix whose first column corresponds to the model intercept (unless such a column already exists). |
y |
A vector of response variable observations. |
sp1s |
Character vector naming the "species 1" of every pair. IMPORTANT: name formatting must match that used in the tip.label of the tree |
sp2s |
Character vector naming the "species 2" of every pair. IMPORTANT: name formatting must match that used in the tip.label of the tree |
tree |
Phylogenetic tree (a 'phylo' object) for the species included in the analysis (regardless of whether they are species 1 or 2). |
iter |
Number of iterations to run on each chain; defaults to 2000 (more are often necessary). |
chains |
Number of Markov chains to run; defaults to 4. |
coef.u |
Mean of the Gaussian prior for each preditor variable coefficient; defaults to 0. |
coef.sd |
SD of the Gaussian prior for each preditor variable coefficient; defaults to 10. |
physig2.u |
Mean of the prior distribution (lognormal) for the scale of the phylogenetic component of residual covariance; defaults to -1. |
physig2.sd |
SD of the prior distribution (lognormal) for the scale of the phylogenetic component of residual covariance; defaults to 1. |
randsig2.loc |
Location parameter of the prior distribution (cauchy) for the scale of the independent component of residual covariance ; defaults to 0. |
randsig2.sc |
Scale parameter of the prior distribution (cauchy) for the scale of the independent component of residual covariance ; defaults to 2.5. |
cores |
Number of cores to use. |
... |
additional arguments passed to |
Details
The model introduced by Castillo (2007) for analyzing lineage-pair data is a
version of a phylogenetic linear mixed model (plmm) in which there are two
random-effect terms, one for the 'species 1' and another for the 'species 2' in
every pair. The model is Y = Xb + Z1u1 + Z1u2 + epsilon
, where u1 and u2 are
vectors of species-specific random effects that covary according to a phylogenetic
covariance matrix C. In this model, u1 ~ N(0,physig2*C)
, u2 ~ N(0,physig2*C)
, and
epsilon~N(0, randsig2*I)
, where randsig2 is the scaling parameter for identity
matrix I and physig2 is the scaling parameter for phylogenetic covariance
matrix C. See twoterm_plmm_mats
for details on calculating
the Z1 and Z2 matrices.
Prior Distributions for Model Parameters: The underlying stan models assume the following prior distributions for model parameters
Regression coefficients: Gaussian prior (users can set prior mean and sd)
-
physig2
: lognormal prior (users can set prior mean and sd) -
randsig2
: cauchy prior (users can set location and scale parameters of prior)
Value
A list containing two elements: (1) the posterior distribution of model parameters, and (2) the log-likelihood of the posteriors for use in downstream analyses (e.g. the calculation of model fitting metrics like loo or waic)
References
Castillo, D. M. (2007). Factors contributing to the accumulation of reproductive isolation: A mixed model approach. Ecology and Evolution 7:5808-5820. doi.org/10.1002/ece3.3093
Examples
# Load a dataset simulated with a non-independent response observations
data(data3)
# Also load the simulated tree that was used to generate those pairs
data(sim.tree1)
# Fit an OLS model
result1 = linreg.stan(des=data3[,3], y=data3[,4], cores=2)
# Fit the twoterm.plmm.stan model
result2 = twoterm.plmm.stan(des=data3[,3], y=data3[,4], sp1s=data3[,1],
sp2s=data3[,2], tree=sim.tree1, cores=2)
# Compare posterior parameter estimates
result1[[1]]
result2[[1]]
# Compare the fit of the two models via loo and waic
loo1 = loo::loo(result1[[2]])
loo2 = loo::loo(result2[[2]])
waic1 = loo::waic(result1[[2]])
waic2 = loo::waic(result2[[2]])
loo1
loo2
waic1
waic2
loo::loo_compare(loo1, loo2)
loo::loo_compare(waic1, waic2)