R: using posterior predictive MCMC for modeling quantitative...

pp.mcmc {geiger}

R Documentation

using posterior predictive MCMC for modeling quantitative trait evolution

Description

performs posterior predictive checks for models of quantitative trait evolution. At present, only BM, EB, and clade.shift models are implemented

Usage

pp.mcmc(phy, d, Ngens = 1000000, sampleFreq = 1000, printFreq = 1000,
        prop.width = 1, model = "BM", eb.type = "exponential",
        clade = NULL, rlm.maxit = 20)

Arguments

`phy`	A time calibrated phylogeny in "phylo" format
`d`	A named vector or dataframe of trait values.
`Ngens`	Number of generations that the posterior predictive MCMC will run for. Default is 1 million generations
`sampleFreq`	The frequency with which model parameters will be sampled from the chain and simulations run. Default is every 1000 generations
`printFreq`	The frequency with which the current number of generations and acceptance rates will be printed to screen. Default is every 1000 generations
`prop.width`	The width of the sliding window proposal distribution for ln(Sigmasq) and, if applicable, the exponential change parameter for EB. The width for the EB parameter is obtained by dividing by 10. Default proposal width is 1.
`model`	The model to fit and simulate under. Default is Brownian motion (BM). Other options are early burst (EB) or an edge shift model (edge.shift) where the rate is allowed to change along an internal edge leading to a specified clade (see argument "clade" and Slater and Pennell in press for an example)
`eb.type`	The type of exponential change model assumed. If eb.type = "exponential" (the default), then an exponentially declining rate will be assumed and contrasts will be log transformed when computing the node height test. If eb.type = "linear", a linear decline in rate will be assumed and untransformed contrasts will be used.
`clade`	Default = NULL and is used if model = "BM" or model = "EB". If using model = "edge.shift", then a clade must be specified for which the stem lineage experiences a different rate of evolution. The clade is specified by giving the names of two taxa spanning the clade of interest, e.g. clade = c("A", "B")
`rlm.maxit`	Maximum number of interations to use for the iteratively reweighted least squares optimization of the robust regression algorithm (see ?rlm). Default is 20 and should be sufficient for most problems

Details

This function runs a posterior predictive MCMC under the specified model, sampling model parameters from their posterior distributions and simulating under that model. Simulated data are summarized using the Node height test (Freckleton and Harvey 2006) slope (OLS and robust regression) and Morphological Disparity Index (Harmon et al. 2003). Model adequacy can then be assessed by comparing observed values for these summary statistics to the posterior predictive distributions

Value

A dataframe containing the following columns:

`$generation`	the generation at which parameters where sampled and simulations conducted
`$logLk`	The sampled logLikelihood values for the model
`$Sigma`	Brownian rate parameter values
`$node.height.slope.lm`	posterior predictive distribution of slopes for the node height test using an ordinary least squares regression
`$node.height.slope.rlm`	posterior predictive distribution of slopes for the node height test using a robust regression
`$MDI`	posterior predictive distribution of MDI values

Author(s)

Graham Slater and Matthew Pennell

References

Slater GJ and MW Pennell (2014) Robust regression and posterior predictive simulation increase power to detect early bursts of trait evolution. Systematic Biology.

Freckleton RP and PH Harvey (2006) Detecting non-brownian evolution in adaptive radiations. PLoS Biology 4:e373.

Harmon LJ, JA Schulte, A Larson, and JB Losos (2003). Tempo and mode of evolutionary readiations in iguanian lizards. Science 301:961-964.

Examples


data(whales)

tmp <- treedata(whales$phy, whales$dat[,1])

phy <- tmp$phy
dat <- tmp$data[,1]

## compute observed statistics

nht.ols <- nh.test(phy, dat, regression.type = "lm",
log = TRUE, show.plot = FALSE)$coefficients[2,1]

nht.rlm <- nh.test(phy, dat, regression.type = "rlm",
 log = TRUE, show.plot = FALSE)$coefficients[2,1]

mdi.exp <- 0

#---- run short pp.mcmc

pp.eb <- pp.mcmc(phy, dat, Ngens = 1000, sampleFreq = 10, printFreq = 100, model ="EB")

# ---- plot results

# quartz(width = 5, height = 7)
par(mar = c(4,5,1,1))
par(mfcol = c(3,1))


hist(pp.eb$MDI, col = "gray", border = "gray", main = NULL, xlab = "pp.MDI",
ylab = "Frequency", cex.axis = 1.2)

abline(v = mdi.exp, col = "black", lwd = 3, lty = 2)

mdi.p <- length(which(pp.eb$MDI<=0))/length(pp.eb$MDI)

hist(pp.eb$node.height.slope.lm, col = "gray", border = "gray", main = NULL, xlab = "pp.nht_ols",
ylab = "Frequency", cex.axis = 1.2)

abline(v = nht.ols, col = "black", lwd = 3, lty = 2)

node.height.ols.p <- length(which(pp.eb$node.height.slope.lm <= nht.ols)) /
(length(pp.eb$node.height.slope.lm) +1)


hist(pp.eb$node.height.slope.rlm, col = "gray", border = "gray", main = NULL, xlab = "pp.nht_ols",
ylab = "Frequency", cex.axis = 1.2)

abline(v = nht.rlm, col = "black", lwd = 3, lty = 2)

node.height.rr.p <- length(which(pp.eb$node.height.slope.rlm <= nht.rlm)) /
(length(pp.eb$node.height.slope.rlm) +1)

[Package geiger version 2.0.11 Index]