influ_phyglm {sensiPhy} | R Documentation
Influential species detection - Phylogenetic Logistic Regression
Description
Performs leave-one-out deletion analysis for phylogenetic logistic regression, and detects influential species.
Usage
influ_phyglm(formula, data, phy, btol = 50, cutoff = 2, track = TRUE, ...)
Arguments
formula | The model formula.
data | Data frame containing species traits with row names matching tips in phy.
phy | A phylogeny (class 'phylo') matching data.
btol | Bound on searching space. For details see phyloglm.
cutoff | The cutoff value used to identify influential species (see Details).
track | Print a report tracking function progress (default = TRUE).
... | Further arguments to be passed to phyloglm.
Details
This function sequentially removes one species at a time, fits a phylogenetic
logistic regression model using phyloglm, stores the results and detects
influential species.
Currently only logistic regression using the "logistic_MPLE" method from
phyloglm is implemented.
influ_phyglm detects influential species based on the standardised difference
in intercept and/or slope when removing a given species compared to the full
model including all species. Species with a standardised difference above the
value of cutoff are identified as influential. The default cutoff is 2, i.e. a
standardised difference of 2.
Currently, this function can only fit simple logistic models
(i.e. trait ~ predictor). In the future we will implement more complex models.
Output can be visualised using sensi_plot.
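The leave-one-out procedure described above can be sketched in base R. This is only an illustration under stated assumptions: glm() with a binomial family stands in for phyloglm (so the phylogeny is ignored), the data are simulated, and the standardisation (dividing each difference by the standard deviation of all differences, then comparing absolute values with cutoff) is an assumption about how the standardised difference is formed, not a statement of the package's internals.

```r
# Stand-in sketch: base-R glm() replaces phyloglm(), so no phylogeny is used.
set.seed(1)
n <- 40
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1 * dat$x))
rownames(dat) <- paste0("sp", seq_len(n))  # hypothetical species names

cutoff <- 2

# Full model including all species
full <- coef(glm(y ~ x, data = dat, family = binomial))

# Refit the model after deleting each species in turn
loo <- t(sapply(seq_len(n), function(i) {
  coef(glm(y ~ x, data = dat[-i, ], family = binomial))
}))
rownames(loo) <- rownames(dat)

# Raw and standardised differences from the full-model slope
DIFestimate  <- loo[, "x"] - full["x"]
sDIFestimate <- DIFestimate / sd(DIFestimate)  # assumed standardisation

# Species whose removal shifts the slope by more than `cutoff`
influential <- names(sDIFestimate)[abs(sDIFestimate) > cutoff]
influential
```

The same steps apply to the intercept by using the "(Intercept)" column of loo in place of "x".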
Value
The function influ_phyglm returns a list with the following components:
cutoff: The value selected for cutoff.
formula: The formula.
full.model.estimates: Coefficients, AIC and the optimised value of the
phylogenetic parameter (i.e. alpha) for the full model without deleted species.
influential_species: List of influential species, based on the standardised
difference in both the intercept and the slope of the regression. Species are
ordered from most to least influential, and only species with a standardised
difference > cutoff are included.
sensi.estimates: A data frame with all simulation estimates. Each row
represents a deleted species. Columns report the calculated regression
intercept (intercept), the difference between the simulation intercept and the
full model intercept (DIFintercept), the standardised difference
(sDIFintercept), the percentage change in intercept compared to the full model
(intercept.perc) and the intercept p-value (pval.intercept). All these
parameters are also reported for the regression slope (DIFestimate etc.).
Additionally, the model AIC value (AIC) and the optimised value (optpar) of the
phylogenetic parameter (i.e. alpha) are reported.
data: Original full dataset.
errors: Species where deletion resulted in errors.
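As a rough illustration of how the slope columns in sensi.estimates relate to one another, the sketch below builds a small table from made-up numbers. The estimates and species names are hypothetical, the column name estimate.perc is assumed by analogy with intercept.perc, and the percentage and standardisation formulas are assumptions rather than the package's documented computation.

```r
# Hypothetical slope estimates, for illustration only (not real output)
full_estimate <- 0.50
loo_estimate  <- c(sp1 = 0.52, sp2 = 0.47, sp3 = 0.80, sp4 = 0.49, sp5 = 0.51)

sensi <- data.frame(
  estimate    = loo_estimate,                  # slope with one species deleted
  DIFestimate = loo_estimate - full_estimate   # difference from the full model
)
# Assumed standardisation: divide by the SD of all differences
sensi$sDIFestimate  <- sensi$DIFestimate / sd(sensi$DIFestimate)
# Assumed percentage change relative to the full-model slope
sensi$estimate.perc <- round(abs(sensi$DIFestimate / full_estimate) * 100, 1)

# Order species from most to least influential
ranked <- sensi[order(abs(sensi$sDIFestimate), decreasing = TRUE), ]
ranked
```

Ranking on the absolute standardised difference treats increases and decreases in the slope alike.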
Author(s)
Gustavo Paterno & Gijsbert D.A. Werner
References
Paterno, G. B., Penone, C. and Werner, G. D. A. 2018. "sensiPhy: An r-package for sensitivity analysis in phylogenetic comparative methods". Methods in Ecology and Evolution 9(6):1461-1467.
Ho, L. S. T. and Ane, C. 2014. "A linear-time algorithm for Gaussian and non-Gaussian trait evolution models". Systematic Biology 63(3):397-408.
See Also
phyloglm, samp_phyglm, influ_phylm, sensi_plot
Examples
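# Load packages (rtree is from ape; rTrait and rbinTrait are from phylolm):
library(ape)
library(phylolm)
library(sensiPhy)
# Simulate data: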
set.seed(6987)
phy = rtree(100)
x = rTrait(n=1,phy=phy)
X = cbind(rep(1,100),x)
y = rbinTrait(n=1,phy=phy, beta=c(-1,0.5), alpha=.7 ,X=X)
dat = data.frame(y, x)
# Run sensitivity analysis:
influ <- influ_phyglm(y ~ x, data = dat, phy = phy)
# To check summary results and most influential species:
summary(influ)
# Visual diagnostics for species removal:
sensi_plot(influ)