visualise_effects_data {DImodelsVis}R Documentation

Prepare data for effects plots for compositional data

Description

The helper function to create the underlying data for visualising the effect of increasing or decreasing (or both) the proportion of a variable from a set of compositional variables. This is a special case of the simplex_path function where the end points are either the monoculture (i.e. variable of interest = 1, while all others equal 0) of the variable of interest (when increasing the proportion) or a community without the variable of interest (when decreasing the proportion). The observations specified in 'data' are connected to the respective communities (monoculture of the variable of interest or the community without the variable of interest) by a straight line across the simplex; This has the effect of changing the proportion of the variable of interest whilst adjusting the proportion of the other variables but keeping the ratio of their relative proportions unchanged, thereby preserving the compositional nature of the data. See examples for more information. The output of this function can be passed to the visualise_effects_plot function to visualise the results.

Usage

visualise_effects_data(
  data,
  prop,
  var_interest = NULL,
  effect = c("increase", "decrease", "both"),
  add_var = list(),
  prediction = TRUE,
  ...
)

Arguments

data

A dataframe specifying the initial communities of interest for which to visualise the effect of increasing/decreasing a variable. If a model object is specified then this data should contain all the variables present in the model object including any additional variables not part of the simplex design. If a coefficient vector is specified then data should contain same number of columns as the number of elements in the coefficient vector and a one-to-one positional mapping would be assumed between the data columns and the elements of the coefficient vector.

prop

A vector of column names or indices identifying the columns containing the variable proportions (i.e., compositional columns) in the data.

var_interest

A character vector specifying the variable for which to visualise the effect of change on the response. If left blank, all variables would be assumed to be of interest.

effect

One of "increase", "decrease" or "both" to indicate whether to look at the effect of increasing the proportion, decreasing the proportion or doing both simultaneously, respectively on the response. The default in "increasing".

add_var

A list specifying values for additional variables in the model other than the proportions (i.e. not part of the simplex design). This would be useful to compare the predictions across different values for a categorical variable. One plot will be generated for each unique combination of values specified here.

prediction

A logical value indicating whether to pass the final data to 'add_prediction' and add predictions to the data. Default value is TRUE, but often it would be desirable to make additional changes to the data before making any predictions, so the user can set this to FALSE and manually call the 'add_prediction' function.

...

Arguments passed on to add_prediction

model

A regression model object which will be used to make predictions for the observations in 'data'. Will override 'coefficients' if specified.

coefficients

If a regression model is not available (or can't be fit in R), the regression coefficients from a model fit in some other language can be used to calculate predictions. However, the user would have to ensure there's an appropriate one-to-one positional mapping between the data columns and the coefficient values. Further, they would also have to provide a variance-covariance matrix of the coefficients in the 'vcov' parameter if they want the associated CI for the prediction or it would not be possible to calculate confidence/prediction intervals using this method.

vcov

If regression coefficients are specified, then the variance-covariance matrix of the coefficients can be specified here to calculate the associated confidence interval around each prediction. Failure to do so would result in no confidence intervals being returned. Ensure 'coefficients' and 'vcov' have the same positional mapping with the data.

coeff_cols

If 'coefficients' are specified and a one-to-one positional mapping between the data-columns and coefficient vector is not present. A character string or numeric index can be specified here to reorder the data columns and match the corresponding coefficient value to the respective data column. See the "Use model coefficients for prediction" section in examples.

conf.level

The confidence level for calculating confidence/prediction intervals. Default is 0.95.

interval

Type of interval to calculate:

"none" (default)

No interval to be calculated.

"confidence"

Calculate a confidence interval.

"prediction"

Calculate a prediction interval.

Value

A data frame with the following columns appended at the end

.Sp

An identifier column to discern the variable of interest being modified in each curve.

.Proportion

The value of the variable of interest within the community.

.Group

An identifier column to discern between the different curves.

.add_str_ID

An identifier column for grouping the cartesian product of all additional columns specified in 'add_var' parameter (if 'add_var' is specified).

.Pred

The predicted response for each observation.

.Lower

The lower limit of the prediction/confidence interval for each observation.

.Upper

The upper limit of the prediction/confidence interval for each observation.

.Marginal

The marginal change in the response (first derivative) with respect to the gradual change in the proportion of the species of interest.

.Threshold

A numeric value indicating the maximum proportion of the species of interest within a particular community which has a positive marginal effect on the response.

.MarEffect

A character string entailing whether the increase/decrease of the species of interest from the particular community would result in a positive or negative marginal effect on the response.

.Effect

An identifier column signifying whether considering the effect of species addition or species decrease.

Examples

library(DImodels)

## Load data
data(sim1)

## Fit model
mod <- glm(response ~ p1 + p2 + p3 + p4 + 0, data = sim1)

## Create data for visualising effect of increasing the proportion of
## variable p1 in data
## Notice how the proportion of `p1` increases while the proportion of
## the other variables decreases whilst maintaining their relative proportions
head(visualise_effects_data(data = sim1, prop = c("p1", "p2", "p3", "p4"),
                            var_interest = "p1", effect = "increase",
                            model = mod))

## Create data for visualising the effect of decreasing the proportion
## variable p1 in data using `effect = "decrease"`
head(visualise_effects_data(data = sim1, prop = c("p1", "p2", "p3", "p4"),
                            var_interest = "p1", effect = "decrease",
                            model = mod))

## Create data for visualising the effect of increasing and decreasing the
## proportion variable p3 in data using `effect = "both"`
head(visualise_effects_data(data = sim1, prop = c("p1", "p2", "p3", "p4"),
                            var_interest = "p3", effect = "decrease",
                            model = mod))

## Getting prediction intervals at a 99% confidence level
head(visualise_effects_data(data = sim1, prop = c("p1", "p2", "p3", "p4"),
                            var_interest = "p1", effect = "decrease",
                            model = mod, conf.level = 0.99,
                            interval = "prediction"))

## Adding additional variables to the data using `add_var`
## Notice the new .add_str_ID column in the output
sim1$block <- as.numeric(sim1$block)
new_mod <- update(mod, ~ . + block, data = sim1)
head(visualise_effects_data(data = sim1[, 3:6], prop = c("p1", "p2", "p3", "p4"),
                            var_interest = "p1", effect = "both",
                            model = new_mod,
                            add_var = list("block" = c(1, 2))))

## Create data for visualising effect of decreasing variable p2 from
## the original communities in the data but using model coefficients
## When specifying coefficients the data should have a one-to-one
## positional mapping with specified coefficients.
init_comms <- sim1[, c("p1", "p2", "p3", "p4")]
head(visualise_effects_data(data = init_comms, prop = 1:4,
                            var_interest = "p2",
                            effect = "decrease",
                            interval = "none",
                            coefficients = mod$coefficients))

## Note that to get confidence interval when specifying
## model coefficients we'd also need to provide a variance covariance
## matrix using the `vcov` argument
head(visualise_effects_data(data = init_comms, prop = 1:4,
                            var_interest = "p2",
                            effect = "decrease",
                            interval = "confidence",
                            coefficients = mod$coefficients,
                            vcov = vcov(mod)))

## Can also create only the intermediary communities without predictions
## by specifying prediction = FALSE.
## Any additional columns can then be added and the `add_prediction` function
## can be manually called.
## Note: If calling the `add_prediction` function manually, the data would
## not contain information about the marginal effect of changing the species
## interest
effects_data <- visualise_effects_data(data = init_comms, prop = 1:4,
                                       var_interest = "p2",
                                       effect = "decrease",
                                       prediction = FALSE)
head(effects_data)
## Prediction using model object
head(add_prediction(data = effects_data, model = mod, interval = "prediction"))
## Prediction using regression coefficients
head(add_prediction(data = effects_data, coefficients = mod$coefficients))

[Package DImodelsVis version 1.0.1 Index]