R: Compare growth curves

marginal_comparison.bgmfit {bsitar}

R Documentation

Compare growth curves

Description

The marginal_comparison() function estimates and compare growth curves such as distance and velocity. This function is a wrapper around the marginaleffects::comparisons() and marginaleffects::avg_comparisons(). The marginaleffects::comparisons() computes unit-level (conditional) estimates whereas marginaleffects::avg_comparisons() return average (marginal) estimates. A detailed explanation is available here.

Usage

## S3 method for class 'bgmfit'
marginal_comparison(
  model,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  newdata = NULL,
  datagrid = NULL,
  re_formula = NA,
  allow_new_levels = FALSE,
  sample_new_levels = "gaussian",
  xrange = 1,
  digits = 2,
  numeric_cov_at = NULL,
  aux_variables = NULL,
  levels_id = NULL,
  avg_reffects = NULL,
  idata_method = NULL,
  ipts = NULL,
  seed = 123,
  future = FALSE,
  future_session = "multisession",
  cores = NULL,
  average = FALSE,
  plot = FALSE,
  showlegends = NULL,
  variables = NULL,
  deriv = NULL,
  deriv_model = NULL,
  comparison = "difference",
  type = NULL,
  by = FALSE,
  conf_level = 0.95,
  transform = NULL,
  cross = FALSE,
  wts = NULL,
  hypothesis = NULL,
  equivalence = NULL,
  eps = NULL,
  reformat = NULL,
  estimate_center = NULL,
  estimate_interval = NULL,
  dummy_to_factor = NULL,
  verbose = FALSE,
  expose_function = FALSE,
  usesavedfuns = NULL,
  clearenvfuns = NULL,
  envir = NULL,
  ...
)

marginal_comparison(model, ...)

Arguments

`model`	An object of class `bgmfit`.
`resp`	A character string (default `NULL`) to specify response variable when processing posterior draws for the `univariate_by` and `multivariate` models. See `bsitar()` for details on `univariate_by` and `multivariate` models
`ndraws`	A positive integer indicating the number of posterior draws to be used in estimation. If `NULL` (default), all draws are used.
`draw_ids`	An integer indicating the specific posterior draw(s) to be used in estimation (default `NULL`).
`newdata`	An optional data frame to be used in estimation. If `NULL` (default), the `newdata` is retrieved from the `model`.
`datagrid`	Generate a grid of user-specified values for use in the `newdata` argument in various functions of the marginaleffects package. This is useful to define where in the predictor space we want to evaluate the quantities of interest. See `marginaleffects::datagrid()` for details. The default value for the `datagrid` is `NULL` implying that no custom grid is constructed. To set a data grid, the argument should be a data.frame constructed by using the `marginaleffects::datagrid()` function, or else a named list which are internally used for setting up the grid. For the user convenience, we also allow setting an empty list `datagrid = list()` in which case essential arguments such as `model`, `newdata` are taken up from the respective arguments specified elsewhere. Further, the level 1 predictor (such as age) and any covariate included in the model fit (e.g., gender) are also automatically inferred from the `model` object.
`re_formula`	Option to indicate whether or not to include the individual/group-level effects in the estimation. When `NA` (default), the individual-level effects are excluded and therefore population average growth parameters are computed. When `NULL`, individual-level effects are included in the computation and hence the growth parameters estimates returned are individual-specific. In both situations, (i.e,, `NA` or `NULL`), continuous and factor covariate(s) are appropriately included in the estimation. The continuous covariates by default are set to their means (see `numeric_cov_at` for details) whereas factor covariates are left unaltered thereby allowing estimation of covariate specific population average and individual-specific growth parameter.
`allow_new_levels`	A flag indicating if new levels of group-level effects are allowed (defaults to `FALSE`). Only relevant if `newdata` is provided.
`sample_new_levels`	Indicates how to sample new levels for grouping factors specified in `re_formula`. This argument is only relevant if `newdata` is provided and `allow_new_levels` is set to `TRUE`. If `"uncertainty"` (default), each posterior sample for a new level is drawn from the posterior draws of a randomly chosen existing level. Each posterior sample for a new level may be drawn from a different existing level such that the resulting set of new posterior draws represents the variation across existing levels. If `"gaussian"`, sample new levels from the (multivariate) normal distribution implied by the group-level standard deviations and correlations. This options may be useful for conducting Bayesian power analysis or predicting new levels in situations where relatively few levels where observed in the old_data. If `"old_levels"`, directly sample new levels from the existing levels, where a new level is assigned all of the posterior draws of the same (randomly chosen) existing level.
`xrange`	An integer to set the predictor range (i.e., age) when executing the interpolation via `ipts`. The default `NULL` sets the individual specific predictor range whereas code `xrange = 1` sets identical range for individuals within the same higher grouping variable (e.g., study). Code `xrange = 2` sets the identical range across the entire sample. Lastly, a paired numeric values can be supplied e.g., `xrange = c(6, 20)` to set the range within those values.
`digits`	An integer (default `2`) to set the decimal places for the estimates. The `digits` is passed on to the `base::round()` function.
`numeric_cov_at`	An optional (named list) argument to specify the value of continuous covariate(s). The default `NULL` option set the continuous covariate(s) at their mean. Alternatively, a named list can be supplied to manually set these values. For example, `numeric_cov_at = list(xx = 2)` will set the continuous covariate varibale 'xx' at 2. The argument `numeric_cov_at` is ignored when no continuous covariate is included in the model.
`aux_variables`	An optional argument to specify the variable(s) that can be passed to the `ipts` argument (see below). This is useful when fitting location scale models and measurement error models. An indication to use `aux_variables` is when post processing functions throw an error such as `variable 'x' not found either 'data' or 'data2'`
`levels_id`	An optional argument to specify the `ids` for hierarchical model (default `NULL`). It is used only when model is applied to the data with 3 or more levels of hierarchy. For a two level model, the `levels_id` is automatically inferred from the model fit. Even for 3 or higher level model, the `levels_id` is inferred from the model fit but under the assumption that hierarchy is specified from lowest to upper most level i.e, `id` followed by `study` where `id` is nested within the `study` Note that it is not guaranteed that the `levels_id` is sorted correctly, and therefore it is better to set it manually when fitting a model with three or more levels of hierarchy.
`avg_reffects`	An optional argument (default `NULL`) to calculate (marginal/average) curves and growth parameters such as APGV and PGV. If specified, it must be a named list indicating the `over` (typically level 1 predictor, such as age), `feby` (fixed effects, typically a factor variable), and `reby` (typically `NULL` indicating that parameters are integrated over the random effects) such as `avg_reffects = list(feby = 'study', reby = NULL, over = 'age')`.
`idata_method`	A character string to indicate the interpolation method. The number of of interpolation points is set up the `ipts` argument. Options available for `idata_method` are method 1 (specified as `'m1'`) and method 2 (specified as `'m2'`). The method 1 (`'m1'`) is adapted from the the iapvbs package and is documented here https://rdrr.io/github/Zhiqiangcao/iapvbs/src/R/exdata.R whereas method 2 (`'m2'`) is based on the JMbayes package as documented here https://github.com/drizopoulos/JMbayes/blob/master/R/dynPred_lme.R. The `'m1'` method works by internally constructing the data frame based on the model configuration whereas the method `'m2'` uses the exact data frame used in model fit and can be accessed via `fit$data`. If `idata_method = NULL, default`, then method `'m2'` is automatically set. Note that method `'m1'` might fail in some cases when model involves covariates particularly when model is fit as `univariate_by`. Therefore, it is advised to switch to method `'m2'` in case `'m1'` results in error.
`ipts`	An integer to set the length of the predictor variable to get a smooth velocity curve. The `NULL` will return original values whereas an integer such as `ipts = 10` (default) will interpolate the predictor. It is important to note that these interpolations do not alter the range of predictor when calculating population average and/or the individual specific growth curves.
`seed`	An integer (default `123`) that is passed to the estimation method.
`future`	A logical (default `FALSE`) to specify whether or not to perform parallel computations. If set to `TRUE`, the `future.apply::future_sapply()` function is used to summarize draws.
`future_session`	A character string to set the session type when `future = TRUE`. The `'multisession'` (default) options sets the multisession whereas the `'multicore'` sets the multicore session. Note that option `'multicore'` is not supported on Windows systems. For more details, see `future.apply::future_sapply()`.
`cores`	Number of cores to be used when running the parallel computations (if `future = TRUE`). On non-Windows systems this argument can be set globally via the mc.cores option. For the default `NULL` option, the number of cores are set automatically by calling the `future::availableCores()`. The number of cores used are the maximum number of cores avaialble minus one, i.e., `future::availableCores() - 1`.
`average`	A logical to indicate whether to internally call the `marginaleffects::comparisons()` or the `marginaleffects::avg_comparisons()` function. If `FALSE` (default), `marginaleffects::comparisons()` is called otherwise `marginaleffects::avg_comparisons()` when `average = TRUE`.
`plot`	A logical to specify whether to plot comparisons by calling the `marginaleffects::plot_comparisons()` function (`FALSE`) or not (`FALSE`). If `FALSE` (default), then `marginaleffects::comparisons()` or `marginaleffects::avg_comparisons()` are called to compute predictions (see `average` for details)
`showlegends`	An argument to specify whether to show legends (`TRUE`) or not (`FALSE`). If `NULL` (default), then `showlegends` is internally set to `TRUE` if `re_formula = NA`, and `FALSE` if `re_formula = NULL`.
`variables`	For estimating growth parameters in the current use case, the `variables` is the level 1 predictor such as `age`/`time`. The `variables` is a named list where value is set via the `esp` argument (default 1e-6). If `NULL`, the `variables` is set internally by retrieving the relevant information from the `model`. Otherwise, user can define it as follows: `variables = list('x' = 1e-6)` where `'x'` is the level 1 predictor. Note that `variables = list('age' = 1e-6)` is the default behavior for the marginaleffects because velocity is typically calculated by differentiating the distance curve via `dydx` approach, and therefore argument `deriv` is automatically set as `0` and `deriv_model` as `FALSE`. If user want to estimate parameters based on the model based first derivative, then argument `deriv` must be set as `1` and internally argument `variables` is defined as `variables = list('age' = 0)` i.e, original level 1 predictor variable, `'x'`. It is important to consider that if default behavior is used i.e, `deriv = 0` and `variables = list('x' = 1e-6)`, then user can not pass additional arguments to the `variables` argument. On the other hand, alternative approach i.e, `deriv = 0` and `variables = list('x' = 0)`, additional options can be passed to the `marginaleffects::comparisons()` and `marginaleffects::avg_comparisons()` functions.
`deriv`	A numeric to specify whether to estimate parameters based on the differentiation of the distance curve or the model based first derivative. Please see argument `variables` for more details.
`deriv_model`	A logical to specify whether to estimate velocity curve from the derivative function, or the differentiation of the distance curve. The argument `deriv_model` is set to `TRUE` for those functions which need velocity curve such as `growthparameters()` and `plot_curves()`, and `NULL` for functions which explicitly use the distance curve (i.e., fitted values) such as `loo_validation()` and `plot_ppc()`.
`comparison`	For estimating growth parameters in the current use case, options allowed for the `comparison` are `'difference'` and `'differenceavg'`. Note that `comparison` is a placeholder and is only used to setup the the internal function that estimates `'parameter'` via `sitar::getPeak()`, `sitar::getTakeoff()` and `sitar::getTrough()` functions to estimate various growth parameters. Options `'difference'` and `'differenceavg'` are internally restructured according to the user specified `hypothesis` argument.
`type`	string indicates the type (scale) of the predictions used to compute contrasts or slopes. This can differ based on the model type, but will typically be a string such as: "response", "link", "probs", or "zero". When an unsupported string is entered, the model-specific list of acceptable values is returned in an error message. When `type` is `NULL`, the first entry in the error message is used by default.
`by`	Aggregate unit-level estimates (aka, marginalize, average over). Valid inputs: `FALSE`: return the original unit-level estimates. `TRUE`: aggregate estimates for each term. Character vector of column names in `newdata` or in the data frame produced by calling the function without the `by` argument. Data frame with a `by` column of group labels, and merging columns shared by `newdata` or the data frame produced by calling the same function without the `by` argument. See examples below. For more complex aggregations, you can use the `FUN` argument of the `hypotheses()` function. See that function's documentation and the Hypothesis Test vignettes on the `marginaleffects` website.
`conf_level`	numeric value between 0 and 1. Confidence level to use to build a confidence interval.
`transform`	string or function. Transformation applied to unit-level estimates and confidence intervals just before the function returns results. Functions must accept a vector and return a vector of the same length. Support string shortcuts: "exp", "ln"
`cross`	`FALSE`: Contrasts represent the change in adjusted predictions when one predictor changes and all other variables are held constant. `TRUE`: Contrasts represent the changes in adjusted predictions when all the predictors specified in the `variables` argument are manipulated simultaneously (a "cross-contrast").
`wts`	string or numeric: weights to use when computing average contrasts or slopes. These weights only affect the averaging in `⁠avg_*()⁠` or with the `by` argument, and not the unit-level estimates themselves. Internally, estimates and weights are passed to the `weighted.mean()` function. string: column name of the weights variable in `newdata`. When supplying a column name to `wts`, it is recommended to supply the original data (including the weights variable) explicitly to `newdata`. numeric: vector of length equal to the number of rows in the original data or in `newdata` (if supplied).
`hypothesis`	specify a hypothesis test or custom contrast using a numeric value, vector, or matrix, a string, or a string formula. Numeric: Single value: the null hypothesis used in the computation of Z and p (before applying `transform`). Vector: Weights to compute a linear combination of (custom contrast between) estimates. Length equal to the number of rows generated by the same function call, but without the `hypothesis` argument. Matrix: Each column is a vector of weights, as describe above, used to compute a distinct linear combination of (contrast between) estimates. The column names of the matrix are used as labels in the output. String formula to specify linear or non-linear hypothesis tests. If the `term` column uniquely identifies rows, terms can be used in the formula. Otherwise, use `b1`, `b2`, etc. to identify the position of each parameter. The `⁠b⁠` wildcard can be used to test hypotheses on all estimates. Examples: `hp = drat` `hp + drat = 12` `b1 + b2 + b3 = 0` `⁠b / b1 = 1⁠` String: "pairwise": pairwise differences between estimates in each row. "reference": differences between the estimates in each row and the estimate in the first row. "sequential": difference between an estimate and the estimate in the next row. "revpairwise", "revreference", "revsequential": inverse of the corresponding hypotheses, as described above. See the Examples section below and the vignette: https://marginaleffects.com/vignettes/hypothesis.html
`equivalence`	Numeric vector of length 2: bounds used for the two-one-sided test (TOST) of equivalence, and for the non-inferiority and non-superiority tests. See Details section below.
`eps`	NULL or numeric value which determines the step size to use when calculating numerical derivatives: (f(x+eps)-f(x))/eps. When `eps` is `NULL`, the step size is 0.0001 multiplied by the difference between the maximum and minimum values of the variable with respect to which we are taking the derivative. Changing `eps` may be necessary to avoid numerical problems in certain models.
`reformat`	A logical (default `TRUE`) to reformat the output returned by the `marginaleffects` as a data.frame with column names re-defined as follows: `conf.low` as `Q2.5`, and `conf.high` as `Q97.5` (assuming that `conf_int = 0.95`). Also, following columns are dropped from the data frame: `term`, `contrast`, `tmp_idx`, `predicted_lo`, `predicted_hi`, `predicted`.
`estimate_center`	A character string (default `NULL`) to specify whether to center estimate as `'mean'` or as `'median'`. Note that `estimate_center` is used to set the global options as follows: `options("marginaleffects_posterior_center" = "mean")`, or `options("marginaleffects_posterior_center" = "median")` The pre-specified global options are restored on exit via the `base::on.exit()`.
`estimate_interval`	A character string (default `NULL`) to specify whether to compute credible intervals as equal-tailed intervals, `'eti'` or highest density intervals, `'hdi'`. Note that `estimate_interval` is used to set the global options as follows: `options("marginaleffects_posterior_interval" = "eti")`, or `options("marginaleffects_posterior_interval" = "hdi")` The pre-specified global options are restored on exit via the `base::on.exit()`.
`dummy_to_factor`	A named list (default `NULL`) that is used to convert dummy variables into a factor variable. The named elements are `factor.dummy`, `factor.name`, and `factor.level`. The `factor.dummy` is a vector of character strings that need to be converted to a factor variable whereas the `factor.name` is a single character string that is used to name the newly created factor variable. The `factor.level` is used to name the levels of newly created factor. When `factor.name` is `NULL`, then the factor name is internally set as `'factor.var'`. If `factor.level` is `NULL`, then names of factor levels are take from the `factor.dummy` i.e., the factor levels are assigned same name as `factor.dummy`. Note that when `factor.level` is not `NULL`, its length must be same as the length of the `factor.dummy`.
`verbose`	An optional argument (logical, default `FALSE`) to indicate whether to print information collected during setting up the object(s).
`expose_function`	An optional logical argument to indicate whether to expose Stan functions (default `FALSE`). Note that if user has already exposed Stan functions during model fit by setting `expose_function = TRUE` in the `bsitar()`, then those exposed functions are saved and can be used during post processing of the posterior draws and therefore `expose_function` is by default set as `FALSE` in all post processing functions except `optimize_model()`. For `optimize_model()`, the default setting is `expose_function = NULL`. The reason is that each optimized model has different Stan function and therefore it need to be re exposed and saved. The `expose_function = NULL` implies that the setting for `expose_function` is taken from the original `model` fit. Note that `expose_function` must be set to `TRUE` when adding `fit criteria` and/or `bayes_R2` during model optimization.
`usesavedfuns`	A logical (default `NULL`) to indicate whether to use the already exposed and saved `Stan` functions. Depending on whether the user have exposed Stan functions within the `bsitar()` call via `expose_functions` argument in the `bsitar()`, the `usesavedfuns` is automatically set to `TRUE` (if `expose_functions = TRUE`) or `FALSE` (if `expose_functions = FALSE`). Therefore, manual setting of `usesavedfuns` as `TRUE`/`FALSE` is rarely needed. This is for internal purposes only and mainly used during the testing of the functions and therefore should not be used by users as it might lead to unreliable estimates.
`clearenvfuns`	A logical to indicate whether to clear the exposed function from the environment (`TRUE`) or not (`FALSE`). If `NULL` (default), then `clearenvfuns` is set as `TRUE` when `usesavedfuns` is `TRUE`, and `FALSE` if `usesavedfuns` is `FALSE`.
`envir`	Environment used for function evaluation. The default is `NULL` which will set `parent.frame()` as default environment. Note that since most of post processing functions are based on brms, the functions needed for evaluation should be in the `.GlobalEnv`. Therefore, it is strongly recommended to set `envir = globalenv()` (or `envir = .GlobalEnv`). This is particularly true for the derivatives such as velocity curve.
`...`	Further arguments passed to `brms::fitted.brmsfit()` and `brms::predict()` functions.

Value

A data frame objects with estimates and CIs for computed parameter(s)

Author(s)

Satpal Sandhu satpal.sandhu@bristol.ac.uk

References

There are no references for Rd macro ⁠\insertAllCites⁠ on this help page.

Examples



# Fit Bayesian SITAR model 

# To avoid mode estimation which takes time, the Bayesian SITAR model fit to 
# the 'berkeley_exdata' has been saved as an example fit ('berkeley_exfit').
# See 'bsitar' function for details on 'berkeley_exdata' and 'berkeley_exfit'.

# Check and confirm whether model fit object 'berkeley_exfit' exists
 berkeley_exfit <- getNsObject(berkeley_exfit)

model <- berkeley_exfit

marginal_comparison(model, parameter = 'apgv', draw_ids = 1)

[Package bsitar version 0.2.1 Index]