plot_curves.bgmfit {bsitar}

R Documentation

Plot growth curves

Description

plot_curves() visualizes six types of growth curves using the ggplot2 package. It also lets users build their own detailed plots from the underlying data, which can be returned as a data.frame.

Usage

## S3 method for class 'bgmfit'
plot_curves(
  model,
  opt = "dv",
  apv = FALSE,
  bands = NULL,
  conf = 0.95,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  newdata = NULL,
  summary = TRUE,
  digits = 2,
  re_formula = NULL,
  numeric_cov_at = NULL,
  aux_variables = NULL,
  levels_id = NULL,
  avg_reffects = NULL,
  ipts = 10,
  deriv_model = TRUE,
  xrange = NULL,
  xrange_search = NULL,
  takeoff = FALSE,
  trough = FALSE,
  acgv = FALSE,
  acgv_velocity = 0.1,
  seed = 123,
  estimation_method = "fitted",
  allow_new_levels = FALSE,
  sample_new_levels = "uncertainty",
  incl_autocor = TRUE,
  robust = FALSE,
  future = FALSE,
  future_session = "multisession",
  cores = NULL,
  trim = 0,
  layout = "single",
  linecolor = NULL,
  linecolor1 = NULL,
  linecolor2 = NULL,
  label.x = NULL,
  label.y = NULL,
  legendpos = NULL,
  linetype.apv = NULL,
  linewidth.main = NULL,
  linewidth.apv = NULL,
  linetype.groupby = NA,
  color.groupby = NA,
  band.alpha = NULL,
  show_age_takeoff = TRUE,
  show_age_peak = TRUE,
  show_age_cessation = TRUE,
  show_vel_takeoff = FALSE,
  show_vel_peak = FALSE,
  show_vel_cessation = FALSE,
  returndata = FALSE,
  returndata_add_parms = FALSE,
  parms_eval = FALSE,
  idata_method = NULL,
  parms_method = "getPeak",
  verbose = FALSE,
  fullframe = NULL,
  dummy_to_factor = NULL,
  expose_function = FALSE,
  usesavedfuns = NULL,
  clearenvfuns = NULL,
  envir = NULL,
  ...
)

plot_curves(model, ...)

Arguments

`model`	An object of class `bgmfit`.
`opt`	A character string containing letter(s) corresponding to the following plotting options: 'd' for the population average distance curve, 'v' for the population average velocity curve, 'D' for individual-specific distance curves, 'V' for individual-specific velocity curves, 'u' for unadjusted individual-specific distance curves, and 'a' for adjusted individual-specific distance curves (adjusted for the random effects). Options 'd' and 'D' cannot be specified simultaneously. Likewise, options 'v' and 'V' cannot be specified simultaneously. All other combinations are allowed, for example 'dvau', 'Dvau', 'dVau', or 'DVau'.
`apv`	An optional logical (default `FALSE`) specifying whether or not to calculate and plot the age at peak growth velocity (APGV) when `opt` includes 'v' or 'V'.
`bands`	A character string containing letter(s), or `NULL` (default), indicating whether CI bands are plotted around the distance and velocity curves (and also the APGV). If `NULL`, no band is plotted. Alternatively, the user can specify a string with any one of the following or their combination(s): `'d'` for a band around the distance curve, `'v'` for a band around the velocity curve, and `'p'` for a band around the vertical line denoting the APGV parameter. The string `'dvp'` will include CI bands for the distance and velocity curves and the APGV.
`conf`	A numeric value (default `0.95`) to be used to compute the CI and hence the width of the `bands`. See `growthparameters()` for further details.
`resp`	A character string (default `NULL`) to specify the response variable when processing posterior draws for the `univariate_by` and `multivariate` models. See `bsitar()` for details on `univariate_by` and `multivariate` models.
`ndraws`	A positive integer indicating the number of posterior draws to be used in estimation. If `NULL` (default), all draws are used.
`draw_ids`	An integer indicating the specific posterior draw(s) to be used in estimation (default `NULL`).
`newdata`	An optional data frame to be used in estimation. If `NULL` (default), the `newdata` is retrieved from the `model`.
`summary`	A logical indicating whether only the estimate should be computed (`TRUE`, default), or estimate along with SE and CI should be returned (`FALSE`). Setting `summary` as `FALSE` will increase the computation time.
`digits`	An integer (default `2`) to set the decimal argument for the `base::round()` function.
`re_formula`	Option to indicate whether or not to include the individual/group-level effects in the estimation. When `NA`, the individual-level effects are excluded and therefore population average growth parameters are computed. When `NULL` (the default shown in Usage), individual-level effects are included in the computation and hence the growth parameter estimates returned are individual-specific. In both situations (i.e., `NA` or `NULL`), continuous and factor covariate(s) are appropriately included in the estimation. The continuous covariates are by default set to their means (see `numeric_cov_at` for details), whereas factor covariates are left unaltered, thereby allowing estimation of covariate-specific population average and individual-specific growth parameters.
`numeric_cov_at`	An optional (named list) argument to specify the value of continuous covariate(s). The default `NULL` option sets the continuous covariate(s) at their mean. Alternatively, a named list can be supplied to manually set these values. For example, `numeric_cov_at = list(xx = 2)` will set the continuous covariate variable 'xx' at 2. The argument `numeric_cov_at` is ignored when no continuous covariate is included in the model.
`aux_variables`	An optional argument to specify the variables to be passed to the `ipts` argument. This is useful when fitting location scale models and the measurement error models.
`levels_id`	An optional argument to specify the `ids` for a hierarchical model (default `NULL`). It is used only when the model is applied to data with 3 or more levels of hierarchy. For a two-level model, the `levels_id` is automatically inferred from the model fit. Even for a model with 3 or more levels, the `levels_id` is inferred from the model fit, but under the assumption that the hierarchy is specified from the lowest to the uppermost level, i.e., `id` followed by `study`, where `id` is nested within `study`. Note that it is not guaranteed that the `levels_id` is sorted correctly, and therefore it is better to set it manually when fitting a model with three or more levels of hierarchy.
`avg_reffects`	An optional argument (default `NULL`) to calculate (marginal/average) curves and growth parameters such as APGV and PGV. If specified, it must be a named list indicating the `over` (typically level 1 predictor, such as age), `feby` (fixed effects, typically a factor variable), and `reby` (typically `NULL` indicating that parameters are integrated over the random effects) such as `avg_reffects = list(feby = 'study', reby = NULL, over = 'age')`.
`ipts`	An integer to set the length of the predictor variable to get a smooth velocity curve. The `NULL` will return original values whereas an integer such as `ipts = 10` (default) will interpolate the predictor. It is important to note that these interpolations do not alter the range of predictor when calculating population average and/or the individual specific growth curves.
`deriv_model`	A logical to specify whether to estimate velocity curve from the derivative function, or the differentiation of the distance curve. The argument `deriv_model` is set to `TRUE` for those functions which need velocity curve such as `growthparameters()` and `plot_curves()`, and `NULL` for functions which explicitly use the distance curve (i.e., fitted values) such as `loo_validation()` and `plot_ppc()`.
`xrange`	An integer to set the predictor range (i.e., age) when executing the interpolation via `ipts`. The default `NULL` sets the individual-specific predictor range, whereas `xrange = 1` sets an identical range for individuals within the same higher grouping variable (e.g., study). Setting `xrange = 2` sets an identical range across the entire sample. Lastly, a pair of numeric values can be supplied, e.g., `xrange = c(6, 20)`, to set the range within those values.
`xrange_search`	A vector of length two, or a character string `'range'` to set the range of predictor variable (`x` ) within which growth parameters are searched. This is useful when there is more than one peak and user wants to summarize peak within a given range of the `x` variable. Default `xrange_search = NULL`.
`takeoff`	A logical (default `FALSE`) to indicate whether or not to calculate the age at takeoff velocity (ATGV) and the takeoff growth velocity (TGV) parameters.
`trough`	A logical (default `FALSE`) to indicate whether or not to calculate the age at cessation of growth velocity (ACGV) and the cessation of growth velocity (CGV) parameters.
`acgv`	A logical (default `FALSE`) to indicate whether or not to calculate the age at cessation of growth velocity from the velocity curve. If `TRUE`, the age at cessation of growth velocity (ACGV) and the cessation growth velocity (CGV) are calculated based on the percentage of the peak growth velocity as defined by the `acgv_velocity` argument (see below). The `acgv_velocity` is typically set at 10 percent of the peak growth velocity. The ACGV and CGV are calculated along with the uncertainty (SE and CI) around them.
`acgv_velocity`	Specify the percentage of the peak growth velocity to be used when estimating `acgv`. The default value is `0.10` i.e., 10 percent of the peak growth velocity.
`seed`	An integer (default `123`) that is passed to the estimation method.
`estimation_method`	A character string to specify the estimation method when calculating the velocity from the posterior draws. The `'fitted'` method internally calls `fitted_draws()`, whereas the `'predict'` option calls `predict_draws()`. See `brms::fitted.brmsfit()` and `brms::predict.brmsfit()` for details.
`allow_new_levels`	A flag indicating if new levels of group-level effects are allowed (defaults to `FALSE`). Only relevant if `newdata` is provided.
`sample_new_levels`	Indicates how to sample new levels for grouping factors specified in `re_formula`. This argument is only relevant if `newdata` is provided and `allow_new_levels` is set to `TRUE`. If `"uncertainty"` (default), each posterior sample for a new level is drawn from the posterior draws of a randomly chosen existing level. Each posterior sample for a new level may be drawn from a different existing level such that the resulting set of new posterior draws represents the variation across existing levels. If `"gaussian"`, sample new levels from the (multivariate) normal distribution implied by the group-level standard deviations and correlations. This option may be useful for conducting Bayesian power analysis or predicting new levels in situations where relatively few levels were observed in the old data. If `"old_levels"`, directly sample new levels from the existing levels, where a new level is assigned all of the posterior draws of the same (randomly chosen) existing level.
`incl_autocor`	A flag indicating if correlation structures originally specified via `autocor` should be included in the predictions. Defaults to `TRUE`.
`robust`	A logical to specify the summarize options. If `FALSE` (the default) the mean is used as the measure of central tendency and the standard deviation as the measure of variability. If `TRUE`, the median and the median absolute deviation (MAD) are applied instead. Ignored if `summary` is `FALSE`.
`future`	A logical (default `FALSE`) to specify whether or not to perform parallel computations. If set to `TRUE`, the `future.apply::future_sapply()` function is used to summarize draws.
`future_session`	A character string to set the session type when `future = TRUE`. The `'multisession'` (default) option sets the multisession, whereas `'multicore'` sets the multicore session. Note that option `'multicore'` is not supported on Windows systems. For more details, see `future.apply::future_sapply()`.
`cores`	Number of cores to be used when running the parallel computations (if `future = TRUE`). On non-Windows systems this argument can be set globally via the mc.cores option. For the default `NULL` option, the number of cores is set automatically by calling `future::availableCores()`. The number of cores used is the maximum number of cores available minus one, i.e., `future::availableCores() - 1`.
`trim`	A number (default 0) of long line segments to be excluded from the plot with option 'u' or 'a'. See `sitar::plot.sitar` for details.
`layout`	A character string defining the layout structure of the plot. A `'single'` (default) layout provides overlaid distance and velocity curves on a single plot when `opt` includes the `'dv'`, `'Dv'`, `'dV'` or `'DV'` options. Similarly, when `opt` includes `'au'`, the adjusted and unadjusted curves are plotted as a single plot. When `opt` is a single letter (e.g., `'d'`, `'v'`, `'D'`, `'V'`, `'a'`, `'u'`), the `'single'` option is ignored. The alternative layout option, `'facet'`, uses `facet_wrap` from `ggplot2` to map and draw the plot when `opt` includes two or more letters.
`linecolor`	The color of the line used when layout is `'facet'`. The default is `NULL`, which internally sets the `linecolor` to `'grey50'`.
`linecolor1`	The color of first line when layout is `'single'`. For example, for `opt = 'dv'`, the color of distance line is controlled by the `linecolor1`. Default `NULL` will internally set `linecolor1` as `'orange2'`.
`linecolor2`	The color of second line when layout is `'single'`. For example, for `opt = 'dv'`, the color of velocity line is controlled by the `linecolor2`. Default `NULL` sets the color `'green4'` for `linecolor2`.
`label.x`	An optional character string to label the x axis. When `NULL` (default), the x axis label is taken from the predictor (e.g., age).
`label.y`	An optional character string to label the y axis. When `NULL` (default), the y axis label is taken from the type of plot (e.g., distance, velocity etc.). Note that when layout option is `'facet'`, then y axis label is removed and instead the same label is used as a title.
`legendpos`	An optional character string to specify the position of legends. When `NULL` (default), the legend position is set as 'bottom' for distance and velocity curves with the `'single'` layout option for the population average curves, and `'none'` for the individual-specific curves. The `'none'` option suppresses all legends, which avoids printing a legend entry for each individual.
`linetype.apv`	An optional character string to specify the type of the vertical line drawn to mark the APGV. Default `NULL` sets the linetype as `dotted`.
`linewidth.main`	An optional numeric value to specify the width of the line for the distance and velocity curves. The default `NULL` will set it as 0.35.
`linewidth.apv`	An optional numeric value to specify the width of the vertical line drawn to mark the APGV. The default `NULL` will set it as 0.25.
`linetype.groupby`	An optional argument to specify the line type for the distance and velocity curves when drawing plots for a model that includes factor covariate(s), or when visualising individual-specific distance/velocity curves (default `NA`). Setting it to `NULL` automatically sets the line type for each factor level or individual and adds legends for the factor levels or individuals, whereas `NA` sets a 'solid' line type and suppresses legends. It is recommended to use the `NULL` option when plotting population average curves for a model that includes factor covariates, because this sets the legends appropriately; otherwise it is difficult to tell which curve belongs to which factor level. For individual-specific curves, the line type can be set to `NULL` when the number of individuals is small. However, when the number of individuals is large, `NA` is a better choice because it prevents printing a large number of legend entries, one for each individual.
`color.groupby`	An optional argument to specify the line color for the distance and velocity curves when drawing plots for a model that includes factor covariate(s), or when visualising individual-specific distance/velocity curves (default `NA`). Setting it to `NULL` automatically sets the line color for each factor level or individual and adds legends for the factor levels or individuals. However, setting it as `NA` will set a 'solid' line type and suppress legends. It is recommended to use the `NULL` option when plotting population average curves for factor covariates, because this sets the legends appropriately; otherwise it is difficult to tell which curve belongs to which factor level. For individual-specific curves, the line color can be set to `NULL` when the number of individuals is small. However, when the number of individuals is large, `NA` is a better choice because it prevents printing a large number of legend entries, one for each individual.
`band.alpha`	An optional numeric value to specify the transparency of the CI band(s) around the distance curve, velocity curve and the line indicating the APGV. The default `NULL` will set this value to 0.4.
`show_age_takeoff`	A logical (default `TRUE`) to indicate whether to display the ATGV line(s) on the plot.
`show_age_peak`	A logical (default `TRUE`) to indicate whether to display the APGV line(s) on the plot.
`show_age_cessation`	A logical (default `TRUE`) to indicate whether to display the ACGV line(s) on the plot.
`show_vel_takeoff`	A logical (default `FALSE`) to indicate whether to display the TGV line(s) on the plot.
`show_vel_peak`	A logical (default `FALSE`) to indicate whether to display the PGV line(s) on the plot.
`show_vel_cessation`	A logical (default `FALSE`) to indicate whether to display the CGV line(s) on the plot.
`returndata`	A logical (default `FALSE`) indicating whether to plot the data or return the data. If `TRUE`, the data is returned as a `data.frame`.
`returndata_add_parms`	A logical (default `FALSE`) indicating whether to add growth parameters to the returned data. The `returndata_add_parms` is ignored when `returndata = FALSE`. If `TRUE`, growth parameters such as `APGV` and `PGV` are added to the returned `data.frame`. Note that growth parameters are estimated only when the `opt` argument includes either the `'v'` or the `'V'` option and the argument `apv` is set to `TRUE`. If either of these conditions is not met, `returndata_add_parms` is ignored.
`parms_eval`	A logical to specify whether or not to get growth parameters on the fly. This is for internal use only and mainly needed for compatibility across internal functions.
`idata_method`	A character string to indicate the interpolation method. The number of interpolation points is set by the `ipts` argument. Options available for `idata_method` are method 1 (specified as `'m1'`) and method 2 (specified as `'m2'`). Method 1 (`'m1'`) is adapted from the iapvbs package and is documented here https://rdrr.io/github/Zhiqiangcao/iapvbs/src/R/exdata.R whereas method 2 (`'m2'`) is based on the JMbayes package as documented here https://github.com/drizopoulos/JMbayes/blob/master/R/dynPred_lme.R. The `'m1'` method works by internally constructing the data frame based on the model configuration, whereas the `'m2'` method uses the exact data frame used in the model fit, which can be accessed via `fit$data`. If `idata_method = NULL` (default), then method `'m2'` is automatically set. Note that method `'m1'` might fail in some cases when the model involves covariates, particularly when the model is fit as `univariate_by`. Therefore, it is advised to switch to method `'m2'` in case `'m1'` results in an error.
`parms_method`	A character string to specify the method used when evaluating `parms_eval`. The default is `getPeak`, which uses the `sitar::getPeak()` function from the `sitar` package. The alternative option is `findpeaks`, which uses the `pracma::findpeaks()` function from the `pracma` package. This is for internal use only and mainly needed for compatibility across internal functions.
`verbose`	An optional argument (logical, default `FALSE`) to indicate whether to print information collected during setting up the object(s).
`fullframe`	A logical to indicate whether to return the `fullframe` object in which `newdata` is bound to the summary estimates. Note that `fullframe` cannot be combined with `summary = FALSE`. Furthermore, `fullframe` can only be used when `idata_method = 'm2'`. A particular use case is when fitting a `univariate_by` model. The `fullframe` is mainly for internal use only.
`dummy_to_factor`	A named list (default `NULL`) that is used to convert dummy variables into a factor variable. The named elements are `factor.dummy`, `factor.name`, and `factor.level`. The `factor.dummy` is a vector of character strings that need to be converted to a factor variable, whereas the `factor.name` is a single character string that is used to name the newly created factor variable. The `factor.level` is used to name the levels of the newly created factor. When `factor.name` is `NULL`, the factor name is internally set as `'factor.var'`. If `factor.level` is `NULL`, the names of the factor levels are taken from the `factor.dummy`, i.e., the factor levels are assigned the same names as `factor.dummy`. Note that when `factor.level` is not `NULL`, its length must be the same as the length of the `factor.dummy`.
`expose_function`	An optional logical argument to indicate whether to expose Stan functions (default `FALSE`). Note that if the user has already exposed Stan functions during the model fit by setting `expose_function = TRUE` in `bsitar()`, then those exposed functions are saved and can be used during post-processing of the posterior draws. Therefore `expose_function` is by default set as `FALSE` in all post-processing functions except `optimize_model()`. For `optimize_model()`, the default setting is `expose_function = NULL`. The reason is that each optimized model has different Stan functions, which therefore need to be re-exposed and saved. The `expose_function = NULL` implies that the setting for `expose_function` is taken from the original `model` fit. Note that `expose_function` must be set to `TRUE` when adding `fit criteria` and/or `bayes_R2` during model optimization.
`usesavedfuns`	A logical (default `NULL`) to indicate whether to use the already exposed and saved `Stan` functions. Depending on whether the user has exposed Stan functions within the `bsitar()` call via its `expose_function` argument, the `usesavedfuns` is automatically set to `TRUE` (if `expose_function = TRUE`) or `FALSE` (if `expose_function = FALSE`). Therefore, manual setting of `usesavedfuns` as `TRUE`/`FALSE` is rarely needed. This is for internal purposes only and mainly used during the testing of the functions, and therefore should not be used by users as it might lead to unreliable estimates.
`clearenvfuns`	A logical to indicate whether to clear the exposed function from the environment (`TRUE`) or not (`FALSE`). If `NULL` (default), then `clearenvfuns` is set as `TRUE` when `usesavedfuns` is `TRUE`, and `FALSE` if `usesavedfuns` is `FALSE`.
`envir`	Environment used for function evaluation. The default is `NULL` which will set `parent.frame()` as default environment. Note that since most of post processing functions are based on brms, the functions needed for evaluation should be in the `.GlobalEnv`. Therefore, it is strongly recommended to set `envir = globalenv()` (or `envir = .GlobalEnv`). This is particularly true for the derivatives such as velocity curve.
`...`	Further arguments passed to the `brms::fitted.brmsfit()` and `brms::predict.brmsfit()` functions.
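
The `deriv_model` argument above distinguishes between obtaining the velocity curve from a derivative function and obtaining it by differentiating the distance curve. As a minimal base R sketch of that distinction (not bsitar's internal code), the toy logistic curve below and all of its parameter values are assumptions chosen for illustration only; it compares an analytic derivative with a central finite difference of the same distance curve.

```r
# Toy logistic distance curve (illustrative values only, not a SITAR model)
distance <- function(age) 100 + 80 / (1 + exp(-0.9 * (age - 12)))

# Analytic first derivative of the toy curve, i.e. the velocity
velocity_analytic <- function(age) {
  z <- exp(-0.9 * (age - 12))
  80 * 0.9 * z / (1 + z)^2
}

age <- seq(6, 18, by = 0.1)

# Velocity obtained by central finite differences of the distance curve
h <- 1e-4
velocity_numeric <- (distance(age + h) - distance(age - h)) / (2 * h)

# Largest disagreement between the two approaches on this grid
max_abs_diff <- max(abs(velocity_numeric - velocity_analytic(age)))
print(max_abs_diff)
```

On a smooth curve such as this one the two approaches agree closely, so the choice mainly matters for how the velocity is computed from posterior draws.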

Details

plot_curves() is a generic function that visualizes the following six curves: population average distance curve, population average velocity curve, individual-specific distance curves, individual-specific velocity curves, unadjusted individual growth curves (i.e., observed growth curves), and adjusted individual growth curves (adjusted for the model-estimated random effects). plot_curves() internally calls the growthparameters() function to estimate and summarise the distance and velocity curves and to estimate growth parameters such as the age at peak growth velocity (APGV). plot_curves() in turn calls the fitted_draws() or the predict_draws() function to make inference from the posterior draws. Thus, plot_curves() allows plotting of fitted or predicted curves. See fitted_draws() and predict_draws() for details on these functions and the difference between fitted and predicted values.
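
To make the growth parameters mentioned above concrete, the following base R sketch reads APGV, PGV, ACGV and CGV off a single velocity curve. It follows the definition given for the `acgv` and `acgv_velocity` arguments (cessation at a stated fraction of the peak velocity) but is an illustration only: the toy velocity curve and its values are assumptions, and bsitar itself relies on functions such as `sitar::getPeak()` rather than this code.

```r
# Toy velocity curve on a fine age grid (illustrative values only)
age <- seq(6, 18, by = 0.01)
z <- exp(-0.9 * (age - 12))
velocity <- 80 * 0.9 * z / (1 + z)^2

# Peak: age at peak growth velocity (APGV) and peak growth velocity (PGV)
peak_idx <- which.max(velocity)
apgv <- age[peak_idx]
pgv <- velocity[peak_idx]

# Cessation: first age after the peak at which the velocity has fallen to
# acgv_velocity (here 10 percent) of the peak velocity
acgv_velocity <- 0.10
after_peak <- seq_along(age) > peak_idx
cess_idx <- which(after_peak & velocity <= acgv_velocity * pgv)[1]
acgv <- age[cess_idx]
cgv <- velocity[cess_idx]

print(c(APGV = apgv, PGV = pgv, ACGV = acgv, CGV = cgv))
```

If the velocity never falls below the threshold within the age grid, `cess_idx` is `NA`, which is one reason the search range for the predictor matters.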

Value

A plot object (default), or a data.frame when returndata = TRUE.

Author(s)

Satpal Sandhu satpal.sandhu@bristol.ac.uk

Examples


# Fit Bayesian SITAR model 

# To avoid model estimation, which takes time, the Bayesian SITAR model fit to 
# the 'berkeley_exdata' has been saved as an example fit ('berkeley_exfit').
# See 'bsitar' function for details on 'berkeley_exdata' and 'berkeley_exfit'.

# Check and confirm whether model fit object 'berkeley_exfit' exists
berkeley_exfit <- getNsObject(berkeley_exfit)

model <- berkeley_exfit

# Population average distance and velocity curves with default options
plot_curves(model, opt = 'dv')


# Individual-specific distance and velocity curves with default options
# Note that legendpos = 'none' (the default for individual-specific curves)
# suppresses the legends. This suppression is useful when plotting
# individual-specific curves

plot_curves(model, opt = 'DV')

# Population average distance and velocity curves with APGV

plot_curves(model, opt = 'dv', apv = TRUE)

# Individual-specific distance and velocity curves with APGV

plot_curves(model, opt = 'DV', apv = TRUE)

# Population average distance curve, velocity curve, and APGV with CI bands
# To construct CI bands, growth parameters are first calculated for each  
# posterior draw and then summarized across draws. Therefore, the summary 
# option must be set to FALSE

plot_curves(model, opt = 'dv', apv = TRUE, bands = 'dvp', summary = FALSE)

# Adjusted and unadjusted individual curves
# Note ipts = NULL, i.e., no interpolation of the predictor (age) to plot a 
# smooth curve. This is because it does not make sense to interpolate data 
# when estimating adjusted curves. Also, layout = 'facet' (and not the default 
# layout = 'single') is used for ease of visualizing the plotted 
# adjusted and unadjusted individual curves. However, these lines can be 
# superimposed on each other by setting layout = 'single'.
# For the other plots shown above, layout can be set as 'single' or 'facet'

# Separate plots for adjusted and unadjusted curves (layout = 'facet')
plot_curves(model, opt = 'au', ipts = NULL, layout = 'facet')

# Superimposed adjusted and unadjusted curves (layout = 'single')
plot_curves(model, opt = 'au', ipts = NULL, layout = 'single')
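
The comment on the CI-band example above notes that growth parameters are first calculated for each posterior draw and then summarized across draws. The base R sketch below illustrates that two-step idea on simulated draws. It is not bsitar code: the simulated velocity curves, their values, and all object names are assumptions for illustration. The mean/SD and median/MAD pairs correspond to the two kinds of summary described for the `robust` argument, and `conf` sets the interval width.

```r
set.seed(123)
n_draws <- 500
age <- seq(8, 16, by = 0.05)

# Simulated "posterior draws" of a toy velocity curve: every draw has its own
# peak age and peak height (illustrative values only)
peak_age <- rnorm(n_draws, mean = 12, sd = 0.3)
peak_vel <- rnorm(n_draws, mean = 9, sd = 0.4)
vel_draws <- sapply(seq_len(n_draws), function(i) {
  peak_vel[i] * exp(-0.5 * ((age - peak_age[i]) / 1.5)^2)
})  # matrix with one row per age and one column per draw

# Step 1: calculate the parameter (APGV) separately for each draw
apgv_draws <- age[apply(vel_draws, 2, which.max)]

# Step 2: summarize across draws
conf <- 0.95
probs <- c((1 - conf) / 2, 1 - (1 - conf) / 2)
apgv_summary <- c(Estimate = mean(apgv_draws),
                  Est.Error = sd(apgv_draws),
                  quantile(apgv_draws, probs))
apgv_robust <- c(Estimate = median(apgv_draws),
                 Est.Error = mad(apgv_draws))

# Pointwise band around the velocity curve, one interval per age
vel_band <- t(apply(vel_draws, 1, quantile, probs = probs))

print(round(apgv_summary, 2))
print(round(apgv_robust, 2))
```

Summarizing the curve first and locating the peak afterwards would give a point estimate but no draw-level variation, which is why the band example needs the per-draw results.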

[Package bsitar version 0.2.1 Index]