R: Hierarchical Gaussian Process Model for Dissolution Profile...

hgp {BayesDissolution}

R Documentation

Hierarchical Gaussian Process Model for Dissolution Profile Modeling

Description

This function implements the Bayesian hierarchical Gaussian process model described in Pourmohamad et al (2022).

Usage

hgp(
  dis_data,
  locs,
  B = 1000,
  n_interp = 30,
  control = list(),
  adaptive = FALSE
)

Arguments

`dis_data`	A data frame containing the dissolution data. The first column of the data frame should denote the group labels identifying whether a given dissolution belongs to the "reference" or "test" formulation group. For a given dissolution run, the remaining columns of the data frame contains the individual run's dissolution measurements sorted in time. Alternatively, the user may provide a data object of class dis_data containing the dissolution data. See the `make_dis_data()` function for the particular structure of the data object.
`locs`	A vector in ascending order that corresponds to each time point the dissolution data was measured at.
`B`	A positive integer specifying the number of posterior samples to draw. By default `B` is set to 10000.
`n_interp`	An integer value specifying the number of time points to interpolate at. This sets the interploated points to be to `seq(1st time point, last time point, length = n_interp)`.
`control`	An optional list of priors and initial values, otherwise the default values/strategies found in Pourmohamad et al (2022) will be used. More specifically, `control` can be used to define the following settings: `sigma2_starting`: starting value for sigma^2 `tau2_starting`: starting value for tau^2 `phi_starting`: starting value for phi `psi_starting`: starting value for psi `sigma2_alpha` and `sigma2_beta`: parameters for the inverse gamma prior for sigma^2 `tau2_alpha` and `tau2_beta`: parameters for the inverse gamma prior for tau^2 `phi_alpha` and `phi_beta`: parameters for the gamma prior for phi `psi_alpha` and `psi_beta`: parameters for the gamma prior for psi `prop_phi`: proposal variance for the parameter phi `prop_psi`: proposal variance for the parameter psi
`adaptive`	logical; an option for using adaptive MCMC. If `adaptive = TRUE`, this will replace both `prop_phi` and `prop_psi` by using past MCMC draws to inform the proposal variance.

Value

The function returns a list of summary statistics and B posterior samples for parameters of the model. More specifically it returns:

delta: The average delta value over the posterior samples of delta. The definition of delta is given in Novick et. al 2015.
f2: The average f2 value over the posterior samples of f2.
mcmc_chains: A list of posterior samples for delta, f2, the mean parameters (mu_pars), and the covariance parameters (cov_pars).

Note

You should always check MCMC diagnostics on the posterior samples before drawing conclusions. Likewise, plots of the predicted dissolution curves should also be checked to evaluate if the model fit to the observed data seems reasonable.

References

Novick, S., Shen, Y., Yang, H., Peterson, J., LeBlond, D., and Altan, S. (2015). Dissolution Curve Comparisons Through the F2 Parameter, a Bayesian Extension of the f2 Statistic. Journal of Biopharmaceutical Statistics, 25(2):351-371.

Pourmohamad, T., Oliva Aviles, C.M., and Richardson, R. Gaussian Process Modeling for Dissolution Curve Comparisons. Journal of the Royal Statistical Society, Series C, 71(2):331-351.

Examples

### dis_data comes loaded with the package
### We set B = 100 to obtain 100 posterior samples, you probably want to run it
### longer for say, B = 100000, but B = 100 runs fast for illustrative purposes
### and passing CRAN checks
B <- 100

tp <- seq(10, 80, 10) # Time points
burnin <- B * 0.1     # A 10% burn-in
thin <- 10            # Keep every 10th sample, i.e., thinning
post <- hgp(dis_data, locs = tp, B = B, n_interp = 100)

### Example: Removing burn-in and then thinning the posterior samples for the covariance parameters
###          and then plotting the chains
phi <- post$mcmc_chains$cov_pars$phi[-c(1:burnin)]
phi <- phi[seq(1, (B-burnin), thin)]
psi <- post$mcmc_chains$cov_pars$psi[-c(1:burnin)]
psi <- psi[seq(1, (B-burnin), thin)]
sigma_R <- post$mcmc_chains$cov_pars$sigma_R[-c(1:burnin)]
sigma_R <- sigma_R[seq(1, (B-burnin), thin)]
sigma_T <- post$mcmc_chains$cov_pars$sigma_T[-c(1:burnin)]
sigma_T <- sigma_T[seq(1, (B-burnin), thin)]
tau <- post$mcmc_chains$cov_pars$tau[-c(1:burnin)]
tau <- tau[seq(1, (B-burnin), thin)]

chains <- data.frame( # Data frame holding posterior samples
samples = rep(1:((B-burnin)/thin), times = 5),
parameter = rep(c("phi", "psi", "sigma_R", "sigma_T", "tau"),
                each = (B-burnin)/thin),
values = c(phi, psi, sigma_R, sigma_T, tau))
chains$parameter <- factor(chains$parameter,
                           labels = c(expression(phi),
                                      expression(psi),
                                      expression(sigma[R]),
                                      expression(sigma[T]),
                                      expression(tau)))
ggplot2::ggplot(chains, ggplot2::aes(samples, values)) +
 ggplot2::geom_line() +
 ggplot2::labs(x = "Iterations", y = "Posterior Sample Values") +
 ggplot2::facet_wrap(~parameter, scales = "free",
             labeller = ggplot2::label_parsed) +
 ggplot2::theme(text = ggplot2::element_text(size = 22))

ggplot2::ggplot(chains, ggplot2::aes(values)) +
 ggplot2::geom_density() +
 ggplot2::labs(x = "Values", y = "Posterior Density") +
 ggplot2::facet_wrap(~parameter, scales = "free",
            labeller = ggplot2::label_parsed) +
 ggplot2::theme(text = ggplot2::element_text(size = 22))

### Plotting the predicted dissolution profiles
dissplot(dis_data, tp)
grid <- sort(c(tp, seq(min(tp), max(tp), length = 100)))
grid1 <- (1:B)[-(1:burnin)][seq(1, (B-burnin), thin)]
grid2 <- ((B+1):(2*B))[-(1:burnin)][seq(1, (B-burnin), thin)]
lines(grid, apply(post$mcmc_chains$mu_pars[,grid1], 1, mean),
      col = "gray65", lwd = 2)
lines(grid, apply(post$mcmc_chains$mu_pars[,grid2], 1, mean),
      col = "black", lwd = 2)
lower <- apply(post$mcmc_chains$mu_pars[,grid1], 1,
               quantile, prob = 0.025)
upper <- apply(post$mcmc_chains$mu_pars[,grid1], 1,
               quantile, prob = 0.975)
polygon(c(grid, rev(grid)), c(lower, rev(upper)),
        col = scales::alpha("gray65",.2), border = NA)
lower <- apply(post$mcmc_chains$mu_pars[,grid2], 1,
              quantile, prob = 0.025)
upper <- apply(post$mcmc_chains$mu_pars[,grid2], 1,
               quantile, prob = 0.975)
polygon(c(grid, rev(grid)), c(lower, rev(upper)),
        col = scales::alpha("black",.2), border = NA)

### If we want to calculate the Pr(f2 > 50 & delta < 15)
prob <- sum(post$mcmc_chains$f2[grid1] > 50 &
            post$mcmc_chains$delta[grid1] < 15) / ((B - burnin)/thin)

[Package BayesDissolution version 0.2.1 Index]