hgp {BayesDissolution}R Documentation

Hierarchical Gaussian Process Model for Dissolution Profile Modeling


This function implements the Bayesian hierarchical Gaussian process model described in Pourmohamad et al (2022).


  B = 1000,
  n_interp = 30,
  control = list(),
  adaptive = FALSE



A data frame containing the dissolution data. The first column of the data frame should denote the group labels identifying whether a given dissolution belongs to the "reference" or "test" formulation group. For a given dissolution run, the remaining columns of the data frame contains the individual run's dissolution measurements sorted in time.


A vector in ascending order that corresponds to each time point the dissolution data was measured at.


A positive integer specifying the number of posterior samples to draw. By default B is set to 10000.


An integer value specifying the number of time points to interpolate at. This sets the interploated points to be to seq(1st time point, last time point, length = n_interp).


An optional list of priors and initial values, otherwise the default values/strategies found in Pourmohamad et al (2022) will be used. More specifically, control can be used to define the following settings:

  • sigma2_starting: starting value for sigma^2

  • tau2_starting: starting value for tau^2

  • phi_starting: starting value for phi

  • psi_starting: starting value for psi

  • sigma2_alpha and sigma2_beta: parameters for the inverse gamma prior for sigma^2

  • tau2_alpha and tau2_beta: parameters for the inverse gamma prior for tau^2

  • phi_alpha and phi_beta: parameters for the gamma prior for phi

  • psi_alpha and psi_beta: parameters for the gamma prior for psi

  • prop_phi: proposal variance for the parameter phi

  • prop_psi: proposal variance for the parameter psi


logical; an option for using adaptive MCMC. If adaptive = TRUE, this will replace both prop_phi and prop_psi by using past MCMC draws to inform the proposal variance.


The function returns a list of summary statistics and B posterior samples for parameters of the model. More specifically it returns:


You should always check MCMC diagnostics on the posterior samples before drawing conclusions. Likewise, plots of the predicted dissolution curves should also be checked to evaluate if the model fit to the observed data seems reasonable.


Novick, S., Shen, Y., Yang, H., Peterson, J., LeBlond, D., and Altan, S. (2015). Dissolution Curve Comparisons Through the F2 Parameter, a Bayesian Extension of the f2 Statistic. Journal of Biopharmaceutical Statistics, 25(2):351-371.

Pourmohamad, T., Oliva Aviles, C.M., and Richardson, R. Gaussian Process Modeling for Dissolution Curve Comparisons. Journal of the Royal Statistical Society, Series C, 71(2):331-351.


### dis_data comes loaded with the package
### We set B = 100 to obtain 100 posterior samples, you probably want to run it
### longer for say, B = 100000, but B = 100 runs fast for illustrative purposes
### and passing CRAN checks
B <- 100

tp <- seq(10, 80, 10) # Time points
burnin <- B * 0.1     # A 10% burn-in
thin <- 10            # Keep every 10th sample, i.e., thinning
post <- hgp(dis_data, locs = tp, B = B, n_interp = 100)

### Example: Removing burn-in and then thinning the posterior samples for the covariance parameters
###          and then plotting the chains
phi <- post$mcmc_chains$cov_pars$phi[-c(1:burnin)]
phi <- phi[seq(1, (B-burnin), thin)]
psi <- post$mcmc_chains$cov_pars$psi[-c(1:burnin)]
psi <- psi[seq(1, (B-burnin), thin)]
sigma_R <- post$mcmc_chains$cov_pars$sigma_R[-c(1:burnin)]
sigma_R <- sigma_R[seq(1, (B-burnin), thin)]
sigma_T <- post$mcmc_chains$cov_pars$sigma_T[-c(1:burnin)]
sigma_T <- sigma_T[seq(1, (B-burnin), thin)]
tau <- post$mcmc_chains$cov_pars$tau[-c(1:burnin)]
tau <- tau[seq(1, (B-burnin), thin)]

chains <- data.frame( # Data frame holding posterior samples
samples = rep(1:((B-burnin)/thin), times = 5),
parameter = rep(c("phi", "psi", "sigma_R", "sigma_T", "tau"),
                each = (B-burnin)/thin),
values = c(phi, psi, sigma_R, sigma_T, tau))
chains$parameter <- factor(chains$parameter,
                           labels = c(expression(phi),
ggplot2::ggplot(chains, ggplot2::aes(samples, values)) +
 ggplot2::geom_line() +
 ggplot2::labs(x = "Iterations", y = "Posterior Sample Values") +
 ggplot2::facet_wrap(~parameter, scales = "free",
             labeller = ggplot2::label_parsed) +
 ggplot2::theme(text = ggplot2::element_text(size = 22))

ggplot2::ggplot(chains, ggplot2::aes(values)) +
 ggplot2::geom_density() +
 ggplot2::labs(x = "Values", y = "Posterior Density") +
 ggplot2::facet_wrap(~parameter, scales = "free",
            labeller = ggplot2::label_parsed) +
 ggplot2::theme(text = ggplot2::element_text(size = 22))

### Plotting the predicted dissolution profiles
dissplot(dis_data, tp)
grid <- sort(c(tp, seq(min(tp), max(tp), length = 100)))
grid1 <- (1:B)[-(1:burnin)][seq(1, (B-burnin), thin)]
grid2 <- ((B+1):(2*B))[-(1:burnin)][seq(1, (B-burnin), thin)]
lines(grid, apply(post$mcmc_chains$mu_pars[,grid1], 1, mean),
      col = "gray65", lwd = 2)
lines(grid, apply(post$mcmc_chains$mu_pars[,grid2], 1, mean),
      col = "black", lwd = 2)
lower <- apply(post$mcmc_chains$mu_pars[,grid1], 1,
               quantile, prob = 0.025)
upper <- apply(post$mcmc_chains$mu_pars[,grid1], 1,
               quantile, prob = 0.975)
polygon(c(grid, rev(grid)), c(lower, rev(upper)),
        col = scales::alpha("gray65",.2), border = NA)
lower <- apply(post$mcmc_chains$mu_pars[,grid2], 1,
              quantile, prob = 0.025)
upper <- apply(post$mcmc_chains$mu_pars[,grid2], 1,
               quantile, prob = 0.975)
polygon(c(grid, rev(grid)), c(lower, rev(upper)),
        col = scales::alpha("black",.2), border = NA)

### If we want to calculate the Pr(f2 > 50 & delta < 15)
prob <- sum(post$mcmc_chains$f2[grid1] > 50 &
            post$mcmc_chains$delta[grid1] < 15) / ((B - burnin)/thin)

[Package BayesDissolution version 0.1.0 Index]