gep_by_nera {disprofas} | R Documentation |
Get points on confidence region bounds by Newton-Raphson search
Description
The function gep_by_nera()
finds points that ideally sit on specific confidence region bounds (CRB)
with the aid of the “Method of Lagrange Multipliers” (MLM) and of
“Newton-Raphson” (nera) optimisation. The multivariate confidence
region for profiles with, e.g., four time points is an “ellipse”
in four dimensions.
Usage
gep_by_nera(n_p, kk, mean_diff, m_vc, ff_crit, y, max_trial, tol)
Arguments
n_p |
A positive integer that specifies the number of (time) points. |
kk |
A non-negative numeric value that specifies the scaling factor K (see the
section “Similarity limits in terms of MSD” below). |
mean_diff |
A vector of the mean differences between the dissolution
profiles or model parameters of the reference and the test batch(es) or
the averages of the model parameters of a specific group of batch(es)
(reference or test). It must have the length specified by the parameter
n_p. |
m_vc |
The pooled variance-covariance matrix of the dissolution
profiles or model parameters of the reference and the test batch(es) or
the variance-covariance matrix of the model parameters of a specific
group of batch(es) (reference or test). It must have the dimension
n_p × n_p. |
ff_crit |
The critical F value (see the section “Similarity limits in terms of MSD”
below). |
y |
A numeric vector of length n_p + 1 that serves as the starting point for
the Newton-Raphson search; the last element corresponds to the Lagrange
multiplier (see Value). |
max_trial |
A positive integer that specifies the maximum number of Newton-Raphson search rounds to be performed. |
tol |
A non-negative numeric that specifies the accepted minimal difference between two consecutive search rounds. |
Details
The function gep_by_nera()
determines the points on the CRB for each of the n_p
time points. It does so with the aid of the “Method of Lagrange
Multipliers” (MLM) and of “Newton-Raphson” (nera) optimisation, as
proposed by Margaret Connolly (Connolly 2000).
For more information, see the sections “Comparison of highly variable dissolution profiles” and “Similarity limits in terms of MSD” below.
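The exact equation system that gep_by_nera() solves is not reproduced here. Purely as an illustration of the principle, the following self-contained sketch uses the Method of Lagrange Multipliers together with a Newton-Raphson iteration to locate a point on an ellipsoidal bound of the form given in the section “Similarity limits in terms of MSD” below; the objective sum(y), the starting values and the name nera_demo are illustrative assumptions, not part of the package.

nera_demo <- function(d, s, kk, ff_crit, max_trial = 100, tol = 1e-9) {
  # d: centre of the ellipsoid (e.g. a vector of mean differences)
  # s: variance-covariance matrix, kk: scaling factor, ff_crit: critical F value
  # Point searched: maximum of sum(y) subject to
  #   kk * t(y - d) %*% solve(s) %*% (y - d) = ff_crit
  p <- length(d)
  s_inv <- solve(s)
  y <- d + 1                  # starting point off the centre
  lambda <- 1                 # starting value of the Lagrange multiplier
  for (i in seq_len(max_trial)) {
    u <- s_inv %*% (y - d)
    g <- drop(kk * t(y - d) %*% u) - ff_crit      # constraint g(y) = 0
    f_vec <- c(1 - 2 * kk * lambda * u,           # stationarity in y
               -g)                                # stationarity in lambda (constraint)
    jac <- rbind(cbind(-2 * kk * lambda * s_inv, -2 * kk * u),
                 c(-2 * kk * t(u), 0))
    step <- solve(jac, f_vec)                     # Newton-Raphson step
    y <- y - step[1:p]
    lambda <- lambda - step[p + 1]
    if (max(abs(step)) < tol) {
      return(list(point = y, lambda = lambda, n_trial = i, converged = TRUE))
    }
  }
  list(point = y, lambda = lambda, n_trial = max_trial, converged = FALSE)
}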
Value
A list with the following elements is returned:
points |
A matrix with one column and n_p + 1 rows; the first n_p rows contain the
points found by the NR algorithm and the last row contains the value of the
Lagrange multiplier (lambda). |
converged |
A logical indicating whether the NR algorithm converged or not. |
points.on.crb |
A logical indicating whether the points found by the NR
algorithm sit on the confidence region bounds (CRB) or not. |
n.trial |
Number of trials until convergence. |
max.trial |
Maximal number of trials. |
tol |
A non-negative numeric value that specifies the accepted minimal difference between two consecutive search rounds, i.e. the tolerance. |
Comparison of highly variable dissolution profiles
When comparing the dissolution data of a post-approval change product and a
reference approval product, the goal is to assess the similarity between the
mean dissolution values at the observed sample time points. A widely used
method is the f_2
method that was introduced by Moore & Flanner (1996).
Similarity testing criteria based on f_2
can be found in several FDA
guidelines and in the guideline of the European Medicines Agency (EMA)
“On the investigation of bioequivalence” (EMA 2010).
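The f_2 statistic itself is not computed by gep_by_nera(). For orientation only, a minimal sketch of the similarity factor of Moore & Flanner (1996) is shown below; ref and test are assumed to be vectors of mean % release values at the same time points (illustrative inputs, not objects of this package).

f2_factor <- function(ref, test) {
  # f2 = 50 * log10(100 / sqrt(1 + mean squared difference between the profiles))
  50 * log10(100 / sqrt(1 + sum((ref - test)^2) / length(ref)))
}
f2_factor(ref = c(30, 60, 80, 90), test = c(27, 55, 75, 88))  # about 69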
In situations where within-batch variation is greater than 15%, FDA
guidelines recommend use of a multivariate confidence interval as an
alternative to the f_2
method. This can be done using the following
stepwise procedure:
1. Establish a similarity limit in terms of “Multivariate Statistical Distance” (MSD) based on inter-batch differences in % drug release from reference (standard approved) formulations, i.e. the so-called “Equivalence Margin” (EM).
2. Calculate the MSD between test and reference mean dissolutions.
3. Estimate the 90% confidence interval (CI) of the true MSD as determined in step 2.
4. Compare the upper limit of the 90% CI with the similarity limit determined in step 1. The test formulation is declared to be similar to the reference formulation if the upper limit of the 90% CI is less than or equal to the similarity limit.
Similarity limits in terms of MSD
For the calculation of the “Multivariate Statistical Distance” (MSD), the procedure proposed by Tsong et al. (1996) can be considered a well-accepted method that is recommended by the FDA. According to this method, a multivariate statistical distance, called Mahalanobis distance, is used to measure the difference between two multivariate means. This distance measure is calculated as

D_M = \sqrt{ \left( \bm{x}_T - \bm{x}_R \right)^{\top}
  \bm{S}_{pooled}^{-1} \left( \bm{x}_T - \bm{x}_R \right) } ,

where \bm{S}_{pooled} is the sample variance-covariance matrix pooled
across the comparative groups, \bm{x}_T and \bm{x}_R are the vectors of
the sample means for the test (T) and reference (R) profiles, and
\bm{S}_T and \bm{S}_R are the variance-covariance matrices of the test
and reference profiles. The pooled variance-covariance matrix
\bm{S}_{pooled} is calculated by

\bm{S}_{pooled} = \frac{(n_R - 1) \bm{S}_R + (n_T - 1) \bm{S}_T}{n_R + n_T - 2} .
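A minimal sketch of these two formulas follows; ref and test are assumed to be (n × p) matrices of % release values (rows: vessels or tablets, columns: time points) and are illustrative inputs, not objects of this package.

msd_mahalanobis <- function(ref, test) {
  n_r <- nrow(ref)
  n_t <- nrow(test)
  # Pooled variance-covariance matrix S_pooled
  s_pooled <- ((n_r - 1) * cov(ref) + (n_t - 1) * cov(test)) / (n_r + n_t - 2)
  # Mahalanobis distance D_M between the two mean vectors
  d <- colMeans(test) - colMeans(ref)
  sqrt(drop(t(d) %*% solve(s_pooled) %*% d))
}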
In order to determine the similarity limits in terms of the MSD, i.e. the Mahalanobis distance between the two multivariate means of the dissolution profiles of the formulations to be compared, Tsong et al. (1996) proposed using the equation

D_M^{max} = \sqrt{ \bm{d}_g^{\top} \bm{S}_{pooled}^{-1} \bm{d}_g } ,

where \bm{d}_g is a 1 \times p vector whose p elements are all equal to an
empirically defined limit d_g, e.g., 15%, for the maximum tolerable
difference at all time points, and p is the number of sampling points. By
assuming that the data follow a multivariate normal distribution, the 90%
confidence region (CR) bounds for the true difference between the mean
vectors, \bm{\mu}_T - \bm{\mu}_R, can be computed for the resultant vector
\bm{\mu} to satisfy the following condition:

CR = K \left( \bm{\mu} - \left( \bm{x}_T - \bm{x}_R \right) \right)^{\top}
  \bm{S}_{pooled}^{-1} \left( \bm{\mu} - \left( \bm{x}_T - \bm{x}_R \right) \right)
  \leq F_{p, n_T + n_R - p - 1, 0.9} ,

where K is the scaling factor that is calculated as

K = \frac{n_T n_R}{n_T + n_R} \; \frac{n_T + n_R - p - 1}{\left( n_T + n_R - 2 \right) p} ,

and F_{p, n_T + n_R - p - 1, 0.9} is the 90th percentile of the F
distribution with p and n_T + n_R - p - 1 degrees of freedom, where n_T
and n_R are the numbers of observations of the test and the reference
group, respectively, and p is the number of sampling or time points, as
mentioned already. It is obvious that (n_T + n_R) must be greater than
(p + 1). The formula for CR gives a p-variate 90% confidence region for
the possible true differences.
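A minimal sketch of D_M^max, the scaling factor K and the critical F value follows; mean_diff and s_pooled are assumed to be obtained as in the previous sketch, n_t and n_r are the group sizes, and the function name as well as the default limit of 15% are illustrative assumptions.

cr_quantities <- function(mean_diff, s_pooled, n_t, n_r, d_g = 15, level = 0.9) {
  p <- length(mean_diff)
  s_inv <- solve(s_pooled)
  # Similarity limit D_M^max for a maximum tolerable difference d_g at all time points
  d_m_max <- sqrt(drop(t(rep(d_g, p)) %*% s_inv %*% rep(d_g, p)))
  # Scaling factor K and the 90th percentile of the F distribution
  kk <- n_t * n_r / (n_t + n_r) * (n_t + n_r - p - 1) / ((n_t + n_r - 2) * p)
  ff_crit <- qf(level, df1 = p, df2 = n_t + n_r - p - 1)
  # A candidate true difference mu lies within the confidence region if
  #   kk * t(mu - mean_diff) %*% s_inv %*% (mu - mean_diff) <= ff_crit
  list(d_m_max = d_m_max, kk = kk, ff_crit = ff_crit)
}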
References
United States Food and Drug Administration (FDA). Guidance for industry:
dissolution testing of immediate release solid oral dosage forms. 1997.
https://www.fda.gov/media/70936/download
United States Food and Drug Administration (FDA). Guidance for industry:
immediate release solid oral dosage form: scale-up and post-approval
changes, chemistry, manufacturing and controls, in vitro dissolution
testing, and in vivo bioequivalence documentation (SUPAC-IR). 1995.
https://www.fda.gov/media/70949/download
European Medicines Agency (EMA), Committee for Medicinal Products for Human Use (CHMP). Guideline on the Investigation of Bioequivalence. 2010; CPMP/EWP/QWP/1401/98 Rev. 1.
Moore, J.W., and Flanner, H.H. Mathematical comparison of curves with an emphasis on in-vitro dissolution profiles. Pharm Tech. 1996; 20(6): 64-74.
Tsong, Y., Hammerstrom, T., Sathe, P.M., and Shah, V.P. Statistical
assessment of mean differences between two dissolution data sets.
Drug Inf J. 1996; 30: 1105-1112.
doi:10.1177/009286159603000427
Connolly, M. SAS(R) IML Code to calculate an upper confidence limit for
multivariate statistical distance; 2000; Wyeth Lederle Vaccines, Pearl River,
NY.
https://analytics.ncsu.edu/sesug/2000/p-902.pdf
See Also
check_point_location
, mimcr
,
bootstrap_f2
.
Examples
# Collecting the required information
time_points <- suppressWarnings(as.numeric(gsub("([^0-9])", "",
colnames(dip1))))
tcol <- which(!is.na(time_points))
b1 <- dip1$type == "R"
# Hotelling's T2 statistics
l_hs <- get_T2_two(m1 = as.matrix(dip1[b1, tcol]),
m2 = as.matrix(dip1[!b1, tcol]),
signif = 0.05)
# Calling gep_by_nera()
res <- gep_by_nera(n_p = as.numeric(l_hs[["Parameters"]]["df1"]),
kk = as.numeric(l_hs[["Parameters"]]["K"]),
mean_diff = l_hs[["means"]][["mean.diff"]],
m_vc = l_hs[["S.pool"]],
ff_crit = as.numeric(l_hs[["Parameters"]]["F.crit"]),
y = rep(1, times = l_hs[["Parameters"]]["df1"] + 1),
max_trial = 100, tol = 1e-9)
# Expected result in res[["points"]]
# [,1]
# t.5 -15.7600077
# t.10 -13.6501734
# t.15 -11.6689469
# t.20 -9.8429369
# t.30 -6.6632182
# t.60 -0.4634318
# t.90 2.2528551
# t.120 3.3249569
# -17.6619995
# Rows t.5 to t.120 represent the points on the CR bounds. The unnamed last row
# represents the Lagrange multiplier lambda.
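# The 'converged' and 'points.on.crb' elements report whether the
# Newton-Raphson search converged and whether the points found indeed sit
# on the confidence region bounds.
res[["converged"]]
res[["points.on.crb"]]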
# If 'max_trial' is too small, the Newton-Raphson search may not converge.
## Not run:
tryCatch(
gep_by_nera(n_p = as.numeric(l_hs[["Parameters"]]["df1"]),
kk = as.numeric(l_hs[["Parameters"]]["K"]),
mean_diff = l_hs[["means"]][["mean.diff"]],
m_vc = l_hs[["S.pool"]],
ff_crit = as.numeric(l_hs[["Parameters"]]["F.crit"]),
y = rep(1, times = l_hs[["Parameters"]]["df1"] + 1),
max_trial = 5, tol = 1e-9),
warning = function(w) message(w),
finally = message("\nMaybe increasing the number of max_trial could help."))
## End(Not run)