ppi {scorematchingad} | R Documentation |
Estimation of Polynomially-Tilted Pairwise Interaction (PPI) Model
Description
Estimates the parameters of the Polynomially-Tilted Pairwise Interaction (PPI) model (Scealy and Wood 2023) for compositional data.
By default ppi()
uses cppad_closed()
to find estimate.
For many situations a hard-coded implementation of the score matching estimator is also available.
For a given parameter vector evalparam
, ppi_smvalues()
computes the score matching discrepancy, the gradient and the Hessian of the score matching discrepancy (see smvalues()
) and the gradient offset of the score matching discrepancy (see quadratictape_parts()
and tapeGradOffset()
).
Usage
ppi(
Y,
paramvec = NULL,
trans,
method = "closed",
w = rep(1, nrow(Y)),
constrainbeta = FALSE,
bdryw = "ones",
acut = NULL,
bdrythreshold = 1e-10,
shiftsize = bdrythreshold,
approxorder = 10,
control = list(tol = 1e-15, checkgrad = TRUE),
paramvec_start = NULL
)
ppi_smvalues(
Y,
paramvec = NULL,
evalparam,
trans,
method = "closed",
w = rep(1, nrow(Y)),
bdryw = "ones",
acut = NULL,
bdrythreshold = 1e-10,
shiftsize = bdrythreshold,
approxorder = 10,
average = TRUE
)
Arguments
Y |
A matrix of measurements. Each row is a compositional measurement (i.e. each row sums to 1 and has non-negative elements). |
paramvec |
Optionally a vector of the PPI models parameters. |
trans |
The name of the transformation of the manifold in Hyvärinen divergence (See |
method |
|
w |
Weights for each observation, if different observations have different importance. Used by |
constrainbeta |
If |
bdryw |
The boundary weight function for down weighting measurements as they approach the manifold boundary. Either "ones", "minsq" or "prodsq". See details. |
acut |
The threshold |
bdrythreshold |
|
shiftsize |
|
approxorder |
|
control |
|
paramvec_start |
|
evalparam |
The parameter set to evaluate the score matching values.
This is different to |
average |
If TRUE return the (weighted average) of the measurements, otherwise return the values for each measurement. |
Details
Estimation may be performed via transformation of the measure in Hyvärinen divergence from Euclidean space to the simplex (inverse of the additive log ratio transform), from a hyperplane to the simplex (inverse of the centred log ratio transform), from the positive quadrant of the sphere to the simplex (inverse of the square root transform), or without any transformation. In the latter two situations there is a boundary and weighted Hyvärinen divergence (Equation 7, Scealy and Wood 2023) is used. Properties of the estimator using the square root transform were studied by Scealy and Wood (2023). Properties of the estimator using the additive log ratio transform were studied by Scealy et al. (2024).
There are three boundary weight functions available:
The function "ones" applies no weights and should be used whenever the manifold does not have a boundary.
The function "minsq" is the minima-based boundary weight function for the PPI model (Equation 12, Scealy and Wood 2023)
\tilde{h}(z)^2 = \min(z_1^2, z_2^2, ..., z_p^2, a_c^2).
where
z
is a point in the positive orthant of the p-dimensional unit sphere andz_j
is the jth component of z.The function "prodsq" is the product-based (Equation 9, Scealy and Wood 2023)
\tilde{h}(z)^2 = \min(\prod_{j=1}^{p} z_j^2, a_c^2).
where
z
is a point in the positive orthant of the p-dimensional unit sphere andz_j
is the jth component of z.
Scealy and Wood (Theorem 1, Scealy and Wood 2023) prove that minimising the weighted Hyvärinen Divergence is equivalent to minimising \psi(f, f_0)
(See scorematchingtheory
)
when the boundary weight function is smooth or for the functions "minsq" and "prodsq" above when the manifold is the simplex or positive orthant of a sphere.
Hard-coded estimators are available for the following situations:
Square root transformation ("sqrt") with the "minsq" boundary weight function:
full parameter vector (
paramvec
not provided)-
paramvec
fixes only the final element of\beta
-
paramvec
fixes all elements of\beta
-
paramvec
fixesb_L = 0
and provides fixed values of\beta
-
paramvec
fixesA_L=0
andb_L=0
, leaving\beta
to be fitted.
Square root transformation ("sqrt") with the "prodsq" boundary weight function:
-
paramvec
fixes all elements of\beta
-
paramvec
fixesb_L = 0
and provides fixed values of\beta
-
paramvec
fixesA_L=0
andb_L=0
, leaving\beta
to be fitted.
-
The additive log ratio transformation ("alr") using the final component on the denominator, with
b_L=0
and fixed final component of\beta
.
Value
ppi()
returns:
A list of est
, SE
and info
.
-
est
contains the estimates in vector form,paramvec
, and asA_L
,b_L
and\beta
. -
SE
contains estimates of the standard errors if computed. Seecppad_closed()
. -
info
contains a variety of information about the model fitting procedure and results.
ppi_smvalues()
returns a list of
-
obj
the score matching discrepancy value -
grad
the gradient of the score matching discrepancy -
hess
the Hessian of the score matching discrepancy -
offset
gradient offset (seequadratictape_parts()
)
PPI Model
The PPI model density is proportional to
\exp(z_L^TA_Lz_L + b_L^Tz_L)\prod_{i=1}^p z_i^{\beta_i},
where p
is the dimension of a compositional measurement z
, and z_L
is z
without the final (p
th) component.
A_L
is a p-1 \times p-1
symmetric matrix that controls the covariance between components.
b_L
is a p-1
vector that controls the location within the simplex.
The i
th component \beta_i
of \beta
controls the concentration of density when z_i
is close to zero: when \beta_i \geq 0
there is no concentration and \beta_i
is hard to identify; when \beta_i < 0
then the probability density of the PPI model increases unboundedly as z_i
approaches zero, with the increasing occuring more rapidly and sharply the closer \beta_i
is to -1
.
References
Scealy JL, Hingee KL, Kent JT, Wood ATA (2024).
“Robust score matching for compositional data.”
Statistics and Computing, 34, 93.
doi:10.1007/s11222-024-10412-w.
Scealy JL, Wood ATA (2023).
“Score matching for compositional distributions.”
Journal of the American Statistical Association, 118(543), 1811–1823.
doi:10.1080/01621459.2021.2016422.
See Also
Other PPI model tools:
dppi()
,
ppi_param_tools
,
ppi_robust()
,
rppi()
Examples
model <- rppi_egmodel(100)
estalr <- ppi(model$sample,
paramvec = ppi_paramvec(betap = -0.5, p = ncol(model$sample)),
trans = "alr")
estsqrt <- ppi(model$sample,
trans = "sqrt",
bdryw = "minsq", acut = 0.1)