buildsmdtape {scorematchingad}        R Documentation

Build CppAD Tapes for Score Matching

Description

For a parametric model family, the function buildsmdtape() generates CppAD tapes (called ADFuns) for the improper log-likelihood (without normalising constant) of the family and the score matching discrepancy function A(z) + B(z) + C(z) (defined in scorematchingtheory). Three steps are performed by buildsmdtape(): first an object that specifies the manifold and any transformation to another manifold is created; then a tape of the log-likelihood (without normalising constant) is created; finally a tape of A(z) + B(z) + C(z) is created.

Usage

buildsmdtape(
  start,
  tran = "identity",
  end = start,
  ll,
  ytape,
  usertheta,
  bdryw = "ones",
  acut = 1,
  thetatape_creator = function(n) { seq(length.out = n) },
  verbose = FALSE
)

Arguments

start

The starting manifold. Used for checking that tran and end match.

tran

The name of a transformation. Available transformations are:

  • “sqrt”

  • “alr”

  • “clr”

  • “none” or “identity”

end

The name of the manifold that tran maps start to. Available manifolds are:

  • “sph” unit sphere

  • “Hn111” hyperplane normal to the vector 1, 1, 1, 1, ...

  • “sim” simplex

  • “Euc” Euclidean space

ll

The name of an inbuilt improper log-likelihood function to tape (which also specifies the parametric model family). On Linux operating systems a custom log-likelihood function created by customll() can also be used; the ll should operate on the untransformed (i.e. starting) manifold.

ytape

An example measurement value to use for creating the tapes. It must be on the natural (i.e. start) manifold of the log-likelihood function. Please ensure that ytape is in the interior of the manifold and is non-zero.

usertheta

A vector of parameter elements for the likelihood function. NA elements will become dynamic parameters. Other elements will be fixed at the provided value. The length of usertheta must be the correct length for the log-likelihood; no checking is conducted.

bdryw

The name of the boundary weight function. "ones" for manifolds without boundary. For the simplex and positive orthant of the sphere, "prodsq" and "minsq" are possible - see ppi() for more information on these.

acut

A parameter passed to the boundary weight function bdryw. Ignored for bdryw = "ones".
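As a rough illustration of how acut enters the boundary weight, the sketch below writes the two weights in plain R. It assumes the definitions min(z_1^2, ..., z_p^2, acut^2) for "minsq" and min(prod(z_j^2), acut^2) for "prodsq", for a point z on the positive orthant of the sphere. These formulas and the helper names minsq_weight and prodsq_weight are assumptions made for illustration, not the package's C++ implementation; ppi() remains the reference.

```r
# Hypothetical helpers with assumed definitions (see ppi() for the
# authoritative ones).
minsq_weight <- function(z, acut) min(c(z^2, acut^2))
prodsq_weight <- function(z, acut) min(prod(z^2), acut^2)

z_interior <- sqrt(c(0.3, 0.3, 0.4))           # well inside the positive orthant
z_near_bdry <- sqrt(c(1e-6, 0.5, 0.5 - 1e-6))  # close to the boundary

minsq_weight(z_interior, acut = 0.01)    # capped at acut^2
minsq_weight(z_near_bdry, acut = 0.01)   # smaller: shrinks near the boundary
prodsq_weight(z_interior, acut = 0.01)   # also capped at acut^2
```

Under these assumed definitions, acut only limits how large the weight can become away from the boundary, which is consistent with it being ignored for bdryw = "ones".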

thetatape_creator

A function that accepts an integer n and returns a vector of length n. The function is used to fill in the NA elements of usertheta when building the tapes. Please ensure that the values filled in by thetatape_creator lead to plausible parameter vectors for the chosen log-likelihood.
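The interplay between usertheta and thetatape_creator can be sketched in plain R. This is only an illustration of the filling step described above, using the default thetatape_creator from Usage; the names is_dynamic and thetatape are hypothetical, and the package's internal code may differ.

```r
usertheta <- c(NA, 2, NA)   # elements 1 and 3 are estimated, element 2 is fixed at 2
thetatape_creator <- function(n) seq(length.out = n)  # the default from Usage

is_dynamic <- is.na(usertheta)   # which elements become dynamic parameters
thetatape <- usertheta
thetatape[is_dynamic] <- thetatape_creator(sum(is_dynamic))
thetatape                        # parameter vector used while taping: 1 2 2
```

If the default values 1, 2, ... are implausible for a given model (for example, outside the allowed parameter range), a different thetatape_creator can be supplied.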

verbose

If TRUE more details are printed when taping. These details are for debugging and will likely be comprehensible only to users familiar with the source code of this package.

Details

The improper log-likelihood (without normalising constant) must be implemented in C++ and is selected by name. Similarly the transforms of the manifold must be implemented in C++ and selected by name.
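To give a feel for what the named transforms do, the following plain R sketch applies the standard textbook definitions of the square-root, additive log-ratio and centred log-ratio transforms to a point on the simplex. These definitions, and in particular the choice of the last component as the alr reference, are assumptions for illustration; the package's C++ implementations are what buildsmdtape() actually uses.

```r
u <- c(0.2, 0.3, 0.5)           # a point in the interior of the simplex
p <- length(u)

z_sqrt <- sqrt(u)               # "sqrt": lands on the unit sphere
sum(z_sqrt^2)                   # squared norm is 1 (up to rounding)

z_alr <- log(u[-p] / u[p])      # "alr": p - 1 unconstrained coordinates
                                # (last component as reference is an assumption)

z_clr <- log(u) - mean(log(u))  # "clr": coordinates sum to zero, i.e. lie in
sum(z_clr)                      # the hyperplane normal to (1, 1, ..., 1)
```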

When using CppAD, one first creates tapes of functions. These tapes can then be used for evaluating the function and its derivatives, and for generating further tapes through argument swapping, differentiation and composition. The taping relies on specifying typical argument values for the functions (see Introduction to CppAD Tapes below). Tapes can have both independent variables and dynamic parameters. The differentiation with CppAD occurs with respect to the independent variables. Tapes of tapes are possible, including tapes that swap the independent variables and dynamic parameters; this is how this package differentiates with respect to dynamic parameters (see tapeSwap()).

To build a tape for the score matching discrepancy function, the package first tapes the map from a point z on the end manifold to the value of the improper log-likelihood, where the independent variable is the z, the dynamic parameter is a vector of the parameters to estimate, and the remaining model parameters are fixed and not estimated. This tape is then used to generate a tape for the score matching discrepancy function where the parameters to estimate are the independent variable.

Only some combinations of start, tran and end are available because tran must map between start and end. These combinations of start-tran-end are currently available:

Currently available improper log-likelihood functions are:

Value

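A list whose elements include (as used in the Examples):

  • lltape: an ADFun tape of the improper log-likelihood

  • smdtape: an ADFun tape of the score matching discrepancy function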

Introduction to CppAD Tapes

This package uses version 2024000.5 of the algorithmic differentiation library CppAD (Bell 2023) to build score matching estimators. Full help for CppAD can be found at https://cppad.readthedocs.io/.

Differentiation proceeds by taping the basic (atomic) operations performed on the independent variables and dynamic parameters. The atomic operations include multiplication, division, addition, sine, cosine, exponential and many more. Example values for the variables and parameters are used to conduct this taping, so care must be taken with any conditional (e.g. if-then) operations; CppAD has a special tool for this called CondExp (short for conditional expressions). The result of taping is an object of class ADFun in CppAD and is often called a tape. This ADFun object can be evaluated, differentiated, used for further taping (via CppAD's base2ad()), used for solving differential equations, and more. The differentiation is with respect to the independent variables. However, the dynamic parameters can be altered, and a new ADFun object can be created in which the dynamic parameters become independent variables (see tapeSwap()). For the purposes of score matching, there are also fixed parameters, which are the elements of the model's parameter vector that are given and not estimated.
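The caution about conditional operations can be illustrated with a toy analogy in plain R. The sketch below does not use CppAD at all: toy_tape is a hypothetical stand-in that, like taping, decides which branch to record once, at the example value. It is meant only to show why an ordinary if-then inside a taped function can give wrong results away from the example value.

```r
# Toy stand-in for taping: the branch is chosen once, at the example value.
toy_tape <- function(x_example) {
  if (x_example > 0) {
    function(x) x^2    # operations recorded when x_example > 0
  } else {
    function(x) -x     # operations recorded otherwise
  }
}

f <- function(x) if (x > 0) x^2 else -x   # the function we meant to tape
taped <- toy_tape(1)                      # "taped" at the example value 1

f(-2)       # 2
taped(-2)   # 4: the recording replays the x^2 branch regardless of x
```

In CppAD itself, CondExp records both branches together with the comparison, so the choice is made again at each evaluation rather than frozen at taping time.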

Warning: multiple CPUs

Each time a tape is evaluated, the corresponding C++ object is altered. Parallel use of the same ADFun object thus requires care and is not tested. For now I recommend creating a new ADFun object for each CPU.

Warning

There is no checking of the inputs ytape and usertheta.

References

Bell B (2023). “CppAD: A Package for Differentiation of C++ Algorithms.” https://github.com/coin-or/CppAD.

See Also

Other tape builders: moretapebuilders

Examples

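# Example 1: von Mises-Fisher model on the unit sphere, all parameters estimated
p <- 3
u <- rep(1 / sqrt(p), p)                  # example measurement on the sphere
ltheta <- p                               # length of vMF parameter vector
intheta <- rep(NA, length.out = ltheta)   # NA marks parameters to estimate
tapes <- buildsmdtape("sph", "identity", "sph", "vMF",
              ytape = u,
              usertheta = intheta,
              bdryw = "ones", verbose = FALSE
              )
# log-likelihood tape: measurement is the independent variable
evaltape(tapes$lltape, u, runif(n = ltheta))
# discrepancy tape: parameters are the independent variable
evaltape(tapes$smdtape, runif(n = ltheta), u)

# Example 2: PPI model on the simplex, taped via the square-root transform
u <- rep(1/3, 3)                          # interior point of the simplex
tapes <- buildsmdtape("sim", "sqrt", "sph", "ppi",
              ytape = u,
              usertheta = ppi_paramvec(p = 3),
              bdryw = "minsq", acut = 0.01,
              verbose = FALSE
              )
evaltape(tapes$lltape, u, rppi_egmodel(1)$theta)
evaltape(tapes$smdtape, rppi_egmodel(1)$theta, u)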

[Package scorematchingad version 0.0.67 Index]