R: Iterative Score Matching Estimator Using Conjugate-Gradient...

cppad_search {scorematchingad}

R Documentation

Iterative Score Matching Estimator Using Conjugate-Gradient Descent

Description

Uses conjugate gradient descent to search for a vector of parameters such that gradient of the score matching discrepancy is within tolerance of zero. Also estimates standard errors and covariance.

Usage

cppad_search(
  smdtape,
  theta,
  Y,
  Yapproxcentres = NA * Y,
  w = rep(1, nrow(Y)),
  approxorder = 10,
  control = list(tol = 1e-15, checkgrad = TRUE)
)

Arguments

`smdtape`	A `CppAD` tape of a score matching discrepancy function that has quadratic form. Test for quadratic form using `testquadratic()`. The `smdtape`'s independent variables are assumed to be the model parameters to fit and the `smdtape`'s dynamic parameter is a (multivariate) measurement.
`theta`	The starting parameter set
`Y`	A matrix of multivariate observations. Each row is an observation.
`Yapproxcentres`	A matrix of Taylor approximation centres for rows of Y that require approximation. `NA` for rows that do not require approximation.
`w`	Weights for each observation.
`approxorder`	The order of Taylor approximation to use.
`control`	Control parameters passed to `optimx::Rcgmin()`

Details

The score matching discrepancy function and gradient of the score matching function are passed to optimx::Rcgmin(). The call to optimx::Rcgmin() uses the sum of observations (as opposed to the mean) to reduce floating point inaccuracies. This has implications for the meaning of the control parameters passed to Rcgmin() (e.g. tol). The results are converted into averages so the use of sums can be ignored when not setting control parameters, or studying the behaviour of Rcgmin.

Standard errors use the Godambe information matrix (aka sandwich method) and are only computed when the weights are constant. The estimate of the sensitivity matrix G is the negative of the average over the Hessian of smdtape evaluated at each observation in Y. The estimate of the variability matrix J is then the sample covariance (denominator of n-1) of the gradient of smdtape evaluated at each of the observations in Y for the estimated \theta. The variance of the estimator is then estimated as G^{-1}JG^{-1}/n, where n is the number of observations.

Taylor approximation is available because boundary weight functions and transformations of the measure in Hyvärinen divergence can remove singularities in the model log-likelihood, however evaluation at these singularities may still involve computing intermediate values that are unbounded. If the singularity is ultimately removed, then Taylor approximation from a nearby location will give a very accurate evaluation at the removed singularity.

Warning

There appears to be floating point issues with evaluation of the gradient leading to slow or no convergence in many situations. Tweaking the convergence tolerance tol may be appropriate. If the closed form solution exists please use cppad_closed() instead.

Examples

smdtape <- buildsmdtape("sim", "sqrt", "sph", "ppi",
              ytape = rep(1/3, 3),
              usertheta = ppi_paramvec(p = 3),
              bdryw = "minsq", acut = 0.01,
              verbose = FALSE
              )$smdtape
Y <- rppi_egmodel(100)
cppad_search(smdtape, 0.9 * Y$theta, Y$sample)
sum((smvalues_wsum(smdtape, Y$sample, Y$theta)$grad/nrow(Y$sample))^2)

[Package scorematchingad version 0.0.67 Index]