R: Smoothness-Penalized Deconvolution

sped {spedecon}

R Documentation

Smoothness-Penalized Deconvolution

Description

sped() computes the Smoothness-Penalized Deconvolution estimate on the provided data and error distribution

Usage

sped(
  Y,
  gtwid,
  alpha,
  constraint = "constrainedQP",
  spline_dim = 30,
  hn = NULL,
  ephemera = NULL,
  ...
)

Arguments

`Y`	Numeric vector of data from the model `Y = X + Z`
`gtwid`	Object of class spedecon_gtwid describing the density of `Z` in the model `Y = X + Z`. It should almost always be created by one of the helper functions `gaussian_gtwid()`, `laplace_gtwid()`, or `uniform_gtwid()`.
`alpha`	Positive numeric penalty parameter
`constraint`	String, controls whether and how the solution is constrained to be a pdf. One of `"constrainedQP"`, `"projection"`, or `"unconstrained"` for constrained quadratic program, metric projection, or unconstrained, respectively
`spline_dim`	Numeric integer, dimension of spline space
`hn`	(optional) Object of class `histogram` holding pre-computed histogram computed from the data `Y`
`ephemera`	(optional) Object of class `spedecon_ephemera` holding pre-computed computational bits
`...`	(optional) Other arguments

Details

This function computes the "Smoothness-Penalized Deconvolution" (SPeD) estimate of a density under additive measurement error. The essential inputs to the function are the data Y, the Fourier transform gtwid of the error density, and the penalty parameter alpha; more details follow here, but for a full description of the estimator please consult Kent and Ruppert (2023).

The data model is that we observe an iid sample distributed like Y = X + Z, with Z an error independent of X. We wish to estimate the density f(x) of X. It is assumed that we know the probability density of the errors Z, call it g(z).

The estimator begins with a density estimate h_n(y) of the density of Y, and minimizes the objective function

\|g*v - h_n\|^2 + \alpha \|v^{(2)}\|^2

in v, with v ranging over a space of cubic splines with equally-spaced knots; the dimension of this space can be adjusted with the argument spline_dim. The SPeD estimate is not naturally a pdf, so it must be constrained. When constraint = "constrainedQP", the constraint is imposed directly into quadratic program minimizing the objective; when constraint = "projection", the unconstrained estimate is computed and then projected onto the space of pdfs. The preliminary density estimate h_n is computed internally as a histogram using Freedman-Diaconis choice of bin width, but a user-supplied histogram computed with hist() may be provided via the hn argument.

The computations require the Fourier transform \tilde g(t) of the probability density, and this must be supplied as an object of type spedecon_gtwid, which can be produced for common error densities using the helper functions gaussian_gtwid(), laplace_gtwid(), and uniform_gtwid().

If the estimator will be re-computed many times for many realizations of data, substantial time can be saved by pre-computing all the auxiliary matrices and vectors one time, and supplying them through the ephemera argument. This can be done whenever the repeated computations all use the same error density, same histogram bins, and same spline space, as those are what define the required matrices and vectors. A helper function compute_ephemera() is provided to pre-compute these.

Value

Object of class spedecon_spline_sped_fit

References

Kent D, Ruppert D (2023). “Smoothness-Penalized Deconvolution (SPeD) of a Density Estimate.” Journal of the American Statistical Association, to appear. ISSN 0162-1459, doi:10.1080/01621459.2023.2259028

Examples

alpha <- 1e-3
n <- 1e3; s <- 0.3
Y <- rgamma(n,5,2) + rnorm(n,0,s) # Data, contaminated with Gaussian errors
sol <- sped(Y,gtwid=gaussian_gtwid(sd=s),1e-3)
plot(sol,n=1e3) # Plot the resulting estimate
curve(dgamma(x,5,2),col=2,n=1e3,add=TRUE) # The target density f() of X
sol(c(2,3,4)) # We can evaluate sol; it is a function

[Package spedecon version 0.1 Index]