smoothSplines {robCompositions} | R Documentation |
Estimate density from histogram
Description
Given raw (discretized) distributional observations, smoothSplines computes the density function that 'best' fits the data, as a trade-off between smoothness and least-squares fidelity, using B-spline basis functions.
Usage
smoothSplines(
k,
l,
alpha,
data,
xcp,
knots,
weights = matrix(1, dim(data)[1], dim(data)[2]),
num_points = 100,
prior = "default",
cores = 1,
fast = 0
)
Arguments
k |
smoothing splines degree |
l |
order of derivative in the penalization term |
alpha |
weight for penalization |
data |
an object of class "matrix" containing data to be smoothed, row by row |
xcp |
vector of control points |
knots |
either a vector of knots for the splines or an integer giving the number of equispaced knots |
weights |
matrix of weights. If not given, all data points will be weighted the same. |
num_points |
number of grid points at which the estimated density is evaluated |
prior |
prior used for zero-replacements. This must be one of "perks", "jeffreys", "bayes_laplace", "sq" or "default" |
cores |
number of cores for parallel execution, if the option was enabled before installing the package |
fast |
1 if maximal performance is required (print statements suppressed), 0 otherwise |
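The following is a minimal sketch, in base R only, of how the data, xcp and weights arguments described above might be assembled for several histograms at once. It assumes that all histograms share the same breaks, so that a single vector of bin midpoints serves as xcp for every row; the names breaks, dens_by_species and w are hypothetical and chosen only for illustration.

```r
# Assumption: common, equispaced breaks for all groups (hypothetical choice).
breaks <- seq(4, 8, by = 0.5)
xcp <- head(breaks, -1) + diff(breaks) / 2   # bin midpoints as control points

# One row of histogram densities per species: a 3 x 8 matrix.
dens_by_species <- t(sapply(
  split(iris$Sepal.Length, iris$Species),
  function(x) hist(x, breaks = breaks, plot = FALSE)$density
))

# Equal weights, mirroring the default shown in Usage.
w <- matrix(1, nrow(dens_by_species), ncol(dens_by_species))
```

With common breaks, some bins are empty for some species, so the matrix contains zeros; these are the entries that the prior argument is meant to handle.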
Details
The original discretized densities are not smoothed directly; instead, the centred logratio (clr) transformation is applied first, to deal with the unit-integral constraint of density functions.
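As an illustration of the transformation step, here is a small sketch of the discrete clr transformation on a made-up vector of strictly positive bin densities. It uses base R only; the values and the names width, dens and clr_dens are assumptions for the example and are not taken from the package.

```r
# Made-up discretized density on 5 bins of width 0.5 (integrates to 1).
width <- 0.5
dens <- c(0.1, 0.4, 0.8, 0.5, 0.2)

# Discrete clr: log-density minus the mean of the log-densities.
clr_dens <- log(dens) - mean(log(dens))

# The unit-integral constraint becomes a zero-sum constraint in clr space.
sum(clr_dens)
```

The logarithm requires strictly positive values, which is why zero counts have to be replaced before this step.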
Then the constrained variational problem is set. This minimization problem for the optimal density is a compromise between staying close to the given data, at the corresponding xcp, and obtaining a smooth function. The non-smoothness measure takes into account the l-th derivative, while the fidelity term is weighted by alpha. The solution is a natural spline. The vector of its coefficients is obtained as the minimum-norm solution of a linear system.
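Read literally, the description above suggests a functional of roughly the following form. This is a sketch assembled from the wording of this section, not a formula copied from the package source: the interval [a, b], the number n of control points x_i (the xcp values), the clr-transformed observations y_i and the weights w_i are notational assumptions.

```latex
\min_{s_k}\; J_l(s_k) \;=\; \int_a^b \bigl[ s_k^{(l)}(x) \bigr]^2 \, dx
  \;+\; \alpha \sum_{i=1}^{n} w_i \bigl[ y_i - s_k(x_i) \bigr]^2
\quad \text{subject to} \quad \int_a^b s_k(x)\, dx = 0 .
```

Under this reading, a larger alpha pulls the spline s_k of degree k closer to the data, while a smaller alpha favours a smoother curve; the zero-integral constraint is the clr-space counterpart of the unit-integral constraint on densities.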
The resulting splines can be either back-transformed to the original Bayes space of density functions (in order to provide their smoothed counterparts for visualization and interpretation purposes), or retained for further statistical analysis in the clr space.
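The back-transformation can be sketched as follows: exponentiate the clr-space curve and rescale it so that it integrates to one. This is a minimal illustration in base R with a made-up clr curve and a trapezoidal rule for the integral; it is not necessarily how the package computes Y, and grid, clr_curve, trapz and dens_curve are hypothetical names.

```r
# Hypothetical clr-space curve evaluated on an equispaced grid.
grid <- seq(0, 1, length.out = 101)
clr_curve <- cos(2 * pi * grid)

# Trapezoidal rule for the integral of y over x.
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Inverse clr: exponentiate, then normalize to unit integral.
dens_curve <- exp(clr_curve)
dens_curve <- dens_curve / trapz(grid, dens_curve)
```

The result is strictly positive and integrates to one on the grid, so it can be drawn directly over a histogram plotted on the density scale.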
Value
An object of class smoothSpl, containing, among others, the following components:
bspline |
each row is the vector of B-spline coefficients |
Y |
the values of the smoothed curve on the evaluation grid |
Y_clr |
the values of the smoothed curve in the clr space, on the evaluation grid |
Author(s)
Alessia Di Blasi, Federico Pavone, Gianluca Zeni, Matthias Templ
References
J. Machalova, K. Hron & G.S. Monti (2016): Preprocessing of centred logratio transformed density functions using smoothing splines. Journal of Applied Statistics, 43:8, 1419-1435.
Examples
# Sepal lengths of the first species (setosa)
SepalLengthCm <- iris$Sepal.Length
Species <- iris$Species
iris1 <- SepalLengthCm[Species == levels(Species)[1]]
# Histogram: bin midpoints serve as control points, bin densities as data
h1 <- hist(iris1, nclass = 12, plot = FALSE)
midx1 <- h1$mids
midy1 <- matrix(h1$density, nrow = 1, ncol = length(h1$density), byrow = TRUE)
knots <- 7
## Not run:
sol1 <- smoothSplines(k = 3, l = 2, alpha = 1000, data = midy1, xcp = midx1, knots = knots)
plot(sol1)
h1 <- hist(iris1, freq = FALSE, nclass = 12, xlab = "Sepal Length [cm]", main = "Iris setosa")
# black line: kernel method; red line: smoothSplines result
lines(density(iris1), col = "black", lwd = 1.5)
xx1 <- seq(sol1$Xcp[1], tail(sol1$Xcp, n = 1), length.out = sol1$NumPoints)
lines(xx1, sol1$Y[1, ], col = "red", lwd = 2)
## End(Not run)