fmdsd {dad} | R Documentation |
Multidimensional scaling of probability densities
Description
Applies the multidimensional scaling (MDS) method to probability densities in order to describe a data folder consisting of groups of individuals on which the same variables are observed. It returns an object of class fmdsd. It applies cmdscale to the distance matrix between the densities.
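As a rough illustration of the idea (not the package's own code), the sketch below estimates one univariate Gaussian density per group, builds a distance matrix between these densities, and passes it to cmdscale. The iris data set, the single variable, and the choice of the 2-Wasserstein distance are assumptions made only for this example.

```r
# Toy illustration with base R only; data set, variable and distance are
# assumptions chosen for the example, not taken from the dad package.
groups <- split(iris$Sepal.Length, iris$Species)

# Parametric (Gaussian) estimation: one mean and one standard deviation per group
m <- sapply(groups, mean)
s <- sapply(groups, sd)

# 2-Wasserstein distance between univariate Gaussians N(m_i, s_i^2), N(m_j, s_j^2):
# sqrt((m_i - m_j)^2 + (s_i - s_j)^2)
wass <- sqrt(outer(m, m, "-")^2 + outer(s, s, "-")^2)

# Classical MDS on the distance matrix between the densities
mds <- cmdscale(as.dist(wass), k = 2, eig = TRUE)
mds$points  # principal coordinates, one row per group
mds$eig     # eigenvalues
```

Because this particular distance is Euclidean in the (mean, standard deviation) plane, two coordinates reproduce the distances between the three groups; other distances generally do not have this property.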
Usage
fmdsd(xf, group.name = "group", gaussiand = TRUE,
      distance = c("jeffreys", "hellinger", "wasserstein", "l2", "l2norm"),
      windowh = NULL, data.centered = FALSE, data.scaled = FALSE,
      common.variance = FALSE, add = TRUE, nb.factors = 3, nb.values = 10,
      sub.title = "", plot.eigen = TRUE, plot.score = FALSE, nscore = 1:3,
      filename = NULL)
Arguments
xf |
object of class
|
group.name |
string.
|
gaussiand |
logical. If |
distance |
The distance or divergence used to compute the distance matrix between the densities: one of "jeffreys" (default), "hellinger", "wasserstein", "l2" or "l2norm". See Details.
|
windowh |
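either a list of bandwidths or a numerical value; NULL by default (see Details). Omitted when the densities are parametrically estimated |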
data.centered |
logical. If |
data.scaled |
logical. If |
common.variance |
logical. If |
add |
logical indicating if an additive constant should be computed and added to the non-diagonal dissimilarities such that the modified dissimilarities are Euclidean (default TRUE) |
nb.factors |
numeric. Number of returned principal coordinates (default 3). Warning: The |
nb.values |
numeric. Number of returned eigenvalues (default 10) |
sub.title |
string. Subtitle for the graphs (default "") |
plot.eigen |
logical. If |
plot.score |
logical. If |
nscore |
numeric vector. If |
filename |
string. Name of the file in which the results are saved (default NULL) |
Details
In order to compute the distances/dissimilarities between the groups, the probability densities corresponding to the groups of individuals are either parametrically estimated (gaussiand = TRUE) or estimated using the Gaussian kernel method (gaussiand = FALSE). In the latter case, the windowh argument provides the list of the bandwidths to be used. Notice that in the multivariate case (p > 1, with p the number of variables), the bandwidths are positive-definite matrices.
If windowh is a numerical value h, the matrix bandwidth is of the form h S, where S is either the square root of the covariance matrix (p > 1) or the standard deviation of the estimated density.
If windowh = NULL (default), h in the above formula is computed using the bandwidth.parameter function.
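To make the construction of a matrix bandwidth concrete, here is a minimal sketch that scales the square root of a group's covariance matrix by a numerical value. The value h = 0.5, the iris subset, and the use of the symmetric square root (via an eigendecomposition) are assumptions for illustration; the text above does not say which matrix square root the package uses.

```r
# Hypothetical illustration in base R; not the dad implementation.
x <- as.matrix(iris[iris$Species == "setosa", c("Sepal.Length", "Sepal.Width")])

# Covariance matrix of the group
V <- cov(x)

# Symmetric square root of V through its eigendecomposition (an assumption:
# a Cholesky factor would be another possible "square root")
e <- eigen(V, symmetric = TRUE)
S <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)

# Numerical bandwidth value, chosen arbitrarily for the example
h <- 0.5

# Matrix bandwidth: the scalar times the square root of the covariance matrix
H <- h * S
H
```

The resulting matrix is symmetric and positive-definite whenever the covariance matrix is, which matches the requirement stated for the multivariate case.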
The distance or dissimilarity between the estimated densities is either the L^2 distance, the Hellinger distance, Jeffreys measure (symmetrised Kullback-Leibler divergence) or the Wasserstein distance.
If it is the L^2 distance (distance = "l2" or distance = "l2norm"), the densities can be either parametrically estimated or estimated using the Gaussian kernel.
If it is the Hellinger distance (distance = "hellinger"), Jeffreys measure (distance = "jeffreys") or the Wasserstein distance (distance = "wasserstein"), the densities are considered Gaussian and necessarily parametrically estimated.
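For Gaussian densities these measures have closed forms, which is why parametric estimation is enough in those cases. The sketch below writes two of them for univariate Gaussians. The function names jeffreys_gauss and hellinger_gauss are hypothetical, and the normalisation conventions (sum of the two Kullback-Leibler divergences; Hellinger distance bounded by 1) are assumptions that may differ from those used inside the package.

```r
# Hypothetical helpers in base R; conventions may differ from the dad package.

# Jeffreys measure: KL(p1 || p2) + KL(p2 || p1) for N(m1, s1^2) and N(m2, s2^2).
# The log terms of the two divergences cancel in the sum.
jeffreys_gauss <- function(m1, s1, m2, s2) {
  d2 <- (m1 - m2)^2
  (s1^2 + d2) / (2 * s2^2) + (s2^2 + d2) / (2 * s1^2) - 1
}

# Hellinger distance: sqrt(1 - BC), where BC is the Bhattacharyya coefficient
hellinger_gauss <- function(m1, s1, m2, s2) {
  bc <- sqrt(2 * s1 * s2 / (s1^2 + s2^2)) *
    exp(-(m1 - m2)^2 / (4 * (s1^2 + s2^2)))
  sqrt(1 - bc)
}

# Two Gaussians with unit variance whose means differ by 1
jeffreys_gauss(0, 1, 1, 1)
hellinger_gauss(0, 1, 1, 1)
```

Both functions are symmetric in their two densities and return 0 when the densities coincide, so they can be used to fill a dissimilarity matrix of the kind passed to cmdscale.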
Value
Returns an object of class fmdsd, i.e. a list including:
inertia |
data frame of the eigenvalues and percentages of inertia. |
scores |
data frame of the nb.factors first principal coordinates. |
means |
list of the means. |
variances |
list of the covariance matrices. |
correlations |
list of the correlation matrices. |
skewness |
list of the skewness coefficients. |
kurtosis |
list of the kurtosis coefficients. |
Author(s)
Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Gilles Hunault, Sabine Demotes-Mainard
References
Boumaza, R., Yousfi, S., Demotes-Mainard, S. (2015). Interpreting the principal component analysis of multivariate density functions. Communications in Statistics - Theory and Methods, 44 (16), 3321-3339.
Delicado, P. (2011). Dimensionality reduction when data are density functions. Computational Statistics & Data Analysis, 55, 401-420.
Yousfi, S., Boumaza, R., Aissani, D., Adjabi, S. (2014). Optimal bandwidth matrices in functional principal component analysis of density function. Journal of Statistical Computation and Simulation, 85 (11), 2315-2330.
Cox, T.F., Cox, M.A.A. (2001). Multidimensional Scaling, second ed. Chapman & Hall/CRC.
See Also
fpcad, print.fmdsd, plot.fmdsd, interpret.fmdsd, bandwidth.parameter
Examples
library(dad)
data(roses)
rosesf <- as.folder(roses[, c("Sha", "Den", "Sym", "rose")])
# MDS on Gaussian densities (on sensory data)
# using jeffreys measure (default):
resultjeff <- fmdsd(rosesf, distance = "jeffreys")
print(resultjeff)
plot(resultjeff)
## Not run:
# Applied to a data frame:
resultjeffdf <- fmdsd(roses[, c("Sha", "Den", "Sym", "rose")],
                      distance = "jeffreys", group.name = "rose")
print(resultjeffdf)
plot(resultjeffdf)
## End(Not run)
# using the Hellinger distance:
resulthellin <- fmdsd(rosesf, distance = "hellinger")
print(resulthellin)
plot(resulthellin)
# using the Wasserstein distance:
resultwass <- fmdsd(rosesf, distance = "wasserstein")
print(resultwass)
plot(resultwass)
# Gaussian case, using the L2-distance:
resultl2 <- fmdsd(rosesf, distance = "l2")
print(resultl2)
plot(resultl2)
# Gaussian case, using the L2-distance between normed densities:
resultl2norm <- fmdsd(rosesf, distance = "l2norm")
print(resultl2norm)
plot(resultl2norm)
## Not run:
# Non Gaussian case, using the L2-distance,
# the densities are estimated using the Gaussian kernel method:
result <- fmdsd(rosesf, distance = "l2", gaussiand = FALSE, group.name = "rose")
print(result)
plot(result)
## End(Not run)