reduced_mds {bigmds} | R Documentation |
Reduced MDS
Description
A data subset is selected and classical MDS is performed on it to obtain the corresponding low-dimensional configuration. Then the remaining points are projected onto this initial configuration.
Usage
reduced_mds(x, l, r, n_cores)
Arguments
x |
A matrix with n points (rows) and k variables (columns). |
l |
The size for which classical MDS can be computed efficiently
(using |
r |
Number of principal coordinates to be extracted. |
n_cores |
Number of cores to use to run the algorithm. |
Details
Gower's interpolation formula is the central piece of this algorithm, since it allows a new set of points to be added to an existing MDS configuration so that the new points share the same coordinate system.
Given the matrix x with n points (rows) and k variables (columns), a first data subset of size l is drawn as a random sample and used to compute an MDS configuration. The remaining part of x is randomly divided into p = (n - l)/l data subsets. For every data point, an MDS configuration is obtained by means of Gower's interpolation formula and the first MDS configuration obtained previously. Every MDS configuration is appended to the existing one so that, at the end of the process, a global MDS configuration for x is obtained.
This method is similar to landmark_mds() and interpolation_mds().
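The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the bigmds implementation: `reduced_mds_sketch` is a hypothetical helper, it interpolates all n points in a single batch instead of looping over p subsets (with n = 10000 and l = 200 that loop would cover p = (10000 - 200)/200 = 49 subsets), it assumes Euclidean distances, and it ignores `n_cores`. Gower's formula is written here as y = 1/2 (M1' M1)^{-1} M1' (q1 - d^2), where M1 is the first configuration, q1 is the diagonal of the doubly centred Gram matrix of the first subset, and d^2 holds the squared distances from a point to the first subset.

```r
# Hypothetical helper for illustration; not part of the bigmds package.
reduced_mds_sketch <- function(x, l, r) {
  n <- nrow(x)
  idx <- sample.int(n, size = l)        # random first subset of size l
  x1 <- x[idx, , drop = FALSE]

  # Classical MDS on the first subset.
  d1 <- as.matrix(dist(x1))
  fit <- cmdscale(d1, k = r, eig = TRUE)
  m1 <- fit$points                      # l x r configuration

  # Diagonal of the doubly centred Gram matrix B = -1/2 * J D^2 J.
  j <- diag(l) - 1 / l
  q1 <- diag(-0.5 * j %*% d1^2 %*% j)

  # Squared Euclidean distances from every row of x to the first subset,
  # using ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
  a <- outer(rowSums(x^2), rowSums(x1^2), "+") - 2 * tcrossprod(x, x1)

  # Gower's interpolation, one row of coordinates per point of x.
  points <- 0.5 * (matrix(q1, n, l, byrow = TRUE) - a) %*%
    m1 %*% solve(crossprod(m1))

  list(points = points, eigen = fit$eig[seq_len(r)], first = idx)
}

set.seed(42)
x <- matrix(rnorm(4 * 1000), nrow = 1000) %*% diag(c(9, 4, 1, 1))
res <- reduced_mds_sketch(x, l = 200, r = 2)
dim(res$points)
```

Applied to the points of the first subset themselves, the formula returns their classical MDS coordinates, which is why every interpolated point ends up in the same coordinate system. Splitting the remaining points into subsets, as the package description does, mainly bounds the size of the distance matrix held in memory at once.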
Value
Returns a list containing the following elements:
- points
A matrix that consists of n individuals (rows) and r variables (columns) corresponding to the principal coordinates. Since we are performing a dimensionality reduction, r << k.
- eigen
The r largest eigenvalues: \lambda_i, i \in \{1, \dots, r\}, where each \lambda_i is obtained from applying classical MDS to the first data subset.
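To make the link between the two returned elements concrete, the following sketch runs classical MDS on a random subset with base R's `cmdscale()` and inspects its r largest eigenvalues. It is an assumption that this mirrors what happens internally; in particular, whether `reduced_mds()` rescales the eigenvalues before returning them is not stated here. The variable names (`first`, `fit`, `lambda`, `gram`) are illustrative.

```r
set.seed(42)
x <- matrix(rnorm(4 * 1000), nrow = 1000) %*% diag(c(9, 4, 1, 1))
l <- 200
r <- 2

# First data subset: l rows sampled at random.
first <- x[sample.int(nrow(x), l), , drop = FALSE]

# Classical MDS, keeping the eigenvalues.
fit <- cmdscale(dist(first), k = r, eig = TRUE)
lambda <- fit$eig[seq_len(r)]   # r largest eigenvalues, in decreasing order

# The principal coordinates are orthogonal, and the squared length of
# column i equals lambda_i.
gram <- crossprod(fit$points)
lambda
gram
```

In this unscaled form, each eigenvalue measures how much of the spread of the first subset its principal coordinate captures.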
References
Delicado, P. and C. Pachón-García (2021). Multidimensional Scaling for Big Data. https://arxiv.org/abs/2007.11919.
Paradis, E. (2018). Multidimensional Scaling With Very Large Datasets. Journal of Computational and Graphical Statistics.
Borg, I. and P. Groenen (2005). Modern Multidimensional Scaling: Theory and Applications. Springer.
Gower, J. C. (1968). Adding a point to vector diagrams in multivariate analysis. Biometrika.
Examples
set.seed(42)
x <- matrix(data = rnorm(4 * 10000), nrow = 10000) %*% diag(c(9, 4, 1, 1))
mds <- reduced_mds(x = x, l = 200, r = 2, n_cores = 1)
head(mds$points)
mds$eigen