fast_mds {bigmds} | R Documentation |

Fast MDS uses recursive programming in combination with a divide
and conquer strategy in order to obtain an MDS configuration for a given
large data set `x`

.

```
fast_mds(x, l, s_points, r, n_cores = 1, dist_fn = stats::dist, ...)
```

`x` |
A matrix with |

`l` |
The size for which classical MDS can be computed efficiently
(using |

`s_points` |
Number of points used to align the MDS solutions obtained by
the division of |

`r` |
Number of principal coordinates to be extracted. |

`n_cores` |
Number of cores wanted to use to run the algorithm. |

`dist_fn` |
Distance function used to compute the distance between the rows. |

`...` |
Further arguments passed to |

Fast MDS randomly divides the whole sample data set, `x`

, of size `n`

into `p=`

`l/s_points`

data subsets, where `l`

`\leq \bar{l}`

being `\bar{l}`

the limit size for which classical MDS is applicable. Each one of the `p`

data subsets
has size `\tilde{n} = n/p`

. If `\tilde{n} \leq \code{l}`

then classical MDS is applied
to each data subset. Otherwise, fast MDS is recursively applied.
In either case, a final MDS configuration is obtained for each data subset.

In order to align all the partial solutions, a small subset of size `s_points`

is randomly selected from each data subset. They are joined
to form an alignment set, over which classical MDS is performed giving rise to an
alignment configuration. Every data subset shares `s_points`

points with the alignment
set. Therefore every MDS configuration can be aligned with the alignment configuration
using a Procrustes transformation.

Returns a list containing the following elements:

- points
A matrix that consists of

`n`

individuals (rows) and`r`

variables (columns) corresponding to the principal coordinates. Since we are performing a dimensionality reduction,`r`

`<<k`

- eigen
The first

`r`

largest eigenvalues:`\bar{\lambda}_i, i \in \{1, \dots, r\}`

, where`\bar{\lambda}_i = 1/p \sum_{j=1}^{p}\lambda_i^j/n_j`

, being`\lambda_i^j`

the`i-th`

eigenvalue from partition`j`

and`n_j`

the size of the partition`j`

.- GOF
A numeric vector of length 2.

The first element corresponds to

`1/n \sum_{j=1}^{p}n_jG_1^j`

, where`G_1^j = \sum_{i = 1}^{r} \lambda_{i}^{j}/ \sum_{i = 1}^{n-1} |\lambda_{i}^{j}|`

.The second element corresponds to

`1/n \sum_{j=1}^{p}n_jG_2^j`

where`G_2^j = \sum_{i = 1}^{r} \lambda_{i}^{j}/ \sum_{i = 1}^{n-1} max(\lambda_{i}^{j}, 0).`

Delicado P. and C. Pachón-García (2021). *Multidimensional Scaling for Big Data*.
https://arxiv.org/abs/2007.11919.

Yang, T., J. Liu, L. McMillan, and W.Wang (2006). *A fast approximation to multidimensional scaling*.
In Proceedings of the ECCV Workshop on Computation Intensive Methods for Computer Vision (CIMCV).

Borg, I. and P. Groenen (2005). *Modern Multidimensional Scaling: Theory and Applications*. Springer.

```
set.seed(42)
x <- matrix(data = rnorm(4*10000), nrow = 10000) %*% diag(c(9, 4, 1, 1))
mds <- fast_mds(x = x, l = 200, s_points = 2*2, r = 2, n_cores = 1, dist_fn = stats::dist)
head(mds$points)
mds$eigen
mds$GOF
points <- mds$points
plot(x[1:10, 1],
x[1:10, 2],
xlim = range(c(x[1:10,1],points[1:10,1])),
ylim = range(c(x[1:10,2], points[1:10,2])),
pch = 19,
col = "green")
text(x[1:10, 1], x[1:10, 2], labels=1:10)
points(points[1:10, 1], points[1:10, 2], pch = 19, col = "orange")
text(points[1:10, 1], points[1:10, 2], labels=1:10)
abline(v = 0, lwd=3, lty=2)
abline(h = 0, lwd=3, lty=2)
```

[Package *bigmds* version 2.0.1 Index]