R: Sparsified (POST-) Multidimensional Scaling (SPMDS or SMDS)...

spmds {smacofx}

R Documentation

Sparsified (POST-) Multidimensional Scaling (SPMDS or SMDS) either as self-organizing or not

Description

An implementation of a sparsified version of (POST-)MDS by pseudo-majorization with ratio, interval and ordinal optimal scaling for dissimilarities and optional power transformations. This is inspired by curvilinear component analysis but works differently: It finds an initial weightmatrix where w_ij(X^0)=0 if d_ij(X^0)>tau and fits a POST-MDS with these weights. Then in each successive iteration step, the weightmat is recalculated so that w_ij(X^(n+1))=0 if d_ij(X^(n+1))>tau.

Usage

spmds(
  delta,
  lambda = 1,
  kappa = 1,
  nu = 1,
  tau,
  type = "ratio",
  ties = "primary",
  weightmat = 1 - diag(nrow(delta)),
  init = NULL,
  ndim = 2,
  acc = 1e-06,
  itmax = 10000,
  verbose = FALSE,
  principal = FALSE
)

smds(
  delta,
  tau = stats::quantile(delta, 0.9),
  type = "ratio",
  ties = "primary",
  weightmat = 1 - diag(nrow(delta)),
  init = NULL,
  ndim = 2,
  acc = 1e-06,
  itmax = 10000,
  verbose = FALSE,
  principal = FALSE
)

so_spmds(
  delta,
  kappa = 1,
  lambda = 1,
  nu = 1,
  tau = max(delta),
  epochs = 10,
  type = "ratio",
  ties = "primary",
  weightmat = 1 - diag(nrow(delta)),
  init = NULL,
  ndim = 2,
  acc = 1e-06,
  itmax = 10000,
  verbose = FALSE,
  principal = FALSE
)

so_smds(
  delta,
  tau = max(delta),
  epochs = 10,
  type = "ratio",
  ties = "primary",
  weightmat = 1 - diag(nrow(delta)),
  init = NULL,
  ndim = 2,
  acc = 1e-06,
  itmax = 10000,
  verbose = FALSE,
  principal = FALSE
)

Arguments

`delta`	dist object or a symmetric, numeric data.frame or matrix of distances
`lambda`	exponent of the power transformation of the dissimilarities; defaults to 1, which is also the setup of 'smds'
`kappa`	exponent of the power transformation of the fitted distances; defaults to 1, which is also the setup of 'smds'.
`nu`	exponent of the power of the weighting matrix; defaults to 1 which is also the setup for 'smds'.
`tau`	the boundary/neighbourhood parameter(s) (called lambda in the original paper). For 'spmds' and 'smds' it is supposed to be a numeric scalar (if a sequence is supplied the maximum is taken as tau) and all the transformed fitted distances exceeding tau are set to 0 via the weightmat (assignment can change between iterations). It defaults to the 90% quantile of delta. For 'so_spmds' tau is supposed to be either a user supplied decreasing sequence of taus or if a scalar the maximum tau from which a decreasing sequence of taus is generated automatically as 'seq(from=tau,to=tau/epochs,length.out=epochs)' and then used in sequence.
`type`	what type of MDS to fit. Currently one of "ratio", "interval" or "ordinal". Default is "ratio".
`ties`	the handling of ties for ordinal (nonmetric) MDS. Possible are "primary" (default), "secondary" or "tertiary".
`weightmat`	a matrix of finite weights.
`init`	starting configuration. If NULL (default) we fit a full rstress model.
`ndim`	dimension of the configuration; defaults to 2
`acc`	numeric accuracy of the iteration. Default is 1e-6.
`itmax`	maximum number of iterations. Default is 10000.
`verbose`	should iteration output be printed; if > 1 then yes
`principal`	If 'TRUE', principal axis transformation is applied to the final configuration
`epochs`	for 'so_spmds' and tau being scalar, it gives the number of passes through the data. The sequence of taus created is 'seq(tau,tau/epochs,length.out=epochs)'. If tau is of length >1, this argument is ignored.

Details

There is a wrapper 'smds' where the exponents are 1, which is standard SMDS but extend to allow optimal scaling. The neighborhood parameter tau is kept fixed in 'spmds' and 'smds'. The functions 'so_spmds' and 'so_smds' implement a self-organising principle, where the SMDS is repeatedly fitted for a decreasing sequence of taus.

The solution is found by "quasi-majorization", which means that the majorization is only real majorization once the weightmat no longer changes. This typically happens after a few iterations. Due to that it can be that in the beginning the stress may not decrease monotonically and that there's a chance it might never.

If tau is too small it may happen that all distances for one i to all j are zero and then there will be an error, so make sure to set a larger tau.

In the standard functions 'spmds' and 'smds' we keep tau fixed throughout. This means that if tau is large enough, then the result is the same as the corresponding MDS. In the orginal publication the idea was that of a self-organizing map which decreased tau over epochs (i.e., passes through the data). This can be achieved with our function 'so_spmds' 'so_smds' which creates a vector of decreasing tau values, calls the function 'spmds' with the first tau, then supplies the optimal configuration obtained as the init for the next call with the next tau and so on.

Value

a 'smacofP' object (inheriting from 'smacofB', see smacofSym). It is a list with the components

delta: Observed, untransformed dissimilarities
tdelta: Observed explicitly transformed dissimilarities, normalized
dhat: Explicitly transformed dissimilarities (dhats), optimally scaled and normalized
confdist: Configuration dissimilarities
conf: Matrix of fitted configuration
stress: Default stress (stress 1; sqrt of explicitly normalized stress)
spp: Stress per point
ndim: Number of dimensions
model: Name of smacof model
niter: Number of iterations
nobj: Number of objects
type: Type of MDS model
weightmat: weighting matrix as supplied
stress.m: Default stress (stress-1^2)
tweightmat: transformed weighting matrix; it is weightmat but containing all the 0s for the distances set to 0.

Examples

dis<-smacof::morse
res<-spmds(dis,type="interval",kappa=2,lambda=2,tau=0.3,itmax=100) #use higher itmax
res2<-smds(dis,type="interval",tau=0.3,itmax=500) #use higher itmax
res
res2
summary(res)
oldpar<-par(mfrow=c(1,2))
plot(res)
plot(res2)
par(oldpar)

##which d_{ij}(X)^kappa exceeded tau at convergence (i.e., have been set to 0)?
res$tweightmat
res2$tweightmat


## Self-organizing map style (as in the clca publication)
#run the som-style (p)smds 
sommod1<-so_spmds(dis,tau=1,kappa=0.5,lambda=2,epochs=10,verbose=1)
sommod2<-so_smds(dis,tau=1,epochs=10,verbose=1)
sommod1
sommod2

[Package smacofx version 1.5-3 Index]