SOM {EmbedSOM} R Documentation

Build a self-organizing map

Description

Train a self-organizing map (SOM) on a data matrix, using either online or batch training.

Usage

SOM(
  data,
  xdim = 10,
  ydim = 10,
  zdim = NULL,
  batch = FALSE,
  rlen = 10,
  alphaA = c(0.05, 0.01),
  radiusA = stats::quantile(nhbrdist, 0.67) * c(1, 0),
  alphaB = alphaA * c(-negAlpha, -0.1 * negAlpha),
  radiusB = negRadius * radiusA,
  negRadius = 1.33,
  negAlpha = 0.1,
  epochRadii = seq(radiusA[1], radiusA[2], length.out = rlen),
  init = FALSE,
  initf = Initialize_PCA,
  distf = 2,
  codes = NULL,
  importance = NULL,
  coordsFn = NULL,
  nhbr.method = "maximum",
  noMapping = FALSE,
  parallel = FALSE,
  threads = if (parallel) 0 else 1
)
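Several defaults in the signature above are computed from other arguments. The following is a minimal sketch that evaluates those default expressions by hand in base R. It assumes that nhbrdist is the matrix of pairwise distances between grid nodes built with stats::dist(); SOM() constructs nhbrdist internally, so the grid reconstruction here is an illustration, not the package's own code.

```r
# Sketch: evaluate the default expressions from the Usage block by hand.
# ASSUMPTION: nhbrdist is the matrix of pairwise distances between grid
# nodes; this reconstruction is a guess at what SOM() does internally.
xdim <- 10
ydim <- 10
rlen <- 10
negRadius <- 1.33
negAlpha <- 0.1

# 2D grid of node coordinates and their pairwise distances
# (method = "maximum" mirrors the nhbr.method default).
grid <- expand.grid(x = seq_len(xdim), y = seq_len(ydim))
nhbrdist <- as.matrix(stats::dist(grid, method = "maximum"))

# Default expressions copied from the signature.
alphaA <- c(0.05, 0.01)
radiusA <- unname(stats::quantile(nhbrdist, 0.67)) * c(1, 0)
alphaB <- alphaA * c(-negAlpha, -0.1 * negAlpha)
radiusB <- negRadius * radiusA
epochRadii <- seq(radiusA[1], radiusA[2], length.out = rlen)

print(radiusA)     # start radius, then 0
print(alphaB)      # small negative learning rates
print(epochRadii)  # rlen values shrinking from radiusA[1] to radiusA[2]
```

With these defaults the radius shrinks to 0 by the last epoch, and alphaB is negative and much smaller in magnitude than alphaA.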

Arguments

data

Matrix containing the training data

xdim

Width of the grid

ydim

Height of the grid

zdim

Depth of the grid; if set, the grid is 3D

batch

If TRUE, use batch training. The default (FALSE) selects online training, which is more like FlowSOM.

rlen

Number of training epochs; in online training, the number of passes over the training data

alphaA

Start and end learning rate (only for online training)

radiusA

Start and end radius

alphaB

Start and end learning rate for the second radius (only for online training)

radiusB

Start and end of the second radius (only for online training; make sure it is larger than radiusA)

negRadius

Convenience multiplier that sets the default radiusB to negRadius times radiusA (use a lower value for higher dimensions)

negAlpha

The same for alphaB: scales the default alphaB relative to alphaA

epochRadii

Vector of length rlen with precise epoch radii (only for batch training)

init

Initialize cluster centers in a non-random way

initf

Initialization function to use if init is TRUE (default: Initialize_PCA)

distf

Distance function (1=manhattan, 2=euclidean, 3=chebyshev, 4=cosine)

codes

Cluster centers to start with

importance

Array of numeric values. Columns of data are scaled according to importance.

coordsFn

Function to generate/transform grid coordinates (e.g. tSNECoords()). If NULL (default), the grid is the canonical SOM grid.

nhbr.method

Way of computing grid distances, passed as method= to the stats::dist() function. Defaults to "maximum" (square neighborhoods); use "euclidean" for round neighborhoods.

noMapping

If TRUE, do not compute the mapping (default FALSE). Skipping it shortens the run by roughly the cost of one training epoch.

parallel

Parallelize batch training by setting threads appropriately. Defaults to FALSE. Use batch=TRUE for the fully parallelized version; online training is not parallelizable. Passed to MapDataToCodes().

threads

Number of threads for batch training (has no effect on online training). Defaults to 0 (use all available hardware threads) if parallel==TRUE, or 1 (single thread) if parallel==FALSE. Passed to MapDataToCodes().
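The difference between the square and round neighborhoods mentioned for nhbr.method can be checked directly with stats::dist(). The sketch below uses a small 5 x 5 grid chosen only for illustration and counts the nodes within grid distance 1 of the center node under each method.

```r
# Compare grid neighborhoods under two stats::dist() methods on a 5 x 5 grid.
grid <- expand.grid(x = 1:5, y = 1:5)
center <- which(grid$x == 3 & grid$y == 3)

d_max <- as.matrix(stats::dist(grid, method = "maximum"))
d_euc <- as.matrix(stats::dist(grid, method = "euclidean"))

# Nodes within grid distance 1 of the center node (the center included).
n_square <- sum(d_max[center, ] <= 1)  # full 3 x 3 block around the center
n_round  <- sum(d_euc[center, ] <= 1)  # center plus its 4 axis neighbors

print(c(maximum = n_square, euclidean = n_round))
```

Under "maximum" the diagonal neighbors are at distance 1 and so belong to the neighborhood; under "euclidean" they are at distance sqrt(2) and fall outside it.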

Value

A map that can be used for embedding (with the EmbedSOM() function) or for further analysis, e.g. clustering.
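A minimal sketch of a training call followed by an embedding might look as follows. The data are synthetic, and the code runs the package calls only if EmbedSOM is installed. The SOM() arguments are those listed under Usage; the data= and map= arguments of EmbedSOM() are an assumption about that function's interface and should be checked against its own help page.

```r
# Minimal sketch of a training call; the package calls run only if
# EmbedSOM is installed.
set.seed(1)
data <- matrix(rnorm(1000 * 4), ncol = 4)  # synthetic: 1000 points, 4 columns

if (requireNamespace("EmbedSOM", quietly = TRUE)) {
  # Batch training on an 8 x 8 grid; arguments as listed under Usage.
  map <- EmbedSOM::SOM(data, xdim = 8, ydim = 8, batch = TRUE, rlen = 10)

  # ASSUMPTION: EmbedSOM() accepts `data` and `map` arguments and returns
  # embedding coordinates with one row per data point.
  emb <- EmbedSOM::EmbedSOM(data = data, map = map)
  print(dim(emb))
}
```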

See Also

FlowSOM::SOM


[Package EmbedSOM version 2.1.2 Index]