som.nn.multitrain {som.nn}    R Documentation

Multi-step hexagonal som training

Description

A self-organising map with hexagonal topology is trained in several steps, and a model of type SOMnn is created for the prediction of unknown samples. In contrast to a "normal" som, class labels for all samples of the training set are required to build the topological model after SOM training.

Usage

som.nn.multitrain(
  x,
  class.col = 1,
  kernel = "internal",
  xdim = 7,
  ydim = 5,
  toroidal = FALSE,
  len = c(0),
  alpha = c(0.2),
  radius = c(0),
  focus = 1,
  norm = TRUE,
  dist.fun = dist.fun.inverse,
  max.dist = 1.1,
  name = "som.nn job"
)

Arguments

x

data.frame with training data. Samples are expected as rows and are drawn randomly for the training steps. One column must contain the class labels; it is selected by the argument class.col. All other columns are considered to be attributes and form the training vector.

class.col

single string or number. If class.col is a string, it is taken as the name of the column with class labels. If class.col is a number, the respective column is used as class labels (after being coerced to character). Default is 1.

kernel

kernel for som training. One of the predefined kernels: "bubble" (train with the R-implementation), "gaussian" (train with the R-implementation of the Gaussian kernel), "SOM" (train with class::SOM), "kohonen" (train with kohonen::som) or "som" (train with som::som). If a function is specified (as a closure, not as a character string), that custom function is used for training.

xdim

dimension in x-direction.

ydim

dimension in y-direction.

toroidal

logical; if TRUE, an endless som is trained as on the surface of a torus. Default: FALSE.

len

vector with the number of steps to be trained in each round (steps, not epochs!). The length of len defines the number of training rounds to be performed.

alpha

initial training rate; the learning rate is decreased linearly to 0.0 for the last training step. Default: 0.2. If length(alpha) > 1, the length must be the same as for len, and the values define a separate alpha for each training round.

radius

initial radius for SOM training. If a Gaussian distance function is used, radius corresponds to sigma. The radius is decreased linearly to 1.0 for the last training step. If radius = 0 (default), the diameter of the SOM is used as initial radius. If length(radius) > 1, the length must be the same as for len, and the values define a separate radius for each training round.
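The interplay of len, alpha and radius can be sketched in base R. The snippet below is only an illustration of the description above, not the package's internal code: the helper round.schedule and the example values are hypothetical, and the exact interpolation used by som.nn is an assumption.

```r
# Per-round settings, one entry per training round (illustrative values).
len    <- c(500, 500, 500)
alpha  <- c(0.2, 0.02, 0.02)
radius <- c(8, 5, 2)

# Hypothetical helper: assumed linear decay within one round,
# alpha falls to 0 and radius falls to 1 at the last step.
round.schedule <- function(len, alpha, radius) {
  data.frame(step   = seq_len(len),
             alpha  = seq(alpha, 0, length.out = len),
             radius = seq(radius, 1, length.out = len))
}

# One schedule per round; the three vectors must have the same length.
schedules <- Map(round.schedule, len, alpha, radius)
print(head(schedules[[1]], 3))
print(tail(schedules[[3]], 3))
```

Each round restarts from its own initial alpha and radius, which is why later rounds are typically given smaller values.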

focus

enhancement factor for focusing the training on "dirty" samples.

norm

logical; if TRUE, input data is normalised by scale(x, TRUE, TRUE).

dist.fun

parameter for k-NN prediction: Function used to calculate distance-dependent weights. Any distance function must accept the two parameters x (distance) and sigma (maximum distance to give a weight > 0.0). Default is dist.fun.inverse.

max.dist

parameter for k-NN prediction: Parameter sigma for dist.fun. Default is 1.1. To avoid rounding issues, it is recommended not to use exact integers as limit, but values like 1.1, to make sure that all neurons within distance 1 are included.
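The contract for dist.fun (two parameters, x and sigma, with no weight beyond sigma) can be sketched as follows. The function linear.weight is a hypothetical example written for this illustration, not a function of the package, and it assumes that distances are passed as a numeric vector. The last lines illustrate the rounding argument for max.dist using the geometry of a hexagonal grid with unit spacing.

```r
# Hypothetical weight function (not part of som.nn): the weight falls
# linearly from 1 at distance 0 to 0 at distance sigma and is 0 beyond.
linear.weight <- function(x, sigma = 1.1) {
  pmax(0, 1 - x / sigma)
}

d <- c(0, 0.55, 1, 1.1, 2)
w <- linear.weight(d, sigma = 1.1)
print(round(w, 3))

# On a hexagonal grid with unit spacing, a direct neighbour sits at the
# offset (0.5, sqrt(3)/2). Its computed distance may differ from 1 by a
# rounding error, so a limit of 1.1 is safer than exactly 1.
d.neighbour <- sqrt(0.5^2 + (sqrt(3) / 2)^2)
print(linear.weight(d.neighbour, sigma = 1.1) > 0)
```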

name

optional name for the model. Name will be stored as slot model@name in the trained model.

Details

Besides the predefined kernels "bubble", "gaussian", "SOM", "kohonen" or "som", any custom kernel function can be used for som training. The function must match the signature kernel(data, grid, rlen, alpha, radius, init, toroidal), with arguments:

The returned value must be a list with at minimum one element

If focus > 1, enhancement of dirty samples is activated: training samples that are mapped to a neuron with more than one class are preferred in the next training step.
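The idea of "dirty" samples can be sketched in base R. How som.nn actually applies focus is not documented here, so the weighting below is an assumption made for illustration; the mapping winner, the labels cls and the resulting probabilities are made-up values.

```r
# winner: index of the neuron each training sample maps to (made-up data)
winner <- c(1, 1, 2, 2, 3, 3)
cls    <- c("a", "a", "a", "b", "b", "b")
focus  <- 3

# Number of distinct classes per neuron.
n.classes <- tapply(cls, winner, function(k) length(unique(k)))

# A sample is "dirty" if its neuron holds more than one class.
dirty <- as.vector(n.classes[as.character(winner)] > 1)

# Assumed weighting: dirty samples are focus times as likely to be drawn.
w    <- ifelse(dirty, focus, 1)
prob <- w / sum(w)
print(data.frame(winner, cls, dirty, prob))

# Draw the samples for the next training step with these probabilities.
next.samples <- sample(seq_along(cls), size = 10, replace = TRUE, prob = prob)
```

With focus = 1 all weights are equal, which corresponds to plain random sampling.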

Value

S4 object of type SOMnn with the trained model.

Examples

## get example data and add class labels:
data(iris)
species <- iris$Species

## train with default radius = diagonal / 2:
rlen <- 500
som <- som.nn.train(iris, class.col = "Species", kernel = "internal",
                    xdim = 15, ydim = 9, alpha = 0.2, len = rlen, 
                    norm = TRUE, toroidal = FALSE)


## continue training with different alpha and radius;
som <- som.nn.continue(som, iris, alpha = 0.02, len=500, radius = 5)
som <- som.nn.continue(som, iris, alpha = 0.02, len=500, radius = 2)

## predict some samples:
unk <- iris[,!(names(iris) %in% "Species")]

setosa <- unk[species=="setosa",]
setosa <- setosa[sample(nrow(setosa), 20),]

versicolor <- unk[species=="versicolor",]
versicolor <- versicolor[sample(nrow(versicolor), 20),]

virginica <- unk[species=="virginica",]
virginica <- virginica[sample(nrow(virginica), 20),]

p <- predict(som, unk)
head(p)

## plot:
plot(som)
dev.off()
plot(som, predict = predict(som, setosa))
plot(som, predict = predict(som, versicolor), add = TRUE, pch.col = "magenta", pch = 17)
plot(som, predict = predict(som, virginica), add = TRUE, pch.col = "white", pch = 8)
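
The same three-round schedule can presumably be written as a single call to som.nn.multitrain, using only the arguments listed under Usage. This is a sketch rather than a verified equivalent of the train/continue sequence above: that both routes give comparable models is an assumption, and the call is guarded so that it only runs when the som.nn package is installed.

```r
## one entry per training round; all three vectors have the same length
len.rounds    <- c(500, 500, 500)
alpha.rounds  <- c(0.2, 0.02, 0.02)
radius.rounds <- c(0, 5, 2)    # 0 requests the default initial radius

if (requireNamespace("som.nn", quietly = TRUE)) {
  library(som.nn)
  som.multi <- som.nn.multitrain(iris, class.col = "Species",
                                 kernel = "internal",
                                 xdim = 15, ydim = 9,
                                 len = len.rounds,
                                 alpha = alpha.rounds,
                                 radius = radius.rounds,
                                 norm = TRUE, toroidal = FALSE)

  ## predict all samples without the class column
  p.multi <- predict(som.multi, iris[, names(iris) != "Species"])
  print(head(p.multi))
}
```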


[Package som.nn version 1.4.4 Index]