nroPermute {Numero}R Documentation

Permutation analysis of map layout

Description

Estimate the dynamic range and statistical significance for regional patterns on a self-organizing maps using permutations.

Usage

nroPermute(map, districts, data, n = 1000, message = NULL,
           zbase = NULL, seed = 0.0)

Arguments

map

A list object in the format from nroTrain().

districts

An integer vector of M best matching districts.

data

A numeric vector of M values or an M x N matrix (or data frame), where M is the number of data points and N is the number of variables.

n

Maximum number of permutations per variable.

message

If positive, progress information is printed at the specified interval in seconds.

zbase

Reference Z-score for determining color amplitudes.

seed

Seed value for random number generator.

Details

The input argument map must contain the map topology and the centroid profiles as returned by the functions nroKmeans(), nroKohonen(), or nroTrain().

The input argument districts must contain integers between 1 and K, where K is the number map units. Any other values will be ignored.

Training variables and data points are detected by the column names of map$centroids, the attribute "variables" in districts and the names of elements in districts.

Value

A data frame with eight columns: P.z is a parametric estimate for statistical significance, P.freq is the frequency-based estimate for statistical signicance, and Z is the estimated z-score of how far the observed map plane is from the average randomly generated layout. N.data indicates how many data values were used and N.cycles tells the number of completed permutations. AMPLITUDE is a dynamic range modifier for colors that can be used in nroColorize().

The output also contains the attribute 'zbase' that indicates the normalization factor for the color amplitudes.

Examples

# Import data.
fname <- system.file("extdata", "finndiane.txt", package = "Numero")
dataset <- read.delim(file = fname)

# Set row names.
rownames(dataset) <- paste("r", 1:nrow(dataset), sep="")

# Prepare training data.
trvars <- c("CHOL", "HDL2C", "TG", "CREAT", "uALB")
trdata <- scale.default(dataset[,trvars])

# K-means clustering.
km <- nroKmeans(data = trdata)

# Self-organizing map.
sm <- nroKohonen(seeds = km)
sm <- nroTrain(map = sm, data = trdata)

# Assign data points into districts.
matches <- nroMatch(centroids = sm, data = trdata)

# Estimate statistics for cholesterol
chol <- nroPermute(map = sm, districts = matches, data = dataset$CHOL)
print(chol[,c("TRAINING", "Z", "P.z", "P.freq")])

# Estimate statistics.
stats <- nroPermute(map = sm, districts = matches, data = dataset)
print(stats[,c("TRAINING", "Z", "P.z", "P.freq")])

[Package Numero version 1.9.7 Index]