R: Summarize subgroup statistics

numero.summary {Numero}

R Documentation

Summarize subgroup statistics

Description

Estimates subgroup statistics after a self-organizing map analysis.

Usage

numero.summary(results, topology, data = NULL, capacity = 10)

Arguments

`results`	A list object that contains the self-organizing map and its statistical colorings.
`topology`	A SOM topology with additional labels that indicate selected regions.
`data`	A matrix or a data frame with row names that can be matched against those in results.
`capacity`	Maximum number of subgroups to compare.

Details

The input results must contain the output from numero.evaluate() or similar.

The input argument topology must be a definition of a SOM with additional columns as in the output from numero.subgroup().

The function first looks for row names in data that are also included in results. The rows are then divided into subgroups according to the district assignments in results and the region labels in topology.
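The matching and grouping described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the objects district, topology and dat are toy stand-ins invented for this illustration, and the sketch assumes that each data row has a single district index and that the topology has one row per district.

```r
# Toy stand-ins; none of these objects come from Numero itself.
# Hypothetical district assignments (one SOM district per named data row).
district <- c(a = 1, b = 3, c = 2, d = 3)

# Hypothetical topology with one row per district and a REGION column.
topology <- data.frame(
  X = c(0, 1, 0),
  Y = c(0, 0, 1),
  REGION = c("LowAlb", "MiddleAlb", "HighAlb"),
  stringsAsFactors = FALSE
)

# Data with row names; row "e" has no district assignment.
dat <- data.frame(
  uALB = c(5, 120, 30, 200, 15),
  row.names = c("a", "b", "c", "d", "e")
)

# Step 1: keep only rows whose names appear in both inputs.
shared <- intersect(rownames(dat), names(district))

# Step 2: look up the region label of each row's district.
region <- topology$REGION[district[shared]]
names(region) <- shared

# Step 3: split the shared rows into subgroups by region.
subgroups <- split(dat[shared, , drop = FALSE], region)
sizes <- sapply(subgroups, nrow)
```

In this toy case row "e" is dropped because it has no district, and the remaining four rows fall into three subgroups.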

Value

A data frame of summary statistics; see nroSummary() for details. The data frame also indicates which variables were used for training the SOM.

The attribute 'layout' is added to the output. For each data row that was included in the analysis, it indicates the location on the map and the subgroup name and label.
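As a hedged illustration of how an attribute such as 'layout' can be read back with base R, the sketch below attaches a toy layout to a toy data frame. The column names (VARIABLE, DISTRICT, REGION, REGION.label) and all values are assumptions made up for this example and may not match the actual output of numero.summary().

```r
# Hypothetical summary table; the real columns are described in nroSummary().
cmp <- data.frame(VARIABLE = c("CHOL", "uALB"), stringsAsFactors = FALSE)

# Hypothetical layout with one row per data row included in the analysis.
attr(cmp, "layout") <- data.frame(
  DISTRICT = c(1, 3),
  REGION = c("LowAlb", "HighAlb"),
  REGION.label = c("L", "H"),
  row.names = c("a", "b"),
  stringsAsFactors = FALSE
)

# Retrieve the attribute and count data rows per subgroup.
layout <- attr(cmp, "layout")
counts <- table(layout$REGION)
```

With a real result, the same attr() call on the returned data frame would give the layout.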

Author(s)

Ville-Petteri Makinen

Examples

# Import data.
fname <- system.file("extdata", "finndiane.txt", package = "Numero")
dataset <- read.delim(file = fname)

# Set identities and manage missing data.
dataset <- numero.clean(dataset, identity = "INDEX")

# Prepare training variables.
trvars <- c("CHOL", "HDL2C", "TG", "CREAT", "uALB")
trdata <- numero.prepare(data = dataset, variables = trvars)

# Create a self-organizing map.
sm <- numero.create(data = trdata)
qc <- numero.quality(model = sm)

# Evaluate map statistics for all variables.
stats <- numero.evaluate(model = qc, data = dataset)

# Define subgroups.
x <- stats$planes[, "uALB"]
tops <- which(x >= quantile(x, 0.75, na.rm = TRUE))
bottoms <- which(x <= quantile(x, 0.25, na.rm = TRUE))
elem <- data.frame(stats$map$topology, stringsAsFactors = FALSE)
elem$REGION <- "MiddleAlb"
elem$REGION[tops] <- "HighAlb"
elem$REGION[bottoms] <- "LowAlb"
elem$REGION.label <- "M"
elem$REGION.label[tops] <- "H"
elem$REGION.label[bottoms] <- "L"

# Compare subgroups.
cmp <- numero.summary(results = stats, topology = elem, data = dataset)

[Package Numero version 1.9.7]