shannon {SpatEntropy}   R Documentation
Shannon's entropy.
Description
This function computes Shannon's entropy of a variable X with a finite number of categories. Shannon's entropy is a non-spatial measure.
Usage
shannon(data)
Arguments
data
A data matrix or vector; it can be numeric, factor, character, ... Alternatively, a marked ppp object (see the point pattern example below).
Details
Shannon's entropy measures the heterogeneity of a set of categorical data. It is computed as
H(X) = \sum_{i=1}^{I} p(x_i) \log(1/p(x_i))
where p(x_i) is the probability of occurrence of the i-th category, here estimated, as usual, by its relative frequency. This is both the non-parametric and the maximum likelihood estimator for entropy.
Shannon's entropy ranges between 0 and \log(I), where I is the number of categories of the variable under study. The relative version of Shannon's entropy, i.e. the entropy divided by
\log(I), is also computed, under the assumption that all data categories are present in the dataset.
The relative entropy is useful for comparisons across datasets with different I.
The function can handle lattice data with missing values, as long as they are coded as NA:
missing values are ignored in the computations.
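As an illustration of the formula above, the entropy estimate can be reproduced by hand from the relative frequencies (a minimal sketch, not the package implementation; x is any categorical vector):
x=sample(c("a","b","c"), 50, replace=TRUE)
freq=table(x)          #absolute frequencies per category
p=freq/sum(freq)       #estimated probabilities (relative frequencies)
H=sum(p*log(1/p))      #Shannon's entropy, sum of p(x_i)*log(1/p(x_i))
H/log(length(p))       #relative entropy, i.e. division by log(I)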
Value
a list of four elements:
- shann: Shannon's entropy
- range: the theoretical range of Shannon's entropy, from 0 to \log(I)
- rel.shann: Shannon's relative entropy
- probabilities: a table with absolute frequencies and estimated probabilities (relative frequencies) for all data categories
Examples
#NON SPATIAL DATA
shannon(sample(1:5, 50, replace=TRUE))
#POINT DATA
#requires marks with a finite number of categories
library(spatstat)  #provides runifpoint(), square() and marks()
data.pp=runifpoint(100, win=square(10))
marks(data.pp)=sample(c("a","b","c"), 100, replace=TRUE)
shannon(marks(data.pp))
#LATTICE DATA
data.lat=matrix(sample(c("a","b","c"), 100, replace=TRUE), nrow=10)
shannon(data.lat)
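#The returned value is a list; its components (named as listed under Value)
#can be inspected directly, e.g.:
out=shannon(data.lat)
out$shann          #Shannon's entropy estimate
out$range          #theoretical range, from 0 to log(I)
out$rel.shann      #relative Shannon's entropy
out$probabilities  #absolute frequencies and estimated probabilities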