shannon {SpatEntropy} | R Documentation |
Shannon's entropy.
Description
This function computes Shannon's entropy of a variable X
with a finite number of categories. Shannon's entropy is a non-spatial measure.
Usage
shannon(data)
Arguments
data |
A data matrix or vector; it can be numeric, factor, character, ...
Alternatively, a marked |
Details
Shannon's entropy measures the heterogeneity of a set of categorical data. It is computed as
H(X) = \sum_{i=1}^{I} p(x_i) \log(1/p(x_i))
where p(x_i) is the probability of occurrence of the i-th category, here estimated, as usual, by its relative frequency. This is both the non parametric and the maximum likelihood estimator for entropy.
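The plug-in estimate described above can be sketched in a few lines of base R. This is only an illustration of the formula, not the package's implementation; the helper name shannon_sketch is hypothetical.

```r
# Hypothetical helper for illustration only; not the SpatEntropy code.
shannon_sketch <- function(x) {
  x <- as.vector(x)              # a matrix is flattened, a factor becomes character
  p <- prop.table(table(x))      # relative frequencies estimate p(x_i)
  p <- p[p > 0]                  # guard against empty categories
  as.numeric(sum(p * log(1 / p)))
}

x <- c("a", "a", "b", "b", "c", "c", "c", "c")
H <- shannon_sketch(x)           # relative frequencies are 0.25, 0.25, 0.5
H
```

Natural logarithms are used here, so the result is in nats.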
Shannon's entropy varies between 0 and \log(I), I being the number of categories of the variable under study. The relative version of Shannon's entropy, i.e. the entropy divided by \log(I), is also computed, under the assumption that all data categories are present in the dataset. The relative entropy is useful for comparison across datasets with different I.
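To see why the relative version helps when I differs, the following base R sketch compares a uniform distribution over two categories with one over four. The helper rel_entropy_sketch is hypothetical and takes a probability vector directly, assuming every category has positive probability.

```r
# Hypothetical helper for illustration only; not the SpatEntropy code.
rel_entropy_sketch <- function(p) {
  stopifnot(all(p > 0), isTRUE(all.equal(sum(p), 1)))
  H <- sum(p * log(1 / p))
  I <- length(p)                 # number of categories
  # Note: with a single category log(I) is 0, so the ratio is undefined (NaN).
  c(entropy = H, max = log(I), relative = H / log(I))
}

two  <- rel_entropy_sketch(rep(1 / 2, 2))     # uniform over 2 categories
four <- rel_entropy_sketch(rep(1 / 4, 4))     # uniform over 4 categories
skew <- rel_entropy_sketch(c(0.8, 0.1, 0.1))  # one dominant category
rbind(two, four, skew)
```

Both uniform cases reach the upper bound \log(I), so their raw entropies differ while their relative entropies are both 1; the skewed case falls below 1.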
The function also accepts lattice data with missing values, provided they are coded as NA; missing values are ignored in the computations.
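One plausible reading of "ignored" is that NA cells are dropped before the relative frequencies are computed, so the probabilities are taken over the observed cells only. The base R sketch below illustrates that reading on a small lattice; it is an assumption about the behaviour, not the package's code, and the object names are hypothetical.

```r
# Illustration only: a 3 x 3 lattice with two missing cells.
lat <- matrix(c("a", "b", NA,
                "a", "b", "b",
                NA,  "a", "a"), nrow = 3)

obs <- lat[!is.na(lat)]          # keep the 7 observed cells
p   <- table(obs) / length(obs)  # a: 4/7, b: 3/7
H   <- as.numeric(sum(p * log(1 / p)))
H
```

Under this reading, the denominator is the number of non-missing cells (7), not the full lattice size (9).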
Value
a list of four elements:
- shann: Shannon's entropy
- range: the theoretical range of Shannon's entropy, from 0 to \log(I)
- rel.shann: Shannon's relative entropy
- probabilities: a table with absolute frequencies and estimated probabilities (relative frequencies) for all data categories
Examples
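library(SpatEntropy)
#square() and marks() are in spatstat.geom, runifpoint() in spatstat.random
#(assumes a recent spatstat split into sub-packages)
library(spatstat.geom)
library(spatstat.random)

#NON SPATIAL DATA
shannon(sample(1:5, 50, replace=TRUE))

#POINT DATA
#requires marks with a finite number of categories
data.pp=runifpoint(100, win=square(10))
marks(data.pp)=sample(c("a","b","c"), 100, replace=TRUE)
shannon(marks(data.pp))

#LATTICE DATA
data.lat=matrix(sample(c("a","b","c"), 100, replace=TRUE), nrow=10)
shannon(data.lat)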