| isopam {isopam} | R Documentation | 
Isopam (Clustering)
Description
Isopam classification is performed either as a hierarchical, divisive method or as non-hierarchical partitioning. Isopam is designed for matrices representing species abundances in plots and with a diagnostic species approach in mind. It optimises clusters and cluster numbers for concentration of indicative species in groups. Predefined indicative species and cluster medoids can optionally be added for a semi-supervised classification.
Usage
     isopam(dat, c.fix = FALSE, c.max = 6, l.max = FALSE, stopat = c(1,7),
            sieve = TRUE, Gs = 3.5, ind = NULL, centers = NULL, 
            distance = 'bray', k.max = 100, d.max = 7, juice = FALSE, ...)
     ## S3 method for class 'isopam'
     identify(x, ...)
     ## S3 method for class 'isopam'
     plot(x, ...)
     ## S3 method for class 'isopam'
     summary(object, ...)
     ## S3 method for class 'isopam'
     print(x, ...)
Arguments
| dat | data matrix: each row corresponds to an object (typically a plot), each column corresponds to a descriptor (typically a species). All variables must be numeric. Missing values (NAs) are not allowed. At least 3 rows (plots) are required. | 
| c.fix | number of clusters (defaults to FALSE). | 
| c.max | maximum number of clusters per partition. Applies to all splits. | 
| l.max | maximum number of hierarchy levels. Defaults to FALSE. | 
| stopat | vector with stopping rules for hierarchical clustering. Two values define whether a partition is retained in hierarchical clustering: the first determines how many indicator species must be present per cluster, the second defines the standardized G-value that these indicators must reach. | 
| sieve | logical. If  | 
| Gs | threshold (standardized G value) for species
to be considered in the search for a good clustering solution. 
Effective with  | 
| ind | optional vector of column names from  | 
| centers | optional vector with indices (numeric) or names (character) of observations used as cluster cores (supervised classification). | 
| distance | name of a dissimilarity index for the distance matrix used as a starting point for Isomap. Any distance measure implemented in packages vegan (predefined or using a designdist equation) or proxy can be used (see details). | 
| k.max | maximum Isomap k. | 
| d.max | maximum number of Isomap dimensions. | 
| juice | logical. If  | 
| ... | other arguments used by juice or passed to S3 functions | 
| x | 
 | 
| object | 
 | 
Details
Isopam is described in Schmidtlein et al. (2010). It consists of dimensionality reduction (Isomap: Tenenbaum et al. 2000; isomap in vegan) and partitioning of the resulting ordination space (PAM: Kaufman & Rousseeuw 1990; pam in cluster). The classification is performed either as a hierarchical, divisive method, or as non-hierarchical partitioning. It has the following features: partitions are optimized for the occurrence of species with high fidelity to groups; it optionally selects the number of clusters per division; the shapes of groups in feature space are not restricted to spherical or other regular geometric shapes (thanks to the underlying Isomap algorithm); the distance measure used for the initial distance matrix can be freely defined.
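The two stages named above can be sketched directly with the underlying functions. This is only an illustration of the building blocks, not Isopam itself: the neighbourhood size, the number of dimensions and the number of clusters are fixed by hand (all three values are arbitrary assumptions), whereas Isopam searches over such settings and evaluates partitions by their indicator species. The BCI data set shipped with vegan serves purely as a stand-in.

```r
library(vegan)    # vegdist(), isomap(), BCI data
library(cluster)  # pam()

data(BCI)

## Bray-Curtis dissimilarities between plots
dis <- vegdist(BCI, method = "bray")

## Isomap ordination; k = 3 neighbours and 3 dimensions are arbitrary choices
ord <- isomap(dis, ndim = 3, k = 3)

## partition the ordination space around medoids; 3 clusters is arbitrary
part <- pam(ord$points, k = 3)

## cluster sizes
table(part$clustering)
```

Because PAM works on the Isomap coordinates rather than on the raw abundances, the resulting groups can follow curved structures in the original feature space.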
In semi-supervised mode, clusters are built around the provided medoids. Pre-defined indicator species are less constraining, although preference is given to cluster solutions in which their fidelity is maximized. How much they affect the result depends on the data.
The preset distance measure is Bray-Curtis (Odum 1950). A distance measure is first passed to vegdist or to designdist in vegan. If that fails, it is passed to dist in proxy. Measures available in vegan are listed in vegdist. Isopam does not accept distance matrices as a replacement for the original data matrix because it operates on individual descriptors (species).
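As a short sketch of switching the distance measure, the call below replaces the preset with Jaccard, which is one of the methods listed in vegdist. The object name ip_jac is chosen here for illustration, and the andechs data set is the one used in the Examples section.

```r
library(isopam)
data(andechs)

## Jaccard instead of the preset Bray-Curtis; "jaccard" is a vegdist method
ip_jac <- isopam(andechs, distance = "jaccard")

## distance measure recorded in the result
ip_jac$distance
```

Note that the data matrix itself is still passed as the first argument; only the name of the measure changes.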
Isopam is slow with large data sets. It switches to a slow mode when an internally used lookup array, which holds the results of the search for an optimal parameterisation (selection of Isomap dimensions and Isomap k, optionally selection of cluster numbers), does not fit into RAM.
plot creates (and silently returns) an object of class dendrogram and calls the S3 plot method for that class. identify works just like identify.hclust.
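A minimal sketch of both methods, assuming the default call on the andechs data yields a cluster hierarchy as in the Examples section. The name dg is chosen here for illustration, and identify is only called in an interactive session because it waits for mouse clicks on the current plot.

```r
library(isopam)
data(andechs)
ip <- isopam(andechs)

## plot() draws the tree and silently returns the dendrogram
dg <- plot(ip)
class(dg)

## in an interactive session, clicking on the plot selects clusters
if (interactive()) identify(ip)
```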
Value
| call | generating call | 
| distance | distance measure used by Isomap | 
| flat | observations (plots) with group affiliation. Running group numbers for each level of the hierarchy. | 
| hier | observations (plots) with group affiliation. Group identifiers reflect the cluster hierarchy. Not present with only one level of partitioning. | 
| medoids | observations (plots) representing the medoids of the resulting groups. | 
| analytics | table summarizing parameter settings for the partitioning steps. | 
| centers_usr | Cluster centers suggested by user. | 
| ind_usr | Indicators suggested by user. | 
| indicators | Indicators used. | 
| dendro | an object of class  | 
| dat | data used | 
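The components can be retrieved from the result with the usual $ operator, as the Examples section does for flat and hier. The following sketch inspects a few of the others and assumes the default call on the andechs data.

```r
library(isopam)
data(andechs)
ip <- isopam(andechs)

names(ip)       # names of the returned components
ip$medoids      # plots representing the medoids of the groups
ip$analytics    # parameter settings for the partitioning steps
ip$indicators   # indicators used
```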
Note
With very small datasets, the indicator based optimization may 
fail. In such cases consider using sieve = FALSE instead 
of the default method.
Author(s)
Sebastian Schmidtlein with contributions from Jason Collison and Lubomir Tichý
References
Odum, E.P. (1950): Bird populations in the Highlands (North Carolina) plateau in relation to plant succession and avian invasion. Ecology 31: 587–605.
Kaufman, L., Rousseeuw, P.J. (1990): Finding groups in data. Wiley.
Schmidtlein, S., Tichý, L., Feilhauer, H., Faude, U. (2010): A brute force approach to vegetation classification. Journal of Vegetation Science 21: 1162–1171.
Tenenbaum, J.B., de Silva, V., Langford, J.C. (2000): A global geometric framework for nonlinear dimensionality reduction. Science 290: 2319–2323.
See Also
isotab for a table of descriptor (species) frequencies in clusters and fidelity measures. There is a plot method for isotab objects that visualizes species fidelities to clusters.
Examples
     ## load data to the current environment
     data(andechs)
     
     ## call isopam with the standard options
     ip <- isopam(andechs)
     ## print function
     ip
     
     ## examine cluster hierarchy
     plot(ip)
     ## retrieve cluster vectors
     clusters <- ip$flat
     clusters
     
     ## same but hierarchical style (available with cluster trees)
     hierarchy <- ip$hier 
     hierarchy
     ## frequency table
     it <- isotab(ip)
     it
     ## plot with species fidelities (equalized phi)
     plot(it)
     ## non-hierarchical partitioning with three clusters
     ip <- isopam(andechs, c.fix = 3)
     ip
     ## limiting the set of species used in cluster search
     ip <- isopam(andechs, ind = c("Car_pan", "Sch_fer"), c.fix = 2)
     ip
     ## supervised mode with fixed cluster medoids
     ip <- isopam(andechs, centers = c("p20", "p22"))
     ip