traceWminim {nsRFA}    R Documentation
Cluster analysis: disjoint regions
Description
Formation of disjoint regions for Regional Frequency Analysis.
Usage
traceWminim (X, centers)
sumtraceW (clusters, X)
nearest (clusters, X)
Arguments
X: a numeric matrix of characteristics, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns)
centers: the number of clusters
clusters: a numeric vector containing the subdivision of X into clusters
Details
The Euclidean distance is used.
Given p different classification variables, the distance between two elements i and j is:
d_{i j} = \sqrt{\frac{1}{p} \sum_{h=1}^{p} (x_{h i} - x_{h j})^2}
where x_{h i} is the value of the h-th variable of the i-th element.
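As a minimal sketch of the distance above, the following base R snippet computes d_{i j} for a small made-up matrix. The toy data and the variable names are assumptions for illustration only and are not taken from nsRFA; the only point is that the formula equals the plain Euclidean distance divided by sqrt(p).

```r
# Toy matrix of characteristics (made up): 4 sites in rows, p = 2 variables in columns.
X <- matrix(c(0, 0,
              3, 4,
              6, 8,
              0, 2), ncol = 2, byrow = TRUE)
p <- ncol(X)

# stats::dist() gives the plain Euclidean distance between rows;
# dividing by sqrt(p) applies the 1/p factor under the square root.
d <- as.matrix(dist(X, method = "euclidean")) / sqrt(p)

# Check one entry against the formula written out directly (sites 1 and 2).
d12 <- sqrt(sum((X[1, ] - X[2, ])^2) / p)
```

Because the 1/p factor is the same for every pair, it rescales all distances equally and should not change which sites are closest to each other.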
The function traceWminim combines a hierarchical algorithm, that of Ward (1963), with an optimisation procedure that minimises:
W = \sum_{i=1}^k \left( \sum_{j=1}^{n_i} \delta_{i j}^2 \right)
where k is the number of clusters (obtained initially with Ward's algorithm), n_i is the number of sites in the i-th cluster and \delta_{i j} is the Euclidean distance between the j-th element of the i-th group and the center of mass of the i-th cluster.
W is calculated with sumtraceW.
The algorithm consists of moving a site from one cluster to another whenever this makes W decrease.
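The procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the nsRFA source: trace_w and relocate are hypothetical helper names, the data are made up, the greedy visiting order is one possible choice, and the "ward.D2" method of hclust is assumed here as the Ward variant.

```r
# Hypothetical helper: W, the sum over clusters of the squared Euclidean
# distances between each site and its cluster's center of mass.
trace_w <- function(clusters, X) {
  X <- as.matrix(X)
  total <- 0
  for (g in unique(clusters)) {
    Xg <- X[clusters == g, , drop = FALSE]
    centre <- colMeans(Xg)
    total <- total + sum(sweep(Xg, 2, centre)^2)
  }
  total
}

# Hypothetical helper: move one site at a time to another cluster whenever
# the move makes W decrease; stop when no single move helps.
relocate <- function(clusters, X) {
  X <- as.matrix(X)
  best <- trace_w(clusters, X)
  improved <- TRUE
  while (improved) {
    improved <- FALSE
    for (i in seq_len(nrow(X))) {
      # Skip sites that are alone in their cluster, so no cluster is emptied.
      if (sum(clusters == clusters[i]) == 1) next
      for (g in setdiff(unique(clusters), clusters[i])) {
        trial <- clusters
        trial[i] <- g
        w <- trace_w(trial, X)
        if (w < best - 1e-12) {
          clusters <- trial
          best <- w
          improved <- TRUE
        }
      }
    }
  }
  clusters
}

# Made-up data: two well-separated groups of three sites each.
X <- rbind(c(0, 0), c(0, 1), c(1, 0), c(10, 10), c(10, 11), c(11, 10))

# An initial partition could come from Ward's algorithm (assumed variant):
ward <- cutree(hclust(dist(X), method = "ward.D2"), k = 2)

# Start instead from a partition with site 3 deliberately misplaced,
# so that the relocation step has something to fix.
start <- c(1, 1, 2, 2, 2, 2)
w_start <- trace_w(start, X)
final <- relocate(start, X)
w_final <- trace_w(final, X)
```

In this toy case the misplaced site is moved back to the first group and W drops sharply. Each accepted move strictly lowers W, so the loop terminates, but like any greedy search it may stop at a local minimum.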
Value
traceWminim gives a vector defining the subdivision of elements characterized by X in n=centers clusters.
sumtraceW gives W (it is used by traceWminim).
nearest gives the nearest site to the centers of mass of clusters (it is used by traceWminim).
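To make the idea of the "nearest site to the centers of mass" concrete, here is a small base R sketch. The helper nearest_site is hypothetical and the data are made up; the form of the value returned by the package's own nearest function may differ.

```r
# Hypothetical helper: for each cluster, the row index of the site closest
# (in Euclidean distance) to that cluster's center of mass.
nearest_site <- function(clusters, X) {
  X <- as.matrix(X)
  groups <- sort(unique(clusters))
  out <- vapply(groups, function(g) {
    idx <- which(clusters == g)
    Xg <- X[idx, , drop = FALSE]
    centre <- colMeans(Xg)
    d2 <- rowSums(sweep(Xg, 2, centre)^2)
    idx[which.min(d2)]
  }, integer(1))
  names(out) <- groups
  out
}

# Made-up data: six sites, two clusters.
X <- rbind(c(0, 0), c(1, 0), c(5, 0), c(10, 10), c(10, 12), c(10, 11))
clusters <- c(1, 1, 1, 2, 2, 2)

# Cluster 1 has center (2, 0), so site 2 is nearest;
# cluster 2 has center (10, 11), which coincides with site 6.
res <- nearest_site(clusters, X)
```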
Note
For information on the package and the Author, and for all the references, see nsRFA.
See Also
Examples
# Load the example data shipped with nsRFA
data(hydroSIMN)
parameters
summary(parameters)

# traceWminim: cluster the sites on two standardized characteristics
param <- parameters[c("Hm","Ybar")]
# Centre each column on its mean and divide by its standard deviation
param.norm <- as.data.frame(scale(param))
clusters <- traceWminim(param.norm, 4)
names(clusters) <- parameters$cod
clusters

annualflows
summary(annualflows)
x <- annualflows$dato
cod <- annualflows$cod

# Annual flows of the sites assigned to cluster 1
fac <- factor(annualflows$cod, levels=names(clusters[clusters==1]))
x1 <- annualflows[!is.na(fac),"dato"]
cod1 <- annualflows[!is.na(fac),"cod"]
#HW.tests(x1,cod1) # it takes some time

# Annual flows of the sites assigned to cluster 3
fac <- factor(annualflows$cod, levels=names(clusters[clusters==3]))
x3 <- annualflows[!is.na(fac),"dato"]
cod3 <- annualflows[!is.na(fac),"cod"]
#HW.tests(x3,cod3) # it takes some time