WeatherRegime {CSTools} | R Documentation |
Function for Calculating the Cluster Analysis
Description
This function computes the weather regimes from a cluster analysis. It can be applied either directly to a dataset with dimensions c(year/month, month/day, lon, lat), or to the PCs obtained by applying an EOF analysis to filter the dataset. The cluster analysis can be performed with the traditional k-means or with any of the methods available in hclust (stats package).
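The two clustering routes described above can be sketched with base R alone. This is only an illustration under stated assumptions, not the internals of WeatherRegime: it assumes the field has already been reshaped into a time x grid-point matrix (the hypothetical `field`, filled here with random numbers), and it calls `stats::kmeans` and `stats::hclust` directly. The composite is taken to be the mean field over the time steps assigned to each cluster.

```r
set.seed(1)

# Hypothetical input: 60 time steps on a 4 x 4 grid, flattened to 16 columns
n_time <- 60
n_lat <- 4
n_lon <- 4
field <- matrix(rnorm(n_time * n_lat * n_lon), nrow = n_time)

n_clusters <- 4

# Route 1: k-means, restarted from several random sets of centers
km <- kmeans(field, centers = n_clusters, iter.max = 100, nstart = 30)
km_cluster <- km$cluster

# Route 2: hierarchical clustering, cut into the same number of groups
hc <- hclust(dist(field), method = "ward.D2")
hc_cluster <- cutree(hc, k = n_clusters)

# Composite per cluster: mean field over the time steps in that cluster
composite <- sapply(seq_len(n_clusters), function(k) {
  colMeans(field[km_cluster == k, , drop = FALSE])
})

# Back to a ('lat', 'lon', 'cluster') array (dimension order assumed)
composite <- array(composite, dim = c(lat = n_lat, lon = n_lon,
                                      cluster = n_clusters))
```

Both routes return one integer label per time step, so the composite step is the same whichever method produced the labels.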
Usage
WeatherRegime(
data,
ncenters = NULL,
EOFs = TRUE,
neofs = 30,
varThreshold = NULL,
lon = NULL,
lat = NULL,
method = "kmeans",
iter.max = 100,
nstart = 30,
ncores = NULL
)
Arguments
data |
An array of anomalies with named dimensions, including at least start date 'sdate', forecast time 'ftime', latitude 'lat' and longitude 'lon'. |
ncenters |
Number of clusters to be calculated with the clustering function. |
EOFs |
Whether to compute the EOFs to filter the data (TRUE, the default) or not (FALSE). |
neofs |
Number of modes to be kept (default = 30). Only used if EOFs = TRUE. |
varThreshold |
Value with the percentage of variance to be explained by the PCs. Only sufficient PCs to explain this much variance will be used in the clustering. |
lon |
Vector of longitudes. |
lat |
Vector of latitudes. |
method |
Method used to estimate the clusters. The most traditional approach is the k-means analysis (default = 'kmeans'), but the function also supports the methods included in hclust. These methods are: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details about these methods see the hclust function documentation included in the stats package. |
iter.max |
Parameter to select the maximum number of iterations allowed (Only if method = 'kmeans' is selected). |
nstart |
Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected). |
ncores |
The number of multicore threads to use for parallel computation. |
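As a rough illustration of how neofs and varThreshold relate, the sketch below keeps only as many PCs as are needed to reach a given percentage of explained variance. It uses `stats::prcomp` on a hypothetical time x grid-point matrix and is an assumption about the general technique, not the code WeatherRegime runs; the names `field`, `var_threshold` and `n_keep` are made up for the example.

```r
set.seed(42)

# Hypothetical anomalies: 60 time steps x 16 grid points
field <- matrix(rnorm(60 * 16), nrow = 60, ncol = 16)

# PCA of the time x space matrix; pca$x holds the PCs (one column per mode)
pca <- prcomp(field, center = TRUE, scale. = FALSE)

# Percentage of variance explained by each mode, and its running total
explained <- 100 * pca$sdev^2 / sum(pca$sdev^2)
cum_explained <- cumsum(explained)

# Keep the smallest number of leading PCs that reaches the threshold
var_threshold <- 80
n_keep <- which(cum_explained >= var_threshold)[1]
pcs <- pca$x[, seq_len(n_keep), drop = FALSE]
```

The reduced matrix `pcs` would then be the input to the clustering step in place of the full field.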
Value
A list with elements:
$composite |
Array with at least 3 dimensions ('lat', 'lon', 'cluster') containing the composites k = 1,..,K for case (*1), or only k = 1 for any specific cluster, i.e., case (*2). |
$pvalue |
Array with at least 3 dimensions ('lat', 'lon', 'cluster') containing the p-value of the composites, obtained through a t-test that accounts for the serial dependence of the data. It has the same structure as composite. |
$cluster |
A matrix or vector of integers (from 1:k) indicating the cluster to which each time step is allocated. |
$persistence |
Percentage of days in a month/season before a cluster is replaced by a new one (only if method = 'kmeans' has been selected). |
$frequency |
Percentage of days in a month/season belonging to each cluster (only if method = 'kmeans' has been selected). |
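To make the frequency and persistence elements more concrete, here is a minimal base R sketch that derives comparable statistics from a vector of cluster labels. The label vector is invented for the example. The frequency follows the definition given above; the persistence measure used here, the mean length in days of consecutive runs of the same cluster, is an assumed stand-in and may differ from what WeatherRegime returns.

```r
# Hypothetical cluster labels for 10 consecutive days
cluster <- c(1, 1, 2, 2, 2, 3, 1, 1, 1, 3)
levels_k <- 1:3

# Frequency: percentage of days assigned to each cluster
frequency <- 100 * table(factor(cluster, levels = levels_k)) / length(cluster)

# Runs of identical consecutive labels (run-length encoding)
runs <- rle(cluster)

# Assumed persistence measure: mean run length (in days) per cluster
mean_run <- tapply(runs$lengths, factor(runs$values, levels = levels_k), mean)
```

For these labels, cluster 1 covers 5 of the 10 days in two runs of lengths 2 and 3, so its frequency is 50% and its mean run length is 2.5 days.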
Author(s)
Verónica Torralba - BSC, veronica.torralba@bsc.es
References
Cortesi, N., V. Torralba, N. González-Reviriego, A. Soret, and F.J. Doblas-Reyes (2019). Characterization of European wind speed variability using weather regimes. Climate Dynamics, 53, 4961–4976, doi: 10.1007/s00382-019-04839-5.
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/
Examples
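# Synthetic data: 2 datasets x 2 members x 3 start dates x 3 forecast times
# on a 4 x 4 grid (2 * 2 * 3 * 3 * 4 * 4 = 576 values)
data <- array(abs(rnorm(576, 283.7, 6)), dim = c(dataset = 2, member = 2,
                                                 sdate = 3, ftime = 3,
                                                 lat = 4, lon = 4))
# Four latitudes, north to south, matching the 'lat' dimension
lat <- seq(47, 44)
# k-means on the unfiltered data (no EOFs), with four clusters
res <- WeatherRegime(data = data, lat = lat,
                     EOFs = FALSE, ncenters = 4)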