PeriodStandardization {CSIndicators} | R Documentation |
Compute the Standardization of Precipitation-Evapotranspiration Index
Description
Standardization is the last step in computing the SPEI indicator. This function fits the data to a probability distribution and transforms the original values to standardized units that are comparable in space and time and across SPEI time scales.
Usage
PeriodStandardization(
data,
data_cor = NULL,
dates = NULL,
time_dim = "syear",
leadtime_dim = "time",
memb_dim = "ensemble",
ref_period = NULL,
handle_infinity = FALSE,
method = "parametric",
distribution = "log-Logistic",
params = NULL,
return_params = FALSE,
na.rm = FALSE,
ncores = NULL
)
Arguments
data |
A multidimensional array containing the data to be standardized. |
data_cor |
A multidimensional array containing the data to which the standardization should be applied using the fitting parameters from 'data'. |
dates |
An array containing the dates of the data with the same time dimensions as the data. It is optional and only necessary for using the parameter 'ref_period' to select a reference period directly from dates. |
time_dim |
A character string indicating the name of the temporal dimension. By default, it is set to 'syear'. |
leadtime_dim |
A character string indicating the name of the leadtime dimension. By default, it is set to 'time'. |
memb_dim |
A character string indicating the name of the dimension in which the ensemble members are stored. When it is set to NULL, the threshold is computed for individual members. |
ref_period |
A list with two numeric values giving the start and end points of the reference period used for computing the index. The default value is NULL, indicating that the first and last values in the data will be used as start and end points. |
handle_infinity |
A logical value indicating whether to return infinite values (TRUE) or not (FALSE). When it is TRUE, positive (negative) infinite values are substituted by the maximum (minimum) value of each computation step, where a computation step is the subset of the array along the dimensions time_dim, leadtime_dim and memb_dim. |
method |
A character string indicating the standardization method used. It can be 'parametric' or 'non-parametric'. It is set to 'parametric' by default. |
distribution |
A character string indicating the name of the distribution function to be used for computing the SPEI. The accepted names are 'log-Logistic' and 'Gamma'. It is set to 'log-Logistic' by default. The 'Gamma' distribution is defined only for positive values, so it only works when precipitation alone is provided and the other variables are 0 (SPI indicator). |
params |
An optional multidimensional array with named dimensions. When provided, it overrides the computation of the fitting parameters. It needs to have the same time dimensions (specified in 'time_dim' and 'leadtime_dim') as 'data' and a dimension named 'coef' whose length is the number of coefficients needed for the chosen distribution (length 2 for 'Gamma', length 3 for 'log-Logistic'). It also needs to have a leadtime dimension (specified in 'leadtime_dim') of length 1. It will only be used if 'data_cor' is not provided. |
return_params |
A logical value indicating whether to return the parameters array (TRUE) or not (FALSE). It is FALSE by default. |
na.rm |
A logical value indicating whether NA values should be removed from data. It is FALSE by default. If it is FALSE and there are NA values, the standardization cannot be carried out for those coordinates and the result is filled with NA for them. If it is TRUE and fewer than 4 values remain in the dimensions other than time_dim and leadtime_dim, there are not enough values to estimate the parameters and the result will include NA. |
ncores |
An integer value indicating the number of cores to use in parallel computation. |
Details
This section gives some specifics of how the standardization is calculated. If there are NAs in the data and they are not removed with the parameter 'na.rm', the standardization cannot be carried out for those coordinates and the result is filled with NA for them. When NAs are not removed, if the length of the data for a computation step is smaller than 4, there are not enough data to standardize and the result is also filled with NA for those coordinates. Only two distributions are available to fit the data: 'log-Logistic' and 'Gamma'. The 'Gamma' distribution is defined only for positive values, so it only works when precipitation alone is provided and the other variables are 0 (SPI indicator). When only 'data' is provided ('data_cor' is NULL), the standardization is computed with cross-validation. For more information about SPEI, see functions PeriodPET and PeriodAccumulation.
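The idea behind a parametric standardization, and the reason the 'Gamma' distribution needs positive data, can be illustrated in base R. This is a minimal sketch and not the CSIndicators implementation: the synthetic series, the variable names and the method-of-moments fit are assumptions made for brevity, and the package may estimate its parameters differently.

```r
# Illustrative sketch only; not the CSIndicators implementation.
set.seed(42)

# Hypothetical positive series (e.g. accumulated precipitation for one step)
x <- rgamma(50, shape = 2, rate = 0.02)

# Fit a Gamma distribution by the method of moments (an assumption made for
# brevity; the package may estimate its parameters differently)
shape <- mean(x)^2 / var(x)
rate  <- mean(x) / var(x)

# Cumulative probability of each value under the fitted distribution ...
p <- pgamma(x, shape = shape, rate = rate)

# ... mapped to the quantiles of a standard normal distribution
z <- qnorm(p)

# A negative value has zero probability under a Gamma distribution,
# so its standardized value is -Inf
z_neg <- qnorm(pgamma(-10, shape = shape, rate = rate))
```

The transformation is monotonic, so the ranking of the original values is preserved in 'z'. The infinite value obtained for negative input is the kind of result that the 'handle_infinity' argument is meant to deal with.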
Value
A multidimensional array containing the standardized data. If 'data_cor' is provided, the array will have the same dimensions as 'data_cor'. If 'data_cor' is not provided, the array will have the same dimensions as 'data'. The parameters of the standardization are only returned if 'return_params' is TRUE. In this case, the output is a list of two objects: one for the standardized data and one for the parameters.
Examples
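# Synthetic data: 6 start years, 2 lead times, 2 latitudes, 25 members
dims <- c(syear = 6, time = 2, latitude = 2, ensemble = 25)
# Data to be standardized with the parameters fitted on 'data'
dimscor <- c(syear = 1, time = 2, latitude = 2, ensemble = 25)
data <- array(rnorm(600, -194.5, 64.8), dim = dims)
datacor <- array(rnorm(100, -217.8, 68.29), dim = dimscor)
# Standardize 'data' (cross-validation is used because 'data_cor' is NULL)
SPEI <- PeriodStandardization(data = data)
# Standardize 'datacor' using the parameters fitted on 'data'
SPEIcor <- PeriodStandardization(data = data, data_cor = datacor)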
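A further sketch, assuming the CSIndicators package is installed, shows how the fitted parameters could be requested with 'return_params'. It uses only arguments listed under Usage. The object name 'res' is illustrative, and the names of the list elements are not stated on this page, so the result is inspected with str() rather than accessed by name.

```r
library(CSIndicators)

# Same synthetic input as in the example above
set.seed(1)
dims <- c(syear = 6, time = 2, latitude = 2, ensemble = 25)
data <- array(rnorm(600, -194.5, 64.8), dim = dims)

# Request the fitted parameters together with the standardized data
res <- PeriodStandardization(data = data, return_params = TRUE)

# According to the Value section, 'res' is a list of two objects:
# the standardized data and the parameters
str(res)
```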