dahstat {climatol} R Documentation

## Statistical summaries of the homogenized data

### Description

Lists means, standard deviations, quantiles or trends, for a specified period, from series homogenized by homogen.

### Usage

dahstat(varcli, anyi, anyf, anyip=anyi, anyfp=anyf, stat="me", ndc=NA, vala=2,
cod=NULL, mnpd=0, mxsh=0, prob=.5, last=FALSE, long=FALSE, lsnh=FALSE,
lerr=FALSE, relref=FALSE, mh=FALSE, pernys=100, estcol=c(1,2,4), sep=',',
dec='.', eol='\n', nei=NA, x=NA)


### Arguments

| Argument | Description |
|---|---|
| varcli | Acronym of the name of the studied climatic variable, as in the data file name. |
| anyi | Initial year of the homogenized period. |
| anyf | Final year of the homogenized period. |
| anyip | First year of the period to analyze. (Defaults to anyi.) |
| anyfp | Last year of the period to analyze. (Defaults to anyf.) |
| stat | Statistical parameter to compute for the selected period: "me": means (default); "mdn": medians; "max": maxima; "min": minima; "std": standard deviations; "q": quantiles (see the prob parameter); "tnd": trends; "series": do not compute any statistics, only output all homogenized series in individual *.csv files. |
| ndc | Number of decimal places to be saved in the output file (1 by default). |
| vala | Annual values to compute from the sub-annual data: 0: none; 1: sums; 2: means (default); 3: maxima; 4: minima. |
| cod | Optional vector of codes of the stations to be processed. |
| mnpd | Minimum percentage of original data. (0 = no limit.) |
| mxsh | Maximum SNHT. (0 = no limit.) |
| prob | Probability for the computation of quantiles (0.5 by default, i.e., medians). You can set probabilities with more than 2 decimals, but the name of the output file will be identified with the rounded percentile. |
| last | Logical value to compute statistics only for stations working at the end of the period of study. (FALSE by default.) |
| long | Logical value to compute statistics only for series built from the longest homogeneous sub-period. (FALSE by default.) |
| lsnh | Logical value to compute statistics only for series built from the homogeneous sub-period with lowest SNHT. (FALSE by default.) |
| lerr | Logical value to compute statistics only for series built from the homogeneous sub-period with lowest RMSE. (FALSE by default.) |
| relref | If TRUE, statistics from reliable reference series will also be listed. (FALSE by default.) |
| mh | If TRUE, read monthly data computed from daily adjusted series. (FALSE by default.) |
| pernys | Number of years on which to compute trends. (Defaults to 100.) |
| estcol | Columns of the homogenized stations file to be included in the output file. (Defaults to c(1,2,4), the columns of station coordinates and codes.) |
| sep | String to use for separating the output data. (',') |
| dec | Character to use as decimal point in the output data. ('.') |
| eol | Line termination style. ('\n') |
| nei | Number of stations in the input files. (To be read from the *.rda file.) |
| x | Vector of dates. (To be read from the *.rda file.) |

### Details

Homogenized data are read from the file ‘VAR_ANYI-ANYF.rda’ saved by homogen, and the data computed for the specified period are saved in ‘VAR_ANYIP-ANYFP.STAT’, where STAT is replaced by the requested statistic (stat). The exception is stat="q", for which the extension of the output file is qPP, where PP stands for the specified probability prob (in percent). The output period ANYIP-ANYFP must lie within the period of the input data, ANYI-ANYF.
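The naming convention described here can be sketched in a few lines of base R. The helper stat_file_name below is hypothetical (it is not part of climatol), and zero-padding the rounded percentile to two digits is an assumption suggested by the PP placeholder. It only illustrates how the arguments map onto the output file name.

```r
# Hypothetical helper, not part of climatol: compose the output file name
# described in the Details from the arguments of dahstat.
stat_file_name <- function(varcli, anyi, anyf, anyip = anyi, anyfp = anyf,
                           stat = "me", prob = 0.5) {
  # The analyzed period must lie within the homogenized period.
  stopifnot(anyip >= anyi, anyfp <= anyf, anyip <= anyfp)
  ext <- stat
  if (stat == "q") {
    # Assumption: the percentile is rounded and zero-padded to two digits (PP).
    ext <- sprintf("q%02d", as.integer(round(100 * prob)))
  }
  sprintf("%s_%d-%d.%s", varcli, as.integer(anyip), as.integer(anyfp), ext)
}

name_default  <- stat_file_name("pcp", 1991, 2010)
name_quantile <- stat_file_name("pcp", 1991, 2010, 2001, 2010, "q", 0.9)
print(c(name_default, name_quantile))
```

Requesting a period outside ANYI-ANYF makes the sketch stop with an error, which mirrors the constraint stated above.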

Parameters mnpd and mxsh act as filters to produce results only for series that reach the minimum percentage of original data and do not exceed the maximum SNHT value. Alternatively, long, last, lsnh and lerr select the series reconstructed from the preferred homogeneous sub-period, depending on which parameter is set to TRUE. Note, however, that shorter sub-periods often have lower SNHT and RMSE values, so lsnh and lerr should be used with caution. The most advisable parameters for selecting the most suitable reconstructions are long, for computing normal values, and last, for climate monitoring of new incoming data.

No selection is performed by default, so the desired statistic is listed for all the reconstructed series (from every homogeneous sub-period).
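As a rough illustration of how the mnpd and mxsh filters combine, the sketch below applies them to a small made-up table of reconstructed series. The data frame, its column names (code, pod for the percentage of original data, snht) and the helper filter_series are assumptions for this example, not the structure that dahstat uses internally.

```r
# Made-up summary of four reconstructed series (all values are illustrative).
series <- data.frame(
  code = c("S01", "S01-2", "S02", "S03"),
  pod  = c(85, 15, 60, 100),   # percentage of original data
  snht = c(12, 4, 30, 8)       # SNHT of the reconstructed series
)

# Hypothetical filter mirroring the documented meaning of mnpd and mxsh:
# a value of 0 disables the corresponding limit.
filter_series <- function(x, mnpd = 0, mxsh = 0) {
  keep <- rep(TRUE, nrow(x))
  if (mnpd > 0) keep <- keep & x$pod >= mnpd
  if (mxsh > 0) keep <- keep & x$snht <= mxsh
  x[keep, , drop = FALSE]
}

all_series <- filter_series(series)                        # no limits
filtered   <- filter_series(series, mnpd = 50, mxsh = 20)  # both limits
print(filtered)
```

In this invented table the series S01-2 has the lowest SNHT but very little original data, which is the kind of case that makes selecting by lowest SNHT alone risky.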

stat='tnd' computes trends by ordinary least squares (OLS) linear regression on time, listing them in a CSV file ‘*_tnd.csv’ and their p-values in ‘*_pval.csv’.
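For readers who want to see what such a trend looks like, here is a minimal base R sketch of an OLS trend and its p-value for one synthetic annual series. It is not the code used by dahstat: the series is invented, and scaling the slope to units per pernys years is an assumption based on the description of that parameter.

```r
# Synthetic annual series (invented data): 2 units/year plus a small wiggle.
years  <- 1991:2010
values <- 500 + 2 * (years - 1991) + rep(c(1, -1), 10)

pernys <- 100                       # number of years to express the trend on

fit   <- lm(values ~ years)         # OLS regression on time
coefs <- summary(fit)$coefficients  # estimate, std. error, t value, p-value

trend <- unname(coefs["years", "Estimate"]) * pernys  # units per pernys years
pval  <- unname(coefs["years", "Pr(>|t|)"])           # significance of the slope

print(c(trend = trend, pval = pval))
```

The p-value comes from the usual t test on the regression slope, so it assumes independent, roughly normal residuals.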

If stat='series' is chosen, two text files in CSV format will be produced for every station, one with the data and another with their flags: 0 for original, 1 for infilled and 2 for corrected data. (Not useful for daily series.)
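Once a flags file has been read, the flag values can be summarized per station. The sketch below uses an invented in-memory vector in place of a real file, since the exact file names and layout are not described here.

```r
# Invented flag values for one station (0 original, 1 infilled, 2 corrected).
flags <- c(0, 0, 1, 0, 2, 2, 0, 1, 0, 0, 2, 0)

# Label the codes so that counts are readable; the levels are fixed so that
# categories absent from a series still appear with a count of zero.
flag_labels <- factor(flags, levels = 0:2,
                      labels = c("original", "infilled", "corrected"))

counts <- table(flag_labels)
print(round(100 * counts / length(flags), 1))  # percentage of each kind of datum
```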

### Value

This function does not return any value, since outputs are saved to files.

### See Also

homogen, dahgrid.

### Examples

#Set a temporary working directory and write input files:
wd <- tempdir()
wd0 <- setwd(wd)
data(Ptest)
dim(dat) <- c(720,20)
dat[601:720,5] <- dat[601:720,5]*1.8
write(dat[481:720,1:5],'pcp_1991-2010.dat')
write.table(est.c[1:5,1:5],'pcp_1991-2010.est',row.names=FALSE,col.names=FALSE)
homogen('pcp',1991,2010,std=2)
#Now run the examples:
dahstat('pcp',1991,2010)
dahstat('pcp',1991,2010,stat='tnd')
#Return to the user's working directory:
setwd(wd0)