calc_annual_stats {fasstr} | R Documentation |
Calculate annual summary statistics
Description
Calculates the mean, median, maximum, minimum, and selected percentiles for each year of a daily streamflow data set. Statistics are calculated from all values unless specified otherwise. Returns a tibble of the statistics.
Usage
calc_annual_stats(
data,
dates = Date,
values = Value,
groups = STATION_NUMBER,
station_number,
roll_days = 1,
roll_align = "right",
percentiles = c(10, 90),
water_year_start = 1,
start_year,
end_year,
exclude_years,
months = 1:12,
transpose = FALSE,
complete_years = FALSE,
ignore_missing = FALSE,
allowed_missing = ifelse(ignore_missing, 100, 0)
)
Arguments
data |
Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers).
Leave blank or set to |
dates |
Name of column in data that contains dates. Default Date. |
values |
Name of column in data that contains flow values. Default Value. |
groups |
Name of column in data that contains grouping values (e.g. station numbers). Default STATION_NUMBER. |
station_number |
Character string vector of seven digit Water Survey of Canada station numbers (e.g. |
roll_days |
Numeric value of the number of days to apply a rolling mean. Default 1. |
roll_align |
Character string identifying the direction of the rolling mean from the specified date, either by the first
( |
percentiles |
Numeric vector of percentiles to calculate. Set to |
water_year_start |
Numeric value indicating the month ( |
start_year |
Numeric value of the first year to consider for analysis. Leave blank or set well before start date (i.e.
|
end_year |
Numeric value of the last year to consider for analysis. Leave blank or set well after end date (i.e.
|
exclude_years |
Numeric vector of years to exclude from analysis. Leave blank or set to |
months |
Numeric vector of months to include in analysis. For example, |
transpose |
Logical value indicating whether to transpose rows and columns of results. Default FALSE. |
complete_years |
Logical value indicating whether to include only years with complete data in analysis. Default FALSE. |
ignore_missing |
Logical value indicating whether dates with missing values should be included in the calculation. If
|
allowed_missing |
Numeric value between 0 and 100 indicating the percentage of missing dates allowed to be
included to calculate a statistic (0 to 100 percent). If |
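The interaction between ignore_missing and allowed_missing can be illustrated with a small base R sketch. This is not fasstr's implementation: the helper mean_with_allowance() is hypothetical, and the choice to return NA only when the missing share is strictly greater than the allowance is an assumption made for illustration.

```r
# Hypothetical helper (not part of fasstr): return the mean of x, or NA when
# the percentage of missing values exceeds allowed_missing.
# Assumption: the statistic is dropped only when the share is strictly greater.
mean_with_allowance <- function(x, allowed_missing = 0) {
  pct_missing <- 100 * sum(is.na(x)) / length(x)
  if (pct_missing > allowed_missing) {
    return(NA_real_)
  }
  mean(x, na.rm = TRUE)
}

flows <- c(10, 12, NA, 14)                 # made-up values, 25% missing

strict  <- mean_with_allowance(flows)      # no missing values allowed
lenient <- mean_with_allowance(flows, 25)  # up to 25% missing allowed
ignore  <- mean_with_allowance(flows, 100) # mirrors ignore_missing = TRUE
```

Here strict is NA, while lenient and ignore are both 12, which matches the Usage default of allowed_missing = ifelse(ignore_missing, 100, 0).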
Value
A tibble data frame with the following columns:
Year |
calendar or water year selected |
Mean |
annual mean of all daily flows for a given year |
Median |
annual median of all daily flows for a given year |
Maximum |
annual maximum of all daily flows for a given year |
Minimum |
annual minimum of all daily flows for a given year |
P'n' |
each annual n-th percentile selected of all daily flows |
Default percentile columns:
P10 |
annual 10th percentile of all daily flows for a given year |
P90 |
annual 90th percentile of all daily flows for a given year |
Transposing data creates a column of "Statistics" and subsequent columns for each year selected.
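To make the shape of the returned table concrete, the following base R sketch computes the same named columns for a synthetic record. It is an illustration only, not how fasstr computes its results: the flow values are made up, and the use of stats::quantile() with its default settings for the percentiles is an assumption.

```r
# Synthetic daily flows for two calendar years (made-up values, not HYDAT data)
dates <- seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
flow <- data.frame(
  Date = dates,
  Value = 50 + 30 * sin(2 * pi * as.numeric(format(dates, "%j")) / 365)
)
flow$Year <- as.integer(format(flow$Date, "%Y"))

# One row per year, with columns named as in the Value section
annual <- do.call(rbind, lapply(split(flow, flow$Year), function(d) {
  data.frame(
    Year = d$Year[1],
    Mean = mean(d$Value),
    Median = median(d$Value),
    Maximum = max(d$Value),
    Minimum = min(d$Value),
    P10 = unname(quantile(d$Value, 0.10)),  # assumed default quantile type
    P90 = unname(quantile(d$Value, 0.90))
  )
}))
annual
```

Each row of annual corresponds to one year, and the P10 and P90 columns stand in for the default percentiles = c(10, 90).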
Examples
# Run if HYDAT database has been downloaded (using tidyhydat::download_hydat())
if (file.exists(tidyhydat::hy_downloaded_db())) {
  # Calculate annual statistics from a data frame using the data argument
  flow_data <- tidyhydat::hy_daily_flows(station_number = "08NM116")
  calc_annual_stats(data = flow_data)
  # Calculate annual statistics using the station_number argument
  calc_annual_stats(station_number = "08NM116")
  # Calculate annual statistics regardless of whether there
  # is missing data for a given year
  calc_annual_stats(station_number = "08NM116",
                    ignore_missing = TRUE)
  # Calculate annual statistics for water years starting in October
  calc_annual_stats(station_number = "08NM116",
                    water_year_start = 10)
  # Calculate annual statistics for 7-day flows for the months of
  # July-September only, with 25th and 75th percentiles
  calc_annual_stats(station_number = "08NM116",
                    roll_days = 7,
                    months = 7:9,
                    percentiles = c(25, 75))
}
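The examples above depend on a downloaded HYDAT database. As a hedged alternative, the sketch below passes a synthetic data frame through the data, dates, values, and groups arguments shown in Usage. The column names obs_date, discharge, and site and all flow values are hypothetical, and the call is skipped when fasstr is not installed.

```r
# Synthetic two-year daily record (made-up values, not from HYDAT).
# Column names site, obs_date, and discharge are hypothetical.
obs_dates <- seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
synthetic_flows <- data.frame(
  site = "SITE_A",
  obs_date = obs_dates,
  discharge = 50 + 30 * sin(2 * pi * as.numeric(format(obs_dates, "%j")) / 365)
)

# Only call fasstr when it is installed; no HYDAT database is needed here
if (requireNamespace("fasstr", quietly = TRUE)) {
  fasstr::calc_annual_stats(
    data = synthetic_flows,
    dates = obs_date,
    values = discharge,
    groups = site,
    percentiles = c(25, 75)
  )
}
```

Naming the columns explicitly is only needed because this data frame does not use the default Date, Value, and STATION_NUMBER column names.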