| calc_monthly_stats {fasstr} | R Documentation | 
Calculate monthly summary statistics
Description
Calculates means, medians, maximums, minimums, and percentiles of daily flow values for each month of each year in a daily streamflow data set. Calculates statistics from all values, unless otherwise specified. Returns a tibble with statistics.
Usage
calc_monthly_stats(
  data,
  dates = Date,
  values = Value,
  groups = STATION_NUMBER,
  station_number,
  percentiles = c(10, 90),
  roll_days = 1,
  roll_align = "right",
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  months = 1:12,
  transpose = FALSE,
  spread = FALSE,
  complete_years = FALSE,
  ignore_missing = FALSE,
  allowed_missing = ifelse(ignore_missing, 100, 0)
)
Arguments
| data | Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers).
Leave blank or set to  | 
| dates | Name of column in  | 
| values | Name of column in  | 
| groups | Name of column in  | 
| station_number | Character string vector of seven digit Water Survey of Canada station numbers (e.g.  | 
| percentiles | Numeric vector of percentiles to calculate. Set to  | 
| roll_days | Numeric value of the number of days to apply a rolling mean. Default 1. |
| roll_align | Character string identifying the direction of the rolling mean from the specified date, either by the first 
( | 
| water_year_start | Numeric value indicating the month ( | 
| start_year | Numeric value of the first year to consider for analysis. Leave blank or set well before start date (i.e.
 | 
| end_year | Numeric value of the last year to consider for analysis. Leave blank or set well after end date (i.e.
 | 
| exclude_years | Numeric vector of years to exclude from analysis. Leave blank or set to  | 
| months | Numeric vector of months to include in analysis. For example,  | 
| transpose | Logical value indicating whether each month statistic should be an individual row. Default FALSE. |
| spread | Logical value indicating whether each month statistic should be a column name. Default FALSE. |
| complete_years | Logical value indicating whether to include only years with complete data in analysis. Default FALSE. |
| ignore_missing | Logical value indicating whether dates with missing values should be included in the calculation. If
 | 
| allowed_missing | Numeric value between 0 and 100 indicating the percentage of missing dates allowed to be
included to calculate a statistic (0 to 100 percent). If  | 
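The Usage section sets the default of allowed_missing to ifelse(ignore_missing, 100, 0), so the two arguments are linked: zero percent missing dates allowed when ignore_missing is FALSE, and up to 100 percent when it is TRUE. The base R sketch below illustrates one way such a threshold could behave. The helper mean_if_allowed() is hypothetical and not part of fasstr, and the "greater than" comparison is an assumption about how the threshold is applied.

```r
# Hypothetical helper (not a fasstr function): return the mean of x only if
# the percentage of missing values does not exceed allowed_missing.
mean_if_allowed <- function(x, allowed_missing = 0) {
  pct_missing <- 100 * sum(is.na(x)) / length(x)
  # Assumption: the statistic is withheld when pct_missing exceeds the threshold
  if (pct_missing > allowed_missing) {
    return(NA_real_)
  }
  mean(x, na.rm = TRUE)
}

# Made-up month of 30 daily flows with 3 missing days (10 percent missing)
june <- c(rep(5, 27), NA, NA, NA)

strict   <- mean_if_allowed(june, allowed_missing = 0)    # no missing dates allowed
tolerant <- mean_if_allowed(june, allowed_missing = 10)   # up to 10 percent allowed

# Mirrors the default in Usage: ignore_missing = TRUE gives a threshold of 100
ignore_missing <- TRUE
lenient <- mean_if_allowed(june, allowed_missing = ifelse(ignore_missing, 100, 0))
```

Under these assumptions, strict is NA while tolerant and lenient are both computed from the 27 available days.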
Value
A tibble data frame with the following columns:
| Year | calendar or water year selected | 
| Month | month of the year | 
| Mean | mean of all daily flows for a given month and year | 
| Median | median of all daily flows for a given month and year | 
| Maximum | maximum of all daily flows for a given month and year | 
| Minimum | minimum of all daily flows for a given month and year | 
| P'n' | each n-th percentile selected for a given month and year | 
Default percentile columns:
| P10 | 10th percentile of all daily flows for a given month and year | 
| P90 | 90th percentile of all daily flows for a given month and year | 
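To make the columns above concrete, the following base R sketch computes the same kinds of statistics for each year and month of a small synthetic data set. It is only an illustration of the output shape and is not fasstr's implementation: the flow values are made up, summarise_month() is a hypothetical helper, and the percentile method (the default of stats::quantile) is an assumption.

```r
# Made-up daily flows for January and February 2001: values 1, 2, ..., 59
dates <- seq(as.Date("2001-01-01"), as.Date("2001-02-28"), by = "day")
flows <- data.frame(Date = dates, Value = seq_along(dates))
flows$Year  <- as.integer(format(flows$Date, "%Y"))
flows$Month <- as.integer(format(flows$Date, "%m"))

# Hypothetical helper: summary statistics for one month of daily values
summarise_month <- function(x, percentiles = c(10, 90)) {
  pct <- stats::quantile(x, probs = percentiles / 100, names = FALSE)
  names(pct) <- paste0("P", percentiles)
  c(Mean = mean(x), Median = stats::median(x),
    Maximum = max(x), Minimum = min(x), pct)
}

# One row of statistics per Year.Month group
groups <- split(flows$Value, list(flows$Year, flows$Month))
monthly <- do.call(rbind, lapply(groups, summarise_month))
print(monthly)
```

The result is a matrix with one row per year and month (row names "2001.1" and "2001.2") and the columns Mean, Median, Maximum, Minimum, P10, and P90.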
Transposing data creates a column of 'Statistics' with one row for each month statistic, labeled as 'Month-Statistic' (e.g. 'Jan-Mean'), followed by a column for each year selected. Spreading data creates a Year column followed by a column for each Month-Statistic (e.g. 'Jan-Mean').
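The difference between the default, spread, and transposed layouts can be sketched in base R with synthetic data. This is a rough illustration of the shapes described above, not the function's actual output: the flow values are made up, only the mean is shown, and the 'Jan-Mean' style labels simply follow the wording of this page.

```r
# Made-up daily flows for two full years
dates <- seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
flows <- data.frame(
  Date  = dates,
  Value = 10 + 5 * sin(2 * pi * as.numeric(format(dates, "%j")) / 365)
)
flows$Year  <- as.integer(format(flows$Date, "%Y"))
flows$Month <- factor(month.abb[as.integer(format(flows$Date, "%m"))],
                      levels = month.abb)

# Default-like layout: one row per Year and Month
monthly_long <- aggregate(Value ~ Year + Month, data = flows, FUN = mean)
names(monthly_long)[names(monthly_long) == "Value"] <- "Mean"

# Spread-like layout: one row per Year, one column per Month-Statistic
spread_like <- tapply(monthly_long$Mean,
                      list(monthly_long$Year, monthly_long$Month),
                      mean)
colnames(spread_like) <- paste0(colnames(spread_like), "-Mean")

# Transpose-like layout: one row per Month-Statistic, one column per Year
transposed_like <- t(spread_like)
```

Here monthly_long has 24 rows (2 years by 12 months), spread_like is a 2 by 12 matrix, and transposed_like is its 12 by 2 transpose.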
Examples
# Run if HYDAT database has been downloaded (using tidyhydat::download_hydat())
if (file.exists(tidyhydat::hy_downloaded_db())) {
# Calculate statistics using a data frame and data argument with defaults
flow_data <- tidyhydat::hy_daily_flows(station_number = "08NM116")
calc_monthly_stats(data = flow_data,
                   start_year = 1980)
# Calculate statistics using station_number argument with defaults
calc_monthly_stats(station_number = "08NM116",
                   start_year = 1980)
# Calculate statistics regardless of whether there is missing data for a given year
calc_monthly_stats(station_number = "08NM116",
                   ignore_missing = TRUE)
                  
# Calculate statistics for water years starting in October
calc_monthly_stats(station_number = "08NM116",
                   start_year = 1980,
                   water_year_start = 10)
                  
# Calculate statistics with custom years and percentiles
calc_monthly_stats(station_number = "08NM116",
                   start_year = 1981,
                   end_year = 2010,
                   exclude_years = c(1991,1993:1995),
                   percentiles = c(25,75))
                   
}