anlzMWRoutlier {MassWateR}R Documentation

Analyze outliers in results file

Description

Analyze outliers in results file

Usage

anlzMWRoutlier(
  res = NULL,
  param,
  acc = NULL,
  fset = NULL,
  type = c("box", "jitterbox", "jitter"),
  group,
  dtrng = NULL,
  repel = TRUE,
  outliers = FALSE,
  labsize = 3,
  fill = "lightgrey",
  alpha = 0.8,
  width = 0.8,
  yscl = "auto",
  ttlsize = 1.2,
  runchk = TRUE,
  warn = TRUE
)

Arguments

res

character string of path to the results file or data.frame for results returned by readMWRresults

param

character string of the parameter to plot, must conform to entries in the "Simple Parameter" column of paramsMWR

acc

character string of path to the data quality objectives file for accuracy or data.frame returned by readMWRacc

fset

optional list of inputs with elements named res, acc, frecom, sit, or wqx overrides the other arguments

type

character indicating "box", "jitterbox", or "jitter", see details

group

character indicating whether the summaries are grouped by month, site, or week of year

dtrng

character string of length two for the date ranges as YYYY-MM-DD, optional

repel

logical indicating if overlapping outlier labels are offset

outliers

logical indicating if outliers are returned to the console instead of plotting

labsize

numeric indicating font size for the outlier labels

fill

numeric indicating fill color for boxplots

alpha

numeric from 0 to 1 indicating transparency of fill color

width

numeric for width of boxplots

yscl

character indicating one of "auto" (default), "log", or "linear", see details

ttlsize

numeric value indicating font size of the title relative to other text in the plot

runchk

logical to run data checks with checkMWRresults or checkMWRacc, applies only if res or acc are file paths

warn

logical to return warnings to the console (default)

Details

Outliers are defined following the standard ggplot definition as 1.5 times the inter-quartile range of each boxplot. The data frame returned if outliers = TRUE may vary based on the boxplot groupings defined by group.

Specifying type = "box" (default) will produce standard boxplots. Specifying type = "jitterbox" will produce boxplots with non-outlier observations jittered on top. Specifying type = "jitter" will suppress the boxplots and show only the jittered points and the outliers.

Specifying group = "week" will group the samples by week of year using an integer specifying the week. Note that there can be no common month/day indicating the start of the week between years and an integer is the only way to compare summaries if the results data span multiple years.

The y-axis scaling as arithmetic (linear) or logarithmic can be set with the yscl argument. If yscl = "auto" (default), the scaling is determined automatically from the data quality objective file for accuracy, i.e., parameters with "log" in any of the columns are plotted on log10-scale, otherwise arithmetic. Setting yscl = "linear" or yscl = "log" will set the axis as linear or log10-scale, respectively, regardless of the information in the data quality objective file for accuracy.

Any entries in resdat in the "Result Value" column as "BDL" or "AQL" are replaced with appropriate values in the "Quantitation Limit" column, if present, otherwise the "MDL" or "UQL" columns from the data quality objectives file for accuracy are used. Values as "BDL" use one half of the appropriate limit.

Value

A ggplot object that can be further modified if outliers = FALSE, otherwise a data frame of outliers is returned.

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# outliers by month
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month')

# outliers by site
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'site')

# outliers by site, May through July 2021 only
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'site', 
     dtrng = c('2022-05-01', '2022-07-31'))

# outliers by month, type as jitterbox
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', type = 'jitterbox')

# outliers by month, type as jitter
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', type = 'jitter')

# data frame output
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', outliers = TRUE)


[Package MassWateR version 2.1.4 Index]