rwd_agg {RWDataPlyr} | R Documentation |
Class to specify how to aggregate RiverWare data
Description
rwd_agg()
creates a RiverWare data aggregator (rwd_agg
) object, which
lets users specify how specific RiverWare slots should be aggregated.
Usage
rwd_agg(x = NULL, rdfs = NULL)
Arguments
x |
A data.frame with required column names and valid entries; see Details and Aggregation Specification sections. |
rdfs |
A vector of rdf names; see Details and Aggregation Specification sections. |
Details
rwd_agg
objects can be created in three ways:
By providing a data.frame, with the following expected columns
file
,slot
,period
,summary
,eval
,t_s
, andvariable
. Each row in the data.frame should include all of the information for how each individual slot will be aggregated. See the Aggregation Specification section below for details on how to specify each column.By providing a vector of rdf files. If specified in this manor, all of the slots in each rdf file will be read in to a
rwtbl
, but will not be aggregated/summarized.In this case, the
variable
names are automatically constructed from theObjectSlot
names. The variable names are constructed as the all lower case version of the object_slot name. If the slot name is "Pool Elevation", it is shortened to "pe", otherwise the full object and slot name are used. If there are any spaces, they are replaced with underscores.By reading in a csv file with
read_rwd_agg()
. This csv file must have the correct column names and meet other requirements described below. To ensure it has the correct column names,rwd_agg_template()
can be used to create a blank csv file for the user to fill in.
Aggregation Specification
In order to specify how each slot should be aggregated, each column should
include specific keywords, which are described below. It is up to the user
to specify which rdf file contains each slot. In a general case, the user
specifies the slot
that is found in a specific rdf file (file
). A
summary
function is applied to a subset period
of the slot
, and then
compared (eval
) to a threshold (t_s
) and saved as the variable
.
-
file: specifies the rdf file that contains the slot.
-
slot: the full RiverWare slot name, i.e., "Object.Slot".
-
period: the period that the slot should be summarized over. This should either be a function name, a full month name (found in month.name), or the keyword "asis".
-
function name: Specifying a function name allows for pre-specified or custom functions to group together several months in the period. This package provides the following functions:
cy()
,wy()
,eocy()
, andeowy()
.cy()
indicates the data will be summarized over the calendar year, i.e., January - December, whilewy()
summarizes over the water year, i.e., October - September.eocy()
selects the end of the calendar year, andeowy()
selects the end of the water year. When specified in theslot_agg
object, leave off the parenthesis, i.e., only specify the function name. Ifwy()
is specified, the function will remove data for any water years that have less than 7 months of data. This "tolerance" is specified by therwdataplyr.wy_month_tol
option, and can be modified by updating this option to another number. For standard monthly data that starts in January and ends in December, this results in keeping the first water year, since it includes 9 months of data, and removing the final water year, since it includes only three months of data. Setting this option to 0 will result in keeping any water year data that has at least one month of data; setting this option to 11, ensures that there must be a full water year of data for that year to be kept.This can also be a user specified custom function; see the Custom Period Functions section for details on constructing the custom functions.
-
full month name: When the full month name is specified, data will be filtered to only include data for that particular month. To select multiple months of data, use a function as described above. If the month specified is not found in month.name, an error will occur.
-
asis: If the keyword "asis" is specified, the data is returned for its native timestep, i.e, monthly data will return monthly data and annual data will return annual.
-
-
summary: the summary function that should be applied to the period specified as a function name, or
NA
. If theperiod
specified is "asis" or returns only one month, e.g.,eocy()
, then the summary should beNA
. The summary function should only return one value; for that reason, most of theSummary
S4groupGenerics work. Notably,range()
will not since it returns two values. There is no reason that a custom function will not work here, but it has not been tested. -
eval: the comparison operator to use (see the
Compare
S4groupGenerics). If no comparison is desired, thenNA
should be used. Ifeval
is specified the value returned from applying thesummary
to theperiod
will be compared to the threshold specified byt_s
. The results of the comparison are returned as 0 and 1 instead ofTRUE
andFALSE
. -
t_s: either the threshold to be compared to if
eval
is notNA
or a value to scale the result by, e.g,. 0.001 to convert from acre-ft to thousand acre-ft.NA
can also be specified to not scale the data. -
variable: the variable name that will be used to identify the results of applying the period, summary, comparison/scaling to. All variable names should be unique.
For example, to determine if the minimum water year elevation at Lake Powell is below elevation 3550 feet, the following would be specified:
data.frame( file = "KeySlots.rdf", slot = "Powell.Pool Elevation", period = "wy", summary = "min", eval = "<", t_s = 3550, variable = "powellLt3550", stringsAsFactors = FALSE )
Custom Period Functions
Users can specify custom period functions to make it easier to group months
together in custom ways. For example a function could return all of the
summer months, or the more complicated case groups months across different
calendar years together. In fact, wy()
is an example of a function that
does this; another example might be grouping December - February together
for winter months.
The custom period function should return a list with three elements:
-
fun
- a function that will modify a rwtbl and properly determine the newYear
s based on the custom period. -
filter_months
- the months that should be grouped together. -
group_tbl
- how to group the returned rwtbl; likely eitherc("Year")
orc("Year", "Month")
See the "RWDataPlyr Workflow" vignette for example implementations of both the summer and winter custom functions described above.
See Also
rwd_agg_template()
, read_rwd_agg()
Examples
# determine if Powell's minimum water year elevation is < 3550'
rwd_agg(
data.frame(
file = "KeySlots.rdf",
slot = "Powell.Pool Elevation",
period = "wy",
summary = "min",
eval = "<",
t_s = 3550,
variable = "powellLt3550",
stringsAsFactors = FALSE
)
)
# get all the monthly slots in KeySlots.rdf
rwd_agg(rdfs = "KeySlots.rdf")