get_lag {hydroroute}    R Documentation

Get Lag

Description

Given a data frame (time series) of measurements and a vector of gauging station IDs ordered in downstream direction, the lag (the amount of time that passes between two gauging stations) is estimated from the cross-correlation function (ccf) of the time series of two adjacent gauging stations (stats::ccf()). To ensure that the same time period is used for every gauging station, the time steps common to all stations are determined, and only these intersecting time steps are used to estimate the lags. The result of stats::ccf() is rounded to four decimals before the optimal time lag is selected, so that minimal differences are neglected. If multiple time steps share the highest correlation, the smallest time step is used. If the highest correlation corresponds to a zero or positive lag, a time step of length 1 is selected and a warning message is generated. Note that the result should usually be negative, as measurements at the lower gauge are recorded later than the corresponding measurements at the upper gauge.
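The selection rule described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the synthetic series, the variable names, the reading of "smallest time step" as smallest absolute lag, and the reading of "a time step of length 1" as a lag of one step are all assumptions made for illustration.

```r
# Synthetic data (assumption): the downstream series is the upstream
# series delayed by 3 time steps.
set.seed(1)
n <- 200
shift <- 3
full <- as.numeric(stats::arima.sim(list(ar = 0.7), n = n + shift))
upstream <- full[(shift + 1):(n + shift)]
downstream <- full[1:n]

# Cross-correlation between the two adjacent "stations".
cc <- stats::ccf(upstream, downstream, lag.max = 20,
                 na.action = na.pass, plot = FALSE)

# Round to four decimals so that minimal differences are neglected.
acf_vals <- round(as.numeric(cc$acf), 4)
lags <- as.numeric(cc$lag)

# Among ties, keep the lag closest to zero (assumed interpretation).
candidates <- lags[acf_vals == max(acf_vals)]
best_lag <- candidates[which.min(abs(candidates))]

# A zero or positive lag is implausible for a downstream station.
if (best_lag >= 0) {
  warning("Non-negative lag found; falling back to one time step.")
  best_lag <- -1
}

# Convert steps to HH:MM using a step length of 15 minutes.
steplength <- 15
lag_minutes <- as.integer(abs(best_lag) * steplength)
lag_hhmm <- sprintf("%02d:%02d", lag_minutes %/% 60L, lag_minutes %% 60L)
```

With stats::ccf(x, y), the value at lag k estimates the correlation between x[t + k] and y[t], which is why a delayed downstream series produces a negative optimal lag in this sketch.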

Usage

get_lag(
  Q,
  relation,
  steplength = 15,
  lag.max = 20,
  na.action = na.pass,
  mc.cores = getOption("mc.cores", 2L),
  tz = "Etc/GMT-1",
  format = "%Y.%m.%d %H:%M",
  cols = c(1, 2, 3)
)

Arguments

Q

Data frame (time series) of measurements which contains at least a column with the gauging station IDs (default: column index 1), a column with date-time values in character representation (default: column index 2) and a column with flow measurements (default: column index 3). If the column indices differ from c(1, 2, 3), they have to be specified in the cols argument in the format c(i, j, k).

relation

A character vector containing the gauging station IDs in order of their location in downstream direction.

steplength

Numeric value that specifies the length between time steps in minutes (default: 15 minutes). As time steps have to be equispaced, this is used by hydropeak::flow() to get a compatible format and fill missing time steps with NA.

lag.max

Numeric value that specifies the maximum lag at which to calculate the ccf in stats::ccf() (default: 20).

na.action

Function to be called to handle missing values in stats::ccf() (default: na.pass).

mc.cores

Number of cores to use with parallel::mclapply(). On Windows, this is set to 1.

tz

Character string specifying the time zone to be used for internal conversion (default: Etc/GMT-1).

format

Character string giving the date-time format of the date-time column in the input data frame Q (default: "%Y.%m.%d %H:%M", i.e. YYYY.mm.dd HH:MM). This is passed to hydropeak::flow() to get a compatible format.

cols

Integer vector specifying column indices in Q. The default indices are 1 (ID), 2 (date-time) and 3 (flow rate, Q). This is passed to hydropeak::flow().
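To illustrate how steplength, tz and format interact, the following base R sketch parses character date-times and fills a missing time step with NA on an equispaced grid. It only mimics the idea described for hydropeak::flow() and is not that function's implementation; the data frame raw and its values are hypothetical.

```r
# Hypothetical measurements for one station; the 00:30 reading is missing.
raw <- data.frame(
  ID = "100000",
  Time = c("2021.01.01 00:00", "2021.01.01 00:15", "2021.01.01 00:45"),
  Q = c(10.2, 10.8, 11.5),
  stringsAsFactors = FALSE
)

steplength <- 15            # minutes between time steps
tz <- "Etc/GMT-1"           # time zone used for conversion
fmt <- "%Y.%m.%d %H:%M"     # format of the character date-time column

# Parse the character date-times.
raw$Time <- as.POSIXct(raw$Time, format = fmt, tz = tz)

# Equispaced grid from the first to the last observation
# (a numeric 'by' is interpreted as seconds).
grid <- data.frame(
  Time = seq(min(raw$Time), max(raw$Time), by = steplength * 60)
)

# Left join onto the grid: missing time steps get NA in Q.
filled <- merge(grid, raw, by = "Time", all.x = TRUE)
filled$ID <- raw$ID[1]
```

In this sketch, filled has four rows and the flow value at 00:30 is NA, which is the kind of gap that na.action = na.pass lets stats::ccf() tolerate.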

Value

A character vector which contains the estimated cumulative lag between neighboring gauging stations in the format HH:MM.
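As a rough illustration of the return format, the sketch below turns per-pair lags (in time steps) into cumulative lags formatted as HH:MM. The input values and the helper to_hhmm() are hypothetical, and the exact way the package accumulates lags is an assumption here.

```r
# Hypothetical lags between adjacent stations, in time steps.
step_lags <- c(-2, -3, -5)
steplength <- 15  # minutes per time step

# Lag per station pair in minutes, then accumulated downstream.
lag_minutes <- abs(step_lags) * steplength
cum_minutes <- cumsum(lag_minutes)

# Hypothetical helper: format minutes as HH:MM.
to_hhmm <- function(m) {
  sprintf("%02d:%02d", as.integer(m %/% 60), as.integer(m %% 60))
}

cum_hhmm <- to_hhmm(cum_minutes)
```

Here cum_hhmm is c("00:30", "01:15", "02:30"), i.e. each entry is the travel time from the uppermost station to the respective downstream station under the stated assumptions.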

Examples

Q_path <- system.file("testdata", "Q.csv", package = "hydroroute")
Q <- utils::read.csv(Q_path)

relation_path <- system.file("testdata", "relation.csv",
                            package = "hydroroute")
relation <- utils::read.csv(relation_path)
# from relation data frame
get_lag(Q, relation$ID, format = "%Y-%m-%d %H:%M", tz = "Etc/GMT-1")

# station IDs in downstream direction as a character vector
relation <- c("100000", "200000", "300000", "400000")
get_lag(Q, relation, format = "%Y-%m-%d %H:%M", tz = "Etc/GMT-1")

[Package hydroroute version 0.1.2 Index]