R: Fit a Tidal Model to a Timeseries

tidem {oce}

R Documentation

Fit a Tidal Model to a Timeseries

Description

The fit is done in terms of sine and cosine components at the indicated tidal frequencies (after possibly pruning to satisfy the Rayleigh criterion, as explained in phase 2 of the procedure outlined in '‘Details’), with the amplitude and phase being calculated from the resultant coefficients on the sine and cosine terms. The scheme was devised for hourly data; for other sampling schemes, see “Application to non-hourly data”.

Usage

tidem(
  t,
  x,
  constituents,
  infer = NULL,
  latitude = NULL,
  rc = 1,
  regress = lm,
  debug = getOption("oceDebug")
)

Arguments

`t`	either a vector of times or a sealevel object (as created with `read.sealevel()` or `as.sealevel()`). In the first case, `x` must be provided. In the second case, though, any `x` that is provided will be ignored, because sealevel objects contain both `time` and water `elevation`, and the latter is used for `x`.
`x`	an optional numerical vector holding something that varies with time. This is ignored if `t` is a sealevel object, because it is inferred automatically, using `t[["elevation"]]`.
`constituents`	an optional character vector holding the names of tidal constituents to which the fit is done; see “Details” and “Constituent Naming Convention”.
`infer`	a list of constituents to be inferred from fitted constituents according to the method outlined in Section 2.3.4 of Foreman (1978). If `infer` is `NULL`, the default, then no such inferences are made. Otherwise, some constituents are computed based on other constituents, instead of being determined by regression at the proper frequency. If provided, `infer` must be a list containing four elements: `name`, a vector of strings naming the constituents to be inferred; `from`, a vector of strings naming the fitted constituents used as the sources for those inferences (these source constituents are added to the regression list, if they are not already there); `amp`, a numerical vector of factors to be applied to the source amplitudes; and `phase`, a numerical vector of angles, in degrees, to be subtracted from the source phases. For example, Following Foreman (1998), if any of the `name` items have already been computed, then the suggested inference is ignored, and the already-computed values are used. infer=list(name=c("P1","K2"), from=c("K1", "S2"), amp=c(0.33093, 0.27215), phase=c(-7.07, -22.4) means that the amplitude of `P1` will be set as 0.33093 times the calculated amplitude of `K1`, and that the `P1` phase will be set to the `K1` phase, minus an offset of `-7.07` degrees. (This example is used in the Foreman (1978) discussion of a Fortran analysis code and also in Pawlowicz et al. (2002) discussion of the T_TIDE Matlab code. Rounded to the 0.1mm resolution of values reported in Foreman (1978) and Pawlowicz et al. (2002), the `tidem` results have root-mean-square amplitude difference to Foreman's (1978) Appendix 7.3 of 0.06mm; by comparison, the results in Table 1 of Pawlowicz et al. (2002) agree with Foreman's results to RMS difference 0.04mm.)
`latitude`	if provided, the latitude of the observations. If not provided, `tidem` will try to infer this from the first argument, if it is a sealevel object.
`rc`	the value of the coefficient in the Rayleigh criterion.
`regress`	function to be used for regression, by default `lm()`, but could be for example `rlm` from the `MASS` package.
`debug`	an integer specifying whether debugging information is to be printed during the processing. This is a general parameter that is used by many `oce` functions. Generally, setting `debug=0` turns off the printing, while higher values suggest that more information be printed. If one function calls another, it usually reduces the value of `debug` first, so that a user can often obtain deeper debugging by specifying higher `debug` values.

Details

A summary of constituents used by tidem() may be found with:

data(tidedata)
print(tidedata$const)

A multi-stage procedure is used to establish the list of tidal constituents to be used in the fit.

Phase 1: initial selection.

The initial list tidal constituents (prior to the pruning phase described below) to be used in the analysis are specified as follows; see “Constituent Naming Convention”.

If constituents is not provided, then the constituent list will be made up of the 69 constituents designated by Foreman as "standard". These include astronomical frequencies and some shallow-water frequencies, and are as follows: c("Z0", "SA", "SSA", "MSM", "MM", "MSF", "MF", "ALP1", "2Q1", "SIG1", "Q1", "RHO1", "O1", "TAU1", "BET1", "NO1", "CHI1", "PI1", "P1", "S1", "K1", "PSI1", "PHI1", "THE1", "J1", "SO1", "OO1", "UPS1", "OQ2", "EPS2", "2N2", "MU2", "N2", "NU2", "GAM2", "H1", "M2", "H2", "MKS2", "LDA2", "L2", "T2", "S2", "R2", "K2", "MSN2", "ETA2", "MO3", "M3", "SO3", "MK3", "SK3", "MN4", "M4", "SN4", "MS4", "MK4", "S4", "SK4", "2MK5", "2SK5", "2MN6", "M6", "2MS6", "2MK6", "2SM6", "MSK6", "3MK7", "M8").
If the first item in constituents is the string "standard", then a provisional list is set up as in Case 1, and then the (optional) rest of the elements of constituents are examined, in order. Each of these constituents is based on the name of a tidal constituent in the Foreman (1978) notation. (To get the list, execute data(tidedata) and then execute cat(tideData$name).) Each named constituent is added to the existing list, if it is not already there. But, if the constituent is preceded by a minus sign, then it is removed from the list (if it is already there). Thus, for example, a user might specify constituents=c("standard", "-M2", "ST32") to remove the M2 constituent and add the ST32 constituent.
If the first item is not "standard", then the list of constituents is processed as in Case 2, but without starting with the standard list. As an example, constituents=c("K1", "M2") would fit for just the K1 and M2 components. (It would be strange to use a minus sign to remove items from the list, but the function allows that.)

In each of the above cases, the list is reordered in frequency prior to the analysis, so that the results of summary,tidem-method() will be in a familiar form.

Phase 2: pruning based on Rayleigh's criterion.

Once the initial constituent list is determined during Phase 1, tidem applies the Rayleigh criterion, which holds that constituents of frequencies f_1 and f_2 cannot be resolved unless the time series spans a time interval of at least rc/(f_1-f_2). The details of the decision of which constituent to prune is fairly complicated, involving decisions based on a comparison table. The details of this process are provided by Foreman (1978).

Phase 3: optionally overruling Rayleigh's criterion.

The pruning list from phase 2 can be overruled by the user. This is done by retaining any constituents that the user has named in the constituents parameter. This action was added on 2017-12-27, to make tidem behave the same way as the Foreman (1978) code, as illustrated in his Appendices 7.2 and 7.3. (As an aside, his Appendix 7.3 has some errors, e.g. the frequency for the 2SK5 constituent is listed there (p58) as 0.20844743, but it is listed as 0.2084474129 in his Appendix 7.1 (p41). For this reason, the frequency comparison is relaxed to a tol value of 1e-7 in a portion of the oce test suite (see tests/testthat/test_tidem.R in the source).

Comments on phases 2 and 3

A specific example may be of help in understanding the removal of unresolvable constituents. For example, the data(sealevel) dataset is of length 6718 hours, and this is too short to resolve the full list of constituents, with the conventional (and, really, necessary) limit of rc=1. From Table 1 of Foreman (1978), this timeseries is too short to resolve the SA constituent, so that SA will not be in the resultant. Similarly, Table 2 of Foreman (1978) dictates the removal of PI1, S1 and PSI1 from the list. And, finally, Table 3 of Foreman (1978) dictates the removal of H1, H2, T2 and R2, and since that document suggests that GAM2 be subsumed into H1, then if H1 is already being deleted, then GAM2 will also be deleted.

Value

An object of tidem, consisting of

`const`	constituent number, e.g. 1 for `Z0`, 1 for `SA`, etc.
`model`	the regression model
`name`	a vector of constituent names, in non-subscript format, e.g. "`M2`".
`frequency`	a vector of constituent frequencies, in inverse hours.
`amplitude`	a vector of fitted constituent amplitudes, in metres.
`phase`	a vector of fitted constituent phase. NOTE: The definition of phase is likely to change as this function evolves. For now, it is phase with respect to the first data sample.
`p`	a vector containing a sort of p value for each constituent. This is calculated as the average of the p values for the sine() and cosine() portions used in fitting; whether it makes any sense is an open question.

Application to non-hourly data

The framework on which tidem() rests on the assumption of data that have been sampled on a 1-hour interval (see e.g. Foreman, 1977). Since regression (as opposed to spectral analysis) is used to infer the amplitude and phase of tidal constituents, data gaps do not pose a serious problem. Sampling intervals under an hour are also not a problem. However, trying to use tidem() on time series that are sampled at uniform intervals that exceed 1 hour can lead to results that are difficult to interpret. For example, some drifter data are sampled at a 6-hour interval. This makes it impossible to fit for the S4 component (which has exactly 6 cycles per day), because the method works by constructing sine and cosine series at tidal frequencies and using these as the basis for regression. Each of these series will have a constant value through the constructed time, and regression cannot handle that (in addition to a constant-value constructed series that is used to fit for the Z0 constituent). tidem() tries to handle such problems by examining the range of the constructed sine and cosine time-series, omitting any constituents that yield near-constant values in either of these. Messages are issued if this problem is encountered. This prevents failure of the regression, and the predictions of the regression seem to represent the data reasonably well, but the inferred constituent amplitudes are not physically reasonable. Cautious use of tidem() to infer individual constituents might be warranted, but users must be aware that the results will be difficult to interpret. The tool is simply not designed for this use.

Bugs

This function is not fully developed yet, and both the form of the call and the results of the calculation may change.
The reported p value may make no sense at all, and it might be removed in a future version of this function. Perhaps a significance level should be presented, as in the software developed by both Foreman and Pawlowicz.

Constituent Naming Convention

tidem uses constituent names that follow the convention set by Foreman (1978). This convention is slightly different from that used in the T-TIDE package of Pawlowicz et al. (2002), with Foreman's UPS1 and M8 becoming UPSI and MS in T-TIDE. To permit the use of either notation, tidem() uses tidemConstituentNameFix() to convert from T-TIDE names to the Foreman names, issuing warnings when doing so.

Agreement with `T_TIDE` results

The tidem amplitude and phase results, obtained with

tidem(sealevelTuktoyaktuk, constituents=c("standard", "M10"),
    infer=list(name=c("P1", "K2"),
        from=c("K1", "S2"),
        amp=c(0.33093, 0.27215),
        phase=c(-7.07, -22.40)))

match the T_TIDE values listed in Table 1 of Pawlowicz et al. (2002), after rounding amplitude and phase to 4 and 2 digits past the decimal place, respectively, to match the format of that table.

Author(s)

Dan Kelley

References

Foreman, M G., 1977 (revised 1996). Manual for Tidal Heights Analysis and Prediction. Pacific Marine Science Report 77-10. British Columbia, Canada: Institute of Ocean Sciences, Patricia Bay.

Foreman, M. G. G., 1978. Manual for Tidal Currents Analysis and Prediction. Pacific Marine Science Report 78-6. British Columbia, Canada: Institute of Ocean Sciences, Patricia Bay,

Foreman, M. G. G., Neufeld, E. T., 1991. Harmonic tidal analyses of long time series. International Hydrographic Review, 68 (1), 95-108.

Leffler, K. E. and D. A. Jay, 2009. Enhancing tidal harmonic analysis: Robust (hybrid) solutions. Continental Shelf Research, 29(1):78-88.

Pawlowicz, Rich, Bob Beardsley, and Steve Lentz, 2002. Classical tidal harmonic analysis including error estimates in MATLAB using T_TIDE. Computers and Geosciences, 28, 929-937.

Examples

library(oce)
# The demonstration time series from Foreman (1978),
# also used in T_TIDE (Pawlowicz, 2002).
data(sealevelTuktoyaktuk)
tide <- tidem(sealevelTuktoyaktuk)
summary(tide)

# AIC analysis
extractAIC(tide[["model"]])

# Fake data at M2
library(oce)
data("tidedata")
M2 <- with(tidedata$const, freq[name == "M2"])
t <- seq(0, 10 * 86400, 3600)
eta <- sin(M2 * t * 2 * pi / 3600)
sl <- as.sealevel(eta)
m <- tidem(sl)
summary(m)

[Package oce version 1.8-2 Index]