impute_data {realTimeloads}R Documentation

Returns x with gaps imputed using ARIMA and Decision Trees, optional uncertainty estimation using Monte Carlo resampling

Description

Returns x with gaps imputed using ARIMA and Decision Trees with option to use harmonic model as predictors for x in decision tree algorithm. Uncertainty on imputed data is estimated using using Monte Carlo (MC) resampling adapting methods of Rustomji and Wilkinson (2008)

Usage

impute_data(
  time,
  x,
  Xreg = NULL,
  ti = NULL,
  hfit = NULL,
  harmonic = FALSE,
  only_use_Xreg = FALSE,
  MC = 1,
  ptrain = 1
)

Arguments

time

time for x (time, POSIXct)

x

any quantity (double)

Xreg

additional predictors for decision tree, required if harmonic is FALSE (rows = time, or if given, ti)

ti

time vector for interpolation (time, POSIXct)

hfit

model object from TideHarmonics::ftide

harmonic

logical if x exhibits tidal or diurnal variability

only_use_Xreg

logical for using Xreg only in decision tree

MC

number of Monte Carlo simulations for uncertainty estimation

ptrain

proportion of data used for training and testing model

Value

list with x imputed at time or ti, if given. Uncertainty estimated from Monte Carlo simulations

Note

If MC == 1, uncertainty is not evaluated. If ptrain == 1, uncertainty and validation accuracy are not computed

Author(s)

Daniel Livsey (2023) ORCID: 0000-0002-2028-6128

References

Rustomji, P., & Wilkinson, S. N. (2008). Applying bootstrap resampling to quantify uncertainty in fluvial suspended sediment loads estimated using rating curves. Water resources research, 44(9).

van Buuren S, Groothuis-Oudshoorn K (2011). “mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software, 45(3), 1-67. doi:10.18637/jss.v045.i03.

Stephenson AG (2016). Harmonic Analysis of Tides Using TideHarmonics. https://CRAN.R-project.org/package=TideHarmonics.

Moritz S, Bartz-Beielstein T (2017). “imputeTS: Time Series Missing Value Imputation in R.” The R Journal, 9(1), 207–218. doi:10.32614/RJ-2017-009.

Examples

# Impute non-tidal data
time <- realTimeloads::ExampleData$Sediment_Samples$time
xo <- realTimeloads::ExampleData$Sediment_Samples$SSCxs_mg_per_liter
Q <- realTimeloads::ExampleData$Discharge$Discharge_m_cubed_per_s
idata <- sample(1:length(xo),round(length(xo)*0.5),replace=FALSE)
x <- rep(NA,length(xo))
x[idata] <- xo[idata] # simulated samples
flow_concentrtion_ratio <- imputeTS::na_interpolation(Q/x)
Xreg <- cbind(Q,flow_concentrtion_ratio)
Output <- impute_data(time,x,Xreg,MC = 10,ptrain = 0.8)

# Impute tidal data
time <-TideHarmonics::Portland$DateTime[1:(24*90)]
xo <-TideHarmonics::Portland$SeaLevel[1:(24*90)]
idata <- sample(1:length(xo),round(length(xo)*0.5),replace=FALSE)
x <- rep(NA,length(xo))
x[idata] <- xo[idata] # simulated samples
Output <- impute_data(time,x,harmonic = TRUE,MC = 10,ptrain = 0.8)

[Package realTimeloads version 1.0.0 Index]