R: Parametric Modelling of Soil Hydraulic Properties

fit_wrc_hcc {soilhypfit}

R Documentation

Parametric Modelling of Soil Hydraulic Properties

Description

The function fit_wrc_hcc estimates parameters of models for the soil water retention curve and/or soil hydraulic conductivity function from respective measurements by nonlinear regression methods, optionally subject to physical constraints on the estimated parameters. fit_wrc_hcc uses optimisation algorithms of the NLopt library (Johnson, see nloptr-package) and the Shuffled Complex Evolution (SCE) algorithm (Duan et al., 1994) implemented in the function SCEoptim.

Usage


fit_wrc_hcc(
    wrc_formula, hcc_formula, data,
    subset = NULL, wrc_subset = subset, hcc_subset = subset,
    weights = NULL, wrc_weights = weights, hcc_weights = weights,
    na.action, param = NULL, lower_param = NULL, upper_param = NULL,
    fit_param = default_fit_param(),
    e0 = 2.5e-3, ratio_lc_lt_bound = c(lower = 0.5, upper = 2),
    control = control_fit_wrc_hcc(), verbose = 0)

default_fit_param(
    alpha = TRUE, n = TRUE, tau = FALSE,
    thetar = TRUE, thetas = TRUE, k0 = TRUE)

Arguments

`wrc_formula`	an optional two-sided formula such as `wc ~ head` or `wc ~ head \| id`, specifying the variables for the water content (`wc`), the capillary pressure `head` and, optionally, for sample `id`s when model parameters are estimated for several soil samples at the same time, see `formula` and Details.
`hcc_formula`	an optional two-sided formula such as `hc ~ head` or `hc ~ head \| id`, specifying the variables for the hydraulic conductivity (`hc`), the capillary pressure `head` and, optionally, for sample `id`s when model parameters are estimated for several soil samples at the same time. See `formula` and Details.
`data`	a mandatory data frame containing the variables specified in the formula, the subset and weights arguments.
`subset`	an optional expression generating a vector to choose a subset of water content and hydraulic conductivity data. The expression is evaluated in the environment generated by `model.frame(wrc_formula, data)` and `model.frame(hcc_formula, data)`, respectively.
`wrc_subset`	an optional expression generating a vector to choose a subset of water content data. The expression is evaluated in the environment generated by `model.frame(wrc_formula, data)`. Defaults to `subset`.
`hcc_subset`	an optional expression generating a vector to choose a subset of hydraulic conductivity data. The expression is evaluated in the environment generated by `model.frame(hcc_formula, data)`. Defaults to `subset`.
`weights`	an optional expression generating a numeric vector of case weights `w^\prime_{\theta,i}` and `w^\prime_{K,i}` (default: 1, see `soilhypfitIntro`) for water content and hydraulic conductivity data. The expression is evaluated in the environment generated by `model.frame(wrc_formula, data)` and `model.frame(hcc_formula, data)`, respectively.
`wrc_weights`	an optional expression generating a numeric vector of case weights `w^\prime_{\theta,i}` (see `soilhypfitIntro`) for water content data. The expression is evaluated in the environment generated by `model.frame(wrc_formula, data)`. Defaults to `weights`.
`hcc_weights`	an optional expression generating a numeric vector of case weights `w^\prime_{K,i}` (see `soilhypfitIntro`) for hydraulic conductivity data. The expression is evaluated in the environment generated by `model.frame(hcc_formula, data)`. Defaults to `weights`.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` argument of `options`, and is `na.fail` if that is unset. The “factory-fresh” default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`param`	an optional named numeric vector (or a numeric matrix or a dataframe with specified row and column names, see Details) with initial values for the model parameters. Currently, `param` may have elements (or columns) named `"alpha"`, `"n"`, `"tau"`, `"thetar"`, `"thetas"`, `"k0"`, see `wc_model` and `hc_model` and Details. For consistency with other quantities, the unit of `\alpha` should be 1/meter [`\mathrm{m}^{-1}`] and the unit of `K_0` meter/day [`\mathrm{m}\,\mathrm{d}^{-1}`].
`lower_param`	an optional named numeric vector (or a numeric matrix or a dataframe with specified row and column names, see Details) with lower bounds for the parameters of the models. Currently, `lower_param` may have elements (or columns) named `"alpha"`, `"n"`, `"tau"`, `"thetar"`, `"thetas"`, `"k0"`, see `wc_model`, `hc_model` and `param_boundf`. For consistency with other quantities, the unit of `\alpha` should be 1/meter [`\mathrm{m}^{-1}`] and the unit of `K_0` meter/day [`\mathrm{m}\,\mathrm{d}^{-1}`]. If lower bounds are specified for `\theta_r` but not for `\theta_s` then the lower bounds specified for `\theta_r` will also be used for `\theta_s`.
`upper_param`	an optional named numeric vector (or a numeric matrix or a dataframe with specified row and column names, see Details) with upper bounds for the parameters of the models. Currently, `upper_param` may have elements (or columns) named `"alpha"`, `"n"`, `"tau"`, `"thetar"`, `"thetas"`, `"k0"`, see `wc_model`, `hc_model` and `param_boundf`. For consistency with other quantities, the unit of `\alpha` should be 1/meter [`\mathrm{m}^{-1}`] and the unit of `K_0` meter/day [`\mathrm{m}\,\mathrm{d}^{-1}`]. If upper bounds are specified for `\theta_s` but not for `\theta_r` then the upper bounds specified for `\theta_s` will also be used for `\theta_r`.
`fit_param`	a named logical vector (or a logical matrix or a dataframe with specified row and column names, see Details) containing flags that control whether model parameters are estimated (`TRUE`) or kept fixed at the initial values (`FALSE`) as specified in `param`. This vector can be generated easily by the function `default_fit_param`. Currently, `fit_param` may have elements (or columns) named `"alpha"`, `"n"`, `"tau"`, `"thetar"`, `"thetas"`, `"k0"`, see Details, `wc_model` and `hc_model`.
`e0`	a numeric scalar (or named vector, see Details) with the stage-I rate of evaporation `E_0` from a soil (default `2.5\cdot 10^{-3} \ \mathrm{m}\,\mathrm{d}^{-1}`) required to evaluate the characteristic evaporative length, see `evaporative-length` and `soilhypfitIntro`. For consistency with other quantities, the unit of `E_0` should meter/day [`\mathrm{m}\,\mathrm{d}^{-1}`]. Note that `e0` is ignored when an unconstrained nonlinear optimisation algorithm is chosen (argument `settings` of `control_fit_wrc_hcc` equal to `"uglobal"`, `"sce"` or `"ulocal"`).
`ratio_lc_lt_bound`	a named numeric vector of length 2 (or a matrix with specified rownames and two columns, see Details) defining the default `lower` and `upper` bounds of the ratio `L_\mathrm{c}/L_\mathrm{t}` (Lehmann et al., 2008, 2020) for constrained estimation of the nonlinear parameters `\boldsymbol{\nu}^\mathrm{T} =(\alpha, n, \tau)` (default values 0.5 (`lower`) and 2 (`upper`)), see `evaporative-length` and `soilhypfitIntro`. Note that `ratio_lc_lt_bound` is ignored when an unconstrained nonlinear optimisation algorithm is chosen (argument `settings` of `control_fit_wrc_hcc` equal to `"uglobal"`, `"sce"` or `"ulocal"`).
`control`	a list with options to control `fit_wrc_hcc` or a function such as `control_fit_wrc_hcc` that generates such a list. The main argument `settings` of `control_fit_wrc_hcc` selects a nonlinear optimisation approach, see `soilhypfitIntro` and `control_fit_wrc_hcc` for full details.
`verbose`	positive integer controlling logging of diagnostic messages to the console and plotting of data and model curves during fitting. `verbose < 0` suppresses all output, `verbose >= 0` suppresses all output except warning messages, `verbose >= 1` displays at the end the measurements along with the fitted model curves, `verbose >= 2` prints for each iteration the parameters values, the value of the objective function and the ratio of `L_\mathrm{c}/L_\mathrm{t}` (see `evaporative-length`), `verbose >= 3` plots for each iteration the measurements along with the fitted model curves. Logging of further diagnostics during fitting is controlled in addition by the arguments `print_level`, `check_derivatives`, `check_derivatives_print` when `nloptr` is used (see `nloptr.print.options` and `control_nloptr`) and by the argument `trace` of `SCEoptim` (see `control_sce`).
`alpha`	a logical scalar controlling whether the inverse air entry pressure `\alpha` should be estimated (`TRUE`, default) or kept fixed at the initial value (`FALSE`) specified by `param`, see `wc_model` and `hc_model`.
`n`	a logical scalar controlling whether the shape parameter `n` should be estimated (`TRUE`, default) or kept fixed at the initial value (`FALSE`) specified by `param`, see `wc_model` and `hc_model`.
`tau`	a logical scalar controlling whether the tortuosity parameter `\tau` should be estimated (`TRUE`) or kept fixed at the initial value (`FALSE`, default) specified by `param`, see `hc_model`.
`thetar`	a logical scalar controlling whether the residual water content `\theta_r` should be estimated (`TRUE`, default) or kept fixed at the initial value (`FALSE`) specified by `param`, see `wc_model`.
`thetas`	a logical scalar controlling whether the saturated water content `\theta_s` should be estimated (`TRUE`, default) or kept fixed at the initial value (`FALSE`) specified by `param`, see `wc_model`.
`k0`	a logical scalar controlling whether the saturated hydraulic conductivity `K_0` should be estimated (`TRUE`, default) or kept fixed at the initial value (`FALSE`) specified by `param`, see `hc_model`.

Details

Estimating model parameters of water retention curves and/or hydraulic conductivity functions

The function fit_wrc_hcc estimates model parameters of water retention curves and/or hydraulic conductivity functions by maximum likelihood (ml, default) maximum posterior density (mpd) or nonlinear least squares (wls), see argument method of control_fit_wrc_hcc. Measurements must be available for at least one of the two material functions. If one type of data is missing then the respective formula, subset and weights arguments should be omitted.

If both types of data are available then measurements are weighed when computing the residual sum of squares for wls, but unit weights are used by default for mpd and ml estimation, see soilhypfitIntro. The wls weights are the product of the case weights w^\prime_{\theta,i} and w^\prime_{K,i} given by weights, wrc_weights and hcc_weights, respectively, and the variable weights as specified by the argument variable_weight of control_fit_wrc_hcc. Note that the variable_weights are not used “as is”, but they are multiplied by the inverse variances of the water content or the (log-transformed) conductivity data per sample to obtain the variable weights w_\theta, w_K mentioned in soilhypfitIntro.

Estimating model parameters for a single or multiple soil samples

When parameters are estimated for a single soil sample only, then model formulae are of the form wc ~ head and/or hc ~head, and there is no need to specify a sample id. In this case param, lower_param, upper_param, fit_param and ratio_lc_lt_bound are vectors and e0 is a scalar.

fit_wrc_hcc allows to fit models to datasets containing data for multiple soil samples. The model formula must then be of the form wc ~ head|id and/or hc ~ head|id where the factor id identifies the soil samples. If param, lower_param, upper_param, fit_param and ratio_lc_lt_bound are vectors and e0 is a scalar then this information is used for processing all the soil samples. If specific information per sample should be used then param, lower_param, upper_param, fit_param and ratio_lc_lt_bound must be matrices (or data frames) with as many rows as there are soil samples. The matrices (or data.frames) must have rownames matching the levels of the factor id defining the soil samples. e0 must be a named vector with as many elements as there are soil samples and the names must again match the levels of id.

By default, fit_wrc_hcc processes data of multiple soil samples in parallel, see control_pcmp for options to control parallel computing. Note that diagnostic output (except warning messages) is suppressed in parallel computations. If computations fail for a particular soil sample, the id of the sample with the failed fit can be extracted by the utility function select_failed_fits and the respective error messages by extract_error_messages.

Controlling `fit_wrc_hcc`

The argument control is used to pass a list of options to fit_wrc_hcc that steer the function, see soilhypfit-package and control_fit_wrc_hcc for full details.

Value

fit_wrc_hcc returns an object of class fit_wrc_hcc, which is a list with the following components:

`fit`	a list with as many components as there are soils samples. Each component is in turn a list that contains the estimated parameters and accompanying information: `converged`: an integer code issued by `SCEoptim` or `nloptr` on (non-) convergence, see `convergence_message` and NLopt return values. `message`: a character string issued by `SCEoptim` or `nloptr` on (non-)convergence. `evaluation`: the number of evaluations of the objective function and of the gradient. `objective`: the value of the objective function evaluated at the solution. The contributions of the water retention curve and the hydraulic conductivity function to `objective` are reported as attributes `obj_wc` and `obj_hc`. The attributes `ssq_wc` and `ssq_hc` are the respective residual sums of squares `Q_\theta` and `Q_K`, see `soilhypfitIntro`. `gradient`: the gradient of the objective function at the solution with respect to the (possibly transformed) nonlinear parameters. `lp`: the estimated values for the linear parameters `\boldsymbol{\mu}^\mathrm{T} =(\theta_r, \theta_s, K_0)`, see `wc_model` and `hc_model`. `nlp`: the estimated values for the nonlinear parameters `\boldsymbol{\nu}^\mathrm{T} =(\alpha, n, \tau)`, see `wc_model` and `hc_model`. `inequality_constraint`: a list with the values of the inequality constraints evaluated at the solutions. Currently, values are reported for the expressions. `-(\frac{L_\mathrm{c}}{L_\mathrm{t}} - l)` `\frac{L_\mathrm{c}}{L_\mathrm{t}} - u` in the only component `lc`, where `l` and `u` are the lower and upper bounds of the ratio, see argument `ratio_lc_lt_bound` and `evaporative-length`. The values of `L_\mathrm{c}`, `L_\mathrm{t}`, the imposed bounds on their ratio and `e0` are returned as attributes of `lc`. `hessian`: optionally the Hessian matrix of the objective function with respect to the possibly transformed nonlinear parameters at the solution. `variable_weight`: a named vector with the variable weights `w_{\theta}` and `w_{K}`, see Details. `weight`: a list with one or two components named `weights_wc` and or `weights_hc` with the case weights `w_{\theta,i}` and `w_{K,i}` used in the objective function, see Details. `fitted`: a list with one or two components named `wrc` and/or `hcc` with the fitted values for the water retention and the (possibly log-transformed) hydraulic conductivity data. `residuals`: a list with one or two components named `wrc` and/or `hcc` with residuals for the water retention and the (possibly log-transformed) hydraulic conductivity data. `model`: a list with one or two components named `wrc` and/or `hcc` with the `model.frame`s for the water retention and hydraulic conductivity data. `initial_objects`: a list with the values for `param`, `fit_param`, `variable_weight`, `param_bound`, `e0` and `ratio_lc_lt_bound` as taken from the (processed) arguments of the call of `fit_wrc_hcc`.
`sample_id_variable`	the name of the variable defining the samples.
`wrc`	logical scalar signalling whether water retention data were used to estimate the parameters.
`wrc_formula`	formula defining the variables for water content and hydraulic head of the water retention data (`NULL` if not `wrc` equal to `FALSE`).
`wrc_mf`	unevaluated call of `model.frame` to build the modelframe for the water retention data (`NULL` if `wrc` equal to `FALSE`).
`wrc`	logical scalar signalling whether water retention data were used to estimate the parameters.
`hcc_formula`	formula defining the variables for hydraulic conductivity and hydraulic head of the hydraulic conductivity data (`NULL` if not `hcc` equal to `FALSE`).
`hcc_mf`	unevaluated call of `model.frame` to build the modelframe for the water retention data (`NULL` if `hcc` equal to `FALSE`).
`control`	a list with the options used to control `fit_wrc_hcc`, see `control_fit_wrc_hcc`.
`call`	the matched call.
`na.action`	information returned by `model.frame` on the special handling of `NA`s.

Author(s)

Andreas Papritz papritz@retired.ethz.ch.

References

Duan, Q., Sorooshian, S., and Gupta, V. K. (1994) Optimal use of the SCE-UA global optimisation method for calibrating watershed models, Journal of Hydrology 158, 265–284, doi:10.1016/0022-1694(94)90057-4.

Johnson, S.G. The NLopt nonlinear-optimisation package. https://github.com/stevengj/nlopt.

Lehmann, P., Assouline, S., Or, D. (2008) Characteristic lengths affecting evaporative drying of porous media. Physical Review E, 77, 056309, doi:10.1103/PhysRevE.77.056309.

Lehmann, P., Bickel, S., Wei, Z., Or, D. (2020) Physical Constraints for Improved Soil Hydraulic Parameter Estimation by Pedotransfer Functions. Water Resources Research 56, e2019WR025963, doi:10.1029/2019WR025963.

Examples


# use of \donttest{} because execution time exceeds 5 seconds

data(sim_wrc_hcc)

# define number of cores for parallel computations
if(interactive()) ncpu <- parallel::detectCores() - 1L else ncpu <- 1L


# estimate parameters for single sample ...

#  ... from wrc and hcc data
plot(rfit_wrc_hcc <- fit_wrc_hcc(
  wrc_formula = wc ~ head, hcc_formula = hc ~ head,
  data = sim_wrc_hcc, subset = id == 1))
coef(rfit_wrc_hcc, gof = TRUE)

#  ... from wrc data
plot(rfit_wrc <- fit_wrc_hcc(
  wrc_formula = wc ~ head,
  data = sim_wrc_hcc, subset = id == 1))
coef(rfit_wrc, gof = TRUE)

#  ... from  hcc data
plot(rfit_hcc <- fit_wrc_hcc(
  hcc_formula = hc ~ head,
  data = sim_wrc_hcc, subset = id == 1))
coef(rfit_hcc, gof = TRUE)

# ... from wrc and hcc data
#     keeping some parameters fixed
plot(rfit_wrc_hcc_fixed <- fit_wrc_hcc(
  wrc_formula = wc ~ head, hcc_formula = hc ~ head,
  data = sim_wrc_hcc, subset = id == 1,
  param = c(alpha = 2.1, thetar = 0.1),
  fit_param = default_fit_param(alpha = FALSE, thetar = FALSE)),
  y = rfit_wrc_hcc)
coef(rfit_wrc_hcc, gof = TRUE)
coef(rfit_wrc_hcc_fixed, gof = TRUE)


# estimate parameters for 3 samples simultaneously by ...

# ... unconstrained, global optimisation algorithm NLOPT_GN_MLSL (default)
rfit_uglob <- fit_wrc_hcc(
  wrc_formula = wc ~ head | id, hcc_formula = hc ~ head | id,
  data = sim_wrc_hcc,
  control = control_fit_wrc_hcc(pcmp = control_pcmp(ncores = ncpu)))
summary(rfit_uglob)
op <- par(mfrow = c(3, 2))
plot(rfit_uglob)
on.exit(par(op))

# ... unconstrained, global optimisation algorithm SCEoptim
rfit_sce <- update(
  rfit_uglob,
  control = control_fit_wrc_hcc(
    settings = "sce", pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_sce, gof = TRUE, lc = TRUE)
convergence_message(2, sce = TRUE)
op <- par(mfrow = c(3, 2))
plot(rfit_sce, y = rfit_uglob)
on.exit(par(op))

# ... unconstrained, local optimisation algorithm NLOPT_LN_BOBYQA,
#     logging iteration results to console
rfit_uloc <- update(
  rfit_uglob,
  param = as.matrix(coef(rfit_uglob, what = "nonlinear")),
  control = control_fit_wrc_hcc(
    settings = "ulocal", pcmp = control_pcmp(ncores = 1L)),
  verbose = 2)
coef(rfit_uloc, gof = TRUE, lc = TRUE)

# ... constrained, global optimisation algorithm NLOPT_GN_ISRES
rfit_cglob <- update(
  rfit_uglob,
  ratio_lc_lt_bound = c(lower = 0.8, upper = 1.2),
  control = control_fit_wrc_hcc(
    settings = "cglobal", nloptr = control_nloptr(ranseed = 1),
    pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_cglob, gof = TRUE, lc = TRUE)

# ... constrained, local optimisation algorithm NLOPT_LD_CCSAQ
#     starting from unconstrained, locally fitted initial values
rfit_cloc_1 <- update(
  rfit_uglob,
  param = coef(rfit_uloc, what = "nonlinear"),
  ratio_lc_lt_bound = c(lower = 0.8, upper = 1.2),
  control = control_fit_wrc_hcc(
    settings = "clocal", pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_cloc_1, gof = TRUE, lc = TRUE)
op <- par(mfrow = c(3, 2))
plot(x = rfit_uloc, y = rfit_cloc_1)
on.exit(par(op))

# ... constrained, local optimisation algorithm NLOPT_LD_CCSAQ
#     starting from constrained, globally fitted initial values,
#     using sample-specific evaporation rates and limits for ratio lc/lt
rfit_cloc_2 <- update(
  rfit_uglob,
  param = as.matrix(coef(rfit_cglob, what = "nonlinear")),
  e0 = c("1" = 0.002, "2" = 0.0025, "3" = 0.003),
  ratio_lc_lt_bound = rbind(
    "1" = c(lower = 0.7, upper = 2),
    "2" = c(lower = 0.8, upper = 1.4),
    "3" = c(lower = 0.8, upper = 2)
  ),
  control = control_fit_wrc_hcc(
    settings = "clocal", pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_cloc_2, gof = TRUE, lc = TRUE, e0 = TRUE)

# ... global optimisation algorithm NLOPT_GN_MLSL
#     with sample-specific box-constraints

rfit_uglob_bc <- update(
  rfit_uglob,
  lower_param = rbind(
    "1" = c(alpha = 2.4, n = 1.11, thetar = 0.2, thetas = 0.6),
    "2" = c(alpha = 1.2, n = 1.12, thetar = 0.2, thetas = 0.6),
    "3" = c(alpha = 1.2, n = 1.13, thetar = 0.2, thetas = 0.6)
  ),
  upper_param = rbind(
    "1" = c(alpha = 20.1, n = 2.51, thetar = 0.6, thetas = 0.6),
    "2" = c(alpha = 20.2, n = 1.5,  thetar = 0.6, thetas = 0.6),
    "3" = c(alpha = 1.3, n = 2.53,  thetar = 0.6, thetas = 0.6)
  )
)
coef(rfit_uglob, gof = TRUE)
coef(rfit_uglob_bc, gof = TRUE)

[Package soilhypfit version 0.1-7 Index]