fit_wrc_hcc {soilhypfit}R Documentation

Parametric Modelling of Soil Hydraulic Properties

Description

The function fit_wrc_hcc estimates parameters of models for the soil water retention curve and/or soil hydraulic conductivity function from respective measurements by nonlinear regression methods, optionally subject to physical constraints on the estimated parameters. fit_wrc_hcc uses optimisation algorithms of the NLopt library (Johnson, see nloptr-package) and the Shuffled Complex Evolution (SCE) algorithm (Duan et al., 1994) implemented in the function SCEoptim.

Usage


fit_wrc_hcc(
    wrc_formula, hcc_formula, data,
    subset = NULL, wrc_subset = subset, hcc_subset = subset,
    weights = NULL, wrc_weights = weights, hcc_weights = weights,
    na.action, param = NULL, lower_param = NULL, upper_param = NULL,
    fit_param = default_fit_param(),
    e0 = 2.5e-3, ratio_lc_lt_bound = c(lower = 0.5, upper = 2),
    control = control_fit_wrc_hcc(), verbose = 0)

default_fit_param(
    alpha = TRUE, n = TRUE, tau = FALSE,
    thetar = TRUE, thetas = TRUE, k0 = TRUE)

Arguments

wrc_formula

an optional two-sided formula such as wc ~ head or wc ~ head | id, specifying the variables for the water content (wc), the capillary pressure head and, optionally, for sample ids when model parameters are estimated for several soil samples at the same time, see formula and Details.

hcc_formula

an optional two-sided formula such as hc ~ head or hc ~ head | id, specifying the variables for the hydraulic conductivity (hc), the capillary pressure head and, optionally, for sample ids when model parameters are estimated for several soil samples at the same time. See formula and Details.

data

a mandatory data frame containing the variables specified in the formula, the subset and weights arguments.

subset

an optional expression generating a vector to choose a subset of water content and hydraulic conductivity data. The expression is evaluated in the environment generated by model.frame(wrc_formula, data) and
model.frame(hcc_formula, data), respectively.

wrc_subset

an optional expression generating a vector to choose a subset of water content data. The expression is evaluated in the environment generated by
model.frame(wrc_formula, data). Defaults to subset.

hcc_subset

an optional expression generating a vector to choose a subset of hydraulic conductivity data. The expression is evaluated in the environment generated by model.frame(hcc_formula, data). Defaults to subset.

weights

an optional expression generating a numeric vector of case weights w^\prime_{\theta,i} and w^\prime_{K,i} (default: 1, see soilhypfitIntro) for water content and hydraulic conductivity data. The expression is evaluated in the environment generated by model.frame(wrc_formula, data) and model.frame(hcc_formula, data), respectively.

wrc_weights

an optional expression generating a numeric vector of case weights w^\prime_{\theta,i} (see soilhypfitIntro) for water content data. The expression is evaluated in the environment generated by model.frame(wrc_formula, data). Defaults to weights.

hcc_weights

an optional expression generating a numeric vector of case weights w^\prime_{K,i} (see soilhypfitIntro) for hydraulic conductivity data. The expression is evaluated in the environment generated by model.frame(hcc_formula, data). Defaults to weights.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action argument of options, and is na.fail if that is unset. The “factory-fresh” default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.

param

an optional named numeric vector (or a numeric matrix or a dataframe with specified row and column names, see Details) with initial values for the model parameters. Currently, param may have elements (or columns) named "alpha", "n", "tau", "thetar", "thetas", "k0", see wc_model and hc_model and Details. For consistency with other quantities, the unit of \alpha should be 1/meter [\mathrm{m}^{-1}] and the unit of K_0 meter/day [\mathrm{m}\,\mathrm{d}^{-1}].

lower_param

an optional named numeric vector (or a numeric matrix or a dataframe with specified row and column names, see Details) with lower bounds for the parameters of the models. Currently, lower_param may have elements (or columns) named "alpha", "n", "tau", "thetar", "thetas", "k0", see wc_model, hc_model and param_boundf. For consistency with other quantities, the unit of \alpha should be 1/meter [\mathrm{m}^{-1}] and the unit of K_0 meter/day [\mathrm{m}\,\mathrm{d}^{-1}]. If lower bounds are specified for \theta_r but not for \theta_s then the lower bounds specified for \theta_r will also be used for \theta_s.

upper_param

an optional named numeric vector (or a numeric matrix or a dataframe with specified row and column names, see Details) with upper bounds for the parameters of the models. Currently, upper_param may have elements (or columns) named "alpha", "n", "tau", "thetar", "thetas", "k0", see wc_model, hc_model and param_boundf. For consistency with other quantities, the unit of \alpha should be 1/meter [\mathrm{m}^{-1}] and the unit of K_0 meter/day [\mathrm{m}\,\mathrm{d}^{-1}]. If upper bounds are specified for \theta_s but not for \theta_r then the upper bounds specified for \theta_s will also be used for \theta_r.

fit_param

a named logical vector (or a logical matrix or a dataframe with specified row and column names, see Details) containing flags that control whether model parameters are estimated (TRUE) or kept fixed at the initial values (FALSE) as specified in param. This vector can be generated easily by the function default_fit_param. Currently, fit_param may have elements (or columns) named "alpha", "n", "tau",
"thetar", "thetas", "k0", see Details, wc_model and hc_model.

e0

a numeric scalar (or named vector, see Details) with the stage-I rate of evaporation E_0 from a soil (default 2.5\cdot 10^{-3} \ \mathrm{m}\,\mathrm{d}^{-1}) required to evaluate the characteristic evaporative length, see evaporative-length and soilhypfitIntro. For consistency with other quantities, the unit of E_0 should meter/day [\mathrm{m}\,\mathrm{d}^{-1}]. Note that e0 is ignored when an unconstrained nonlinear optimisation algorithm is chosen (argument settings of control_fit_wrc_hcc equal to "uglobal", "sce" or "ulocal").

ratio_lc_lt_bound

a named numeric vector of length 2 (or a matrix with specified rownames and two columns, see Details) defining the default lower and upper bounds of the ratio L_\mathrm{c}/L_\mathrm{t} (Lehmann et al., 2008, 2020) for constrained estimation of the nonlinear parameters \boldsymbol{\nu}^\mathrm{T} =(\alpha, n, \tau) (default values 0.5 (lower) and 2 (upper)), see evaporative-length and soilhypfitIntro. Note that ratio_lc_lt_bound is ignored when an unconstrained nonlinear optimisation algorithm is chosen (argument settings of control_fit_wrc_hcc equal to "uglobal", "sce" or "ulocal").

control

a list with options to control fit_wrc_hcc or a function such as
control_fit_wrc_hcc that generates such a list. The main argument settings of control_fit_wrc_hcc selects a nonlinear optimisation approach, see
soilhypfitIntro and control_fit_wrc_hcc for full details.

verbose

positive integer controlling logging of diagnostic messages to the console and plotting of data and model curves during fitting.

verbose < 0

suppresses all output,

verbose >= 0

suppresses all output except warning messages,

verbose >= 1

displays at the end the measurements along with the fitted model curves,

verbose >= 2

prints for each iteration the parameters values, the value of the objective function and the ratio of L_\mathrm{c}/L_\mathrm{t} (see evaporative-length),

verbose >= 3

plots for each iteration the measurements along with the fitted model curves.

Logging of further diagnostics during fitting is controlled in addition by the arguments print_level, check_derivatives, check_derivatives_print when nloptr is used (see nloptr.print.options and control_nloptr) and by the argument trace of SCEoptim (see control_sce).

alpha

a logical scalar controlling whether the inverse air entry pressure \alpha should be estimated (TRUE, default) or kept fixed at the initial value (FALSE) specified by param, see wc_model and hc_model.

n

a logical scalar controlling whether the shape parameter n should be estimated (TRUE, default) or kept fixed at the initial value (FALSE) specified by param, see wc_model and hc_model.

tau

a logical scalar controlling whether the tortuosity parameter \tau should be estimated (TRUE) or kept fixed at the initial value (FALSE, default) specified by param, see hc_model.

thetar

a logical scalar controlling whether the residual water content \theta_r should be estimated (TRUE, default) or kept fixed at the initial value (FALSE) specified by param, see wc_model.

thetas

a logical scalar controlling whether the saturated water content \theta_s should be estimated (TRUE, default) or kept fixed at the initial value (FALSE) specified by param, see wc_model.

k0

a logical scalar controlling whether the saturated hydraulic conductivity K_0 should be estimated (TRUE, default) or kept fixed at the initial value (FALSE) specified by param, see hc_model.

Details

Estimating model parameters of water retention curves and/or hydraulic conductivity functions

The function fit_wrc_hcc estimates model parameters of water retention curves and/or hydraulic conductivity functions by maximum likelihood (ml, default) maximum posterior density (mpd) or nonlinear least squares (wls), see argument method of control_fit_wrc_hcc. Measurements must be available for at least one of the two material functions. If one type of data is missing then the respective formula, subset and weights arguments should be omitted.

If both types of data are available then measurements are weighed when computing the residual sum of squares for wls, but unit weights are used by default for mpd and ml estimation, see soilhypfitIntro. The wls weights are the product of the case weights w^\prime_{\theta,i} and w^\prime_{K,i} given by weights, wrc_weights and hcc_weights, respectively, and the variable weights as specified by the argument variable_weight of control_fit_wrc_hcc. Note that the variable_weights are not used “as is”, but they are multiplied by the inverse variances of the water content or the (log-transformed) conductivity data per sample to obtain the variable weights w_\theta, w_K mentioned in soilhypfitIntro.

Estimating model parameters for a single or multiple soil samples

When parameters are estimated for a single soil sample only, then model formulae are of the form wc ~ head and/or hc ~head, and there is no need to specify a sample id. In this case param, lower_param, upper_param, fit_param and ratio_lc_lt_bound are vectors and e0 is a scalar.

fit_wrc_hcc allows to fit models to datasets containing data for multiple soil samples. The model formula must then be of the form wc ~ head|id and/or hc ~ head|id where the factor id identifies the soil samples. If param, lower_param, upper_param, fit_param and ratio_lc_lt_bound are vectors and e0 is a scalar then this information is used for processing all the soil samples. If specific information per sample should be used then param, lower_param, upper_param, fit_param and ratio_lc_lt_bound must be matrices (or data frames) with as many rows as there are soil samples. The matrices (or data.frames) must have rownames matching the levels of the factor id defining the soil samples. e0 must be a named vector with as many elements as there are soil samples and the names must again match the levels of id.

By default, fit_wrc_hcc processes data of multiple soil samples in parallel, see control_pcmp for options to control parallel computing. Note that diagnostic output (except warning messages) is suppressed in parallel computations. If computations fail for a particular soil sample, the id of the sample with the failed fit can be extracted by the utility function select_failed_fits and the respective error messages by extract_error_messages.

Controlling fit_wrc_hcc

The argument control is used to pass a list of options to fit_wrc_hcc that steer the function, see soilhypfit-package and control_fit_wrc_hcc for full details.

Value

fit_wrc_hcc returns an object of class fit_wrc_hcc, which is a list with the following components:

fit

a list with as many components as there are soils samples. Each component is in turn a list that contains the estimated parameters and accompanying information:

  • converged: an integer code issued by SCEoptim or nloptr on (non-) convergence, see convergence_message and NLopt return values.

  • message: a character string issued by SCEoptim or nloptr on (non-)convergence.

  • evaluation: the number of evaluations of the objective function and of the gradient.

  • objective: the value of the objective function evaluated at the solution. The contributions of the water retention curve and the hydraulic conductivity function to objective are reported as attributes obj_wc and obj_hc. The attributes ssq_wc and ssq_hc are the respective residual sums of squares Q_\theta and Q_K, see soilhypfitIntro.

  • gradient: the gradient of the objective function at the solution with respect to the (possibly transformed) nonlinear parameters.

  • lp: the estimated values for the linear parameters \boldsymbol{\mu}^\mathrm{T} =(\theta_r, \theta_s, K_0), see wc_model and hc_model.

  • nlp: the estimated values for the nonlinear parameters \boldsymbol{\nu}^\mathrm{T} =(\alpha, n, \tau), see wc_model and hc_model.

  • inequality_constraint: a list with the values of the inequality constraints evaluated at the solutions. Currently, values are reported for the expressions.

    -(\frac{L_\mathrm{c}}{L_\mathrm{t}} - l)

    \frac{L_\mathrm{c}}{L_\mathrm{t}} - u

    in the only component lc, where l and u are the lower and upper bounds of the ratio, see argument ratio_lc_lt_bound and evaporative-length. The values of L_\mathrm{c}, L_\mathrm{t}, the imposed bounds on their ratio and e0 are returned as attributes of lc.

  • hessian: optionally the Hessian matrix of the objective function with respect to the possibly transformed nonlinear parameters at the solution.

  • variable_weight: a named vector with the variable weights w_{\theta} and w_{K}, see Details.

  • weight: a list with one or two components named weights_wc and or weights_hc with the case weights w_{\theta,i} and w_{K,i} used in the objective function, see Details.

  • fitted: a list with one or two components named wrc and/or hcc with the fitted values for the water retention and the (possibly log-transformed) hydraulic conductivity data.

  • residuals: a list with one or two components named wrc and/or hcc with residuals for the water retention and the (possibly log-transformed) hydraulic conductivity data.

  • model: a list with one or two components named wrc and/or hcc with the model.frames for the water retention and hydraulic conductivity data.

  • initial_objects: a list with the values for param, fit_param, variable_weight, param_bound, e0 and ratio_lc_lt_bound as taken from the (processed) arguments of the call of fit_wrc_hcc.

sample_id_variable

the name of the variable defining the samples.

wrc

logical scalar signalling whether water retention data were used to estimate the parameters.

wrc_formula

formula defining the variables for water content and hydraulic head of the water retention data (NULL if not wrc equal to FALSE).

wrc_mf

unevaluated call of model.frame to build the modelframe for the water retention data (NULL if wrc equal to FALSE).

wrc

logical scalar signalling whether water retention data were used to estimate the parameters.

hcc_formula

formula defining the variables for hydraulic conductivity and hydraulic head of the hydraulic conductivity data (NULL if not hcc equal to FALSE).

hcc_mf

unevaluated call of model.frame to build the modelframe for the water retention data (NULL if hcc equal to FALSE).

control

a list with the options used to control fit_wrc_hcc, see control_fit_wrc_hcc.

call

the matched call.

na.action

information returned by model.frame on the special handling of NAs.

Author(s)

Andreas Papritz papritz@retired.ethz.ch.

References

Duan, Q., Sorooshian, S., and Gupta, V. K. (1994) Optimal use of the SCE-UA global optimisation method for calibrating watershed models, Journal of Hydrology 158, 265–284, doi:10.1016/0022-1694(94)90057-4.

Johnson, S.G. The NLopt nonlinear-optimisation package. https://github.com/stevengj/nlopt.

Lehmann, P., Assouline, S., Or, D. (2008) Characteristic lengths affecting evaporative drying of porous media. Physical Review E, 77, 056309, doi:10.1103/PhysRevE.77.056309.

Lehmann, P., Bickel, S., Wei, Z., Or, D. (2020) Physical Constraints for Improved Soil Hydraulic Parameter Estimation by Pedotransfer Functions. Water Resources Research 56, e2019WR025963, doi:10.1029/2019WR025963.

See Also

soilhypfitIntro for a description of the models and a brief summary of the parameter estimation approach;

control_fit_wrc_hcc for options to control fit_wrc_hcc;

soilhypfitmethods for common S3 methods for class fit_wrc_hcc;

vcov for computing (co-)variances of the estimated nonlinear parameters;

prfloglik_sample for profile loglikelihood computations;

wc_model and hc_model for currently implemented models for soil water retention curves and hydraulic conductivity functions;

evaporative-length for physically constraining parameter estimates of soil hydraulic material functions.

Examples


# use of \donttest{} because execution time exceeds 5 seconds

data(sim_wrc_hcc)

# define number of cores for parallel computations
if(interactive()) ncpu <- parallel::detectCores() - 1L else ncpu <- 1L


# estimate parameters for single sample ...

#  ... from wrc and hcc data
plot(rfit_wrc_hcc <- fit_wrc_hcc(
  wrc_formula = wc ~ head, hcc_formula = hc ~ head,
  data = sim_wrc_hcc, subset = id == 1))
coef(rfit_wrc_hcc, gof = TRUE)

#  ... from wrc data
plot(rfit_wrc <- fit_wrc_hcc(
  wrc_formula = wc ~ head,
  data = sim_wrc_hcc, subset = id == 1))
coef(rfit_wrc, gof = TRUE)

#  ... from  hcc data
plot(rfit_hcc <- fit_wrc_hcc(
  hcc_formula = hc ~ head,
  data = sim_wrc_hcc, subset = id == 1))
coef(rfit_hcc, gof = TRUE)

# ... from wrc and hcc data
#     keeping some parameters fixed
plot(rfit_wrc_hcc_fixed <- fit_wrc_hcc(
  wrc_formula = wc ~ head, hcc_formula = hc ~ head,
  data = sim_wrc_hcc, subset = id == 1,
  param = c(alpha = 2.1, thetar = 0.1),
  fit_param = default_fit_param(alpha = FALSE, thetar = FALSE)),
  y = rfit_wrc_hcc)
coef(rfit_wrc_hcc, gof = TRUE)
coef(rfit_wrc_hcc_fixed, gof = TRUE)


# estimate parameters for 3 samples simultaneously by ...

# ... unconstrained, global optimisation algorithm NLOPT_GN_MLSL (default)
rfit_uglob <- fit_wrc_hcc(
  wrc_formula = wc ~ head | id, hcc_formula = hc ~ head | id,
  data = sim_wrc_hcc,
  control = control_fit_wrc_hcc(pcmp = control_pcmp(ncores = ncpu)))
summary(rfit_uglob)
op <- par(mfrow = c(3, 2))
plot(rfit_uglob)
on.exit(par(op))

# ... unconstrained, global optimisation algorithm SCEoptim
rfit_sce <- update(
  rfit_uglob,
  control = control_fit_wrc_hcc(
    settings = "sce", pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_sce, gof = TRUE, lc = TRUE)
convergence_message(2, sce = TRUE)
op <- par(mfrow = c(3, 2))
plot(rfit_sce, y = rfit_uglob)
on.exit(par(op))

# ... unconstrained, local optimisation algorithm NLOPT_LN_BOBYQA,
#     logging iteration results to console
rfit_uloc <- update(
  rfit_uglob,
  param = as.matrix(coef(rfit_uglob, what = "nonlinear")),
  control = control_fit_wrc_hcc(
    settings = "ulocal", pcmp = control_pcmp(ncores = 1L)),
  verbose = 2)
coef(rfit_uloc, gof = TRUE, lc = TRUE)

# ... constrained, global optimisation algorithm NLOPT_GN_ISRES
rfit_cglob <- update(
  rfit_uglob,
  ratio_lc_lt_bound = c(lower = 0.8, upper = 1.2),
  control = control_fit_wrc_hcc(
    settings = "cglobal", nloptr = control_nloptr(ranseed = 1),
    pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_cglob, gof = TRUE, lc = TRUE)

# ... constrained, local optimisation algorithm NLOPT_LD_CCSAQ
#     starting from unconstrained, locally fitted initial values
rfit_cloc_1 <- update(
  rfit_uglob,
  param = coef(rfit_uloc, what = "nonlinear"),
  ratio_lc_lt_bound = c(lower = 0.8, upper = 1.2),
  control = control_fit_wrc_hcc(
    settings = "clocal", pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_cloc_1, gof = TRUE, lc = TRUE)
op <- par(mfrow = c(3, 2))
plot(x = rfit_uloc, y = rfit_cloc_1)
on.exit(par(op))

# ... constrained, local optimisation algorithm NLOPT_LD_CCSAQ
#     starting from constrained, globally fitted initial values,
#     using sample-specific evaporation rates and limits for ratio lc/lt
rfit_cloc_2 <- update(
  rfit_uglob,
  param = as.matrix(coef(rfit_cglob, what = "nonlinear")),
  e0 = c("1" = 0.002, "2" = 0.0025, "3" = 0.003),
  ratio_lc_lt_bound = rbind(
    "1" = c(lower = 0.7, upper = 2),
    "2" = c(lower = 0.8, upper = 1.4),
    "3" = c(lower = 0.8, upper = 2)
  ),
  control = control_fit_wrc_hcc(
    settings = "clocal", pcmp = control_pcmp(ncores = ncpu)))
coef(rfit_cloc_2, gof = TRUE, lc = TRUE, e0 = TRUE)

# ... global optimisation algorithm NLOPT_GN_MLSL
#     with sample-specific box-constraints

rfit_uglob_bc <- update(
  rfit_uglob,
  lower_param = rbind(
    "1" = c(alpha = 2.4, n = 1.11, thetar = 0.2, thetas = 0.6),
    "2" = c(alpha = 1.2, n = 1.12, thetar = 0.2, thetas = 0.6),
    "3" = c(alpha = 1.2, n = 1.13, thetar = 0.2, thetas = 0.6)
  ),
  upper_param = rbind(
    "1" = c(alpha = 20.1, n = 2.51, thetar = 0.6, thetas = 0.6),
    "2" = c(alpha = 20.2, n = 1.5,  thetar = 0.6, thetas = 0.6),
    "3" = c(alpha = 1.3, n = 2.53,  thetar = 0.6, thetas = 0.6)
  )
)
coef(rfit_uglob, gof = TRUE)
coef(rfit_uglob_bc, gof = TRUE)


[Package soilhypfit version 0.1-7 Index]