soilhypfit-package {soilhypfit}R Documentation

The soilhypfit Package

Description

This is a summary of the features and functionality of soilhypfit, a package in R for parameteric modelling of soil water retention and hydraulic conductivity data.

Details

Estimation approach

The main function, fit_wrc_hcc, estimates parameters of models for soil water retention and hydraulic conductivity by maximum likelihood (ml, default), maximum posterior density (mpd) estimation (Stewart et al., 1992) or nonlinear weighted least squares (wls) from data on volumetric soil water content, θT=(θ1,θ2,...,θnθ)\boldsymbol{\theta}^\mathrm{T} =(\theta_1, \theta_2, ..., \theta_{n_\theta}), and/or soil hydraulic conductivity, KT=(K1,K2,...,KnK)\boldsymbol{K}^\mathrm{T} =(K_1, K_2, ..., K_{n_K}), both measured at given capillary pressure head, hT=(h1,h2,...)\boldsymbol{h}^\mathrm{T} =(h_1, h_2, ...).

For mpd and ml estimation, the models for the measurements are

θi=θ(hi;μ,ν)+εθ,i,  i=1,2,...,nθ, \theta_i = \theta(h_i;\boldsymbol{\mu}, \boldsymbol{\nu}) + \varepsilon_{\theta,i}, \ \ i=1, 2, ..., n_\theta,

log(Kj)=log(K(hj;μ,ν))+εK,j,  j=1,2,...,nK, \log(K_j) = \log(K(h_j;\boldsymbol{\mu}, \boldsymbol{\nu})) + \varepsilon_{K,j}, \ \ j=1, 2, ..., n_K,

where θ(hi;μ,ν) \theta(h_i; \boldsymbol{\mu}, \boldsymbol{\nu}) and K(hj;μ,ν) K(h_j; \boldsymbol{\mu}, \boldsymbol{\nu}) denote modelled water content and conductivity, μ\boldsymbol{\mu}^\mathrm{} and ν\boldsymbol{\nu}^\mathrm{} are the conditionally linear and nonlinear model parameters (see below and Bates and Watts, 1988, sec. 3.3.5), and εθ,i\varepsilon_{\theta,i} and εK,j\varepsilon_{K,j} are independent, normally distributed errors with zero means and variances σθ2\sigma^2_\theta and σK2\sigma^2_K, respectively.

Let

Qθ(μ,ν;θ,h)=(θθ(h;μ,ν))TWθ(θθ(h;μ,ν)), Q_{\theta}(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{h}) =\left( \boldsymbol{\theta} - \boldsymbol{\theta}( \boldsymbol{h}; \boldsymbol{\mu}, \boldsymbol{\nu} ) \right)^\mathrm{T} \boldsymbol{W}_\theta \left( \boldsymbol{\theta} - \boldsymbol{\theta}( \boldsymbol{h}; \boldsymbol{\mu}, \boldsymbol{\nu} ) \right),

and

QK(μ,ν;K,h)=(log(K)log(K(h;μ,ν)))TWK(log(K)log(K(h;μ,ν))), Q_{K}(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{K}, \boldsymbol{h}) = \left( \log(\boldsymbol{K}) - \log(\boldsymbol{K}( \boldsymbol{h}; \boldsymbol{\mu}, \boldsymbol{\nu} )) \right)^\mathrm{T} \boldsymbol{W}_K \left( \log(\boldsymbol{K}) - \log(\boldsymbol{K}( \boldsymbol{h}; \boldsymbol{\mu}, \boldsymbol{\nu} )) \right),

denote the (possibly weighted) residual sums of squares between measurements and modelled values. θ(h;μ,ν))\boldsymbol{\theta}( \boldsymbol{h}; \boldsymbol{\mu}, \boldsymbol{\nu} ) ) and K(h;μ,ν)\boldsymbol{K}( \boldsymbol{h}; \boldsymbol{\mu}, \boldsymbol{\nu} ) are vectors with modelled values of water content and hydraulic conductivity, and Wθ\boldsymbol{W}_\theta and WK\boldsymbol{W}_K are optional diagonal matrices with weights wθ,iw_{\theta,i} and wK,jw_{K,j}, respectively. The weights are products of case weights wθ,iw^\prime_{\theta,i} and wK,jw^\prime_{K,j} and variable weights wθw_\theta, wKw_K, hence wθ,i=wθwθ,i w_{\theta,i} = w_\theta\, w^\prime_{\theta,i} and wK,j=wKwK,j w_{K,j} = w_K\, w^\prime_{K,j} .

The objective function for ml and mpd estimation is equal to (Stewart et al., 1992, eqs 14 and 15, respectively)

Q(μ,ν;θ,K,h)=κθ2log(Qθ(μ,ν;θ,h))+κK2log(QK(μ,ν;K,h)), Q(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h} ) = \frac{\kappa_\theta}{2} \log( Q_{\theta}(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{h})) + \frac{\kappa_K}{2} \log( Q_{K}(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{K}, \boldsymbol{h})),

where κv=nv+2\kappa_v = n_v + 2 for mpd and κv=nv\kappa_v = n_v for ml estimation, v(θ,K)v \in (\theta, K), and weights wθ,i=wK,j=1w_{\theta,i} = w_{K,j} = 1. Note that Q(μ,ν;θ,K,h) Q(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h}) is equal to the negative logarithm of the concentrated loglikelihood or the concentrated posterior density, obtained by replacing σθ2\sigma^2_\theta and σK2\sigma^2_K by their conditional maximum likelihood or maximum density estimates σ^θ2(μ),ν) \widehat{\sigma}^2_\theta(\boldsymbol{\mu}), \boldsymbol{\nu}) and σ^K2(μ) \widehat{\sigma}^2_K(\boldsymbol{\mu}) respectively (Stewart et al., 1992, p. 642).

For wls the objective function is equal to

Q(μ,ν;θ,K,h)=Qθ(μ,ν;θ,h)+QK(μ,ν;K,h). Q(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h} ) = Q_{\theta}(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{h}) + Q_{K}(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{K}, \boldsymbol{h}).

If either water content or conductivity data are not available, then the respective terms are omitted from Q(μ,ν;θ,K,h)Q(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h} ) .

The function fit_wrc_hcc does not attempt to estimate the parameters by minimising
Q(μ,ν;θ,K,h) Q(\boldsymbol{\mu}, \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h}) directly with respect to μ\boldsymbol{\mu}^\mathrm{} and ν\boldsymbol{\nu}^\mathrm{} . Rather, it exploits the fact that for given nonlinear parameters ν\boldsymbol{\nu}^\mathrm{} , the conditionally linear parameters μT=(θr,θs,log(K0)) \boldsymbol{\mu}^T = (\theta_r, \theta_s, \log(K_0)) can be estimated straightforwardly by minimising the conditional residual sums of squares

Qθ(θr,θs;θ,h,ν)=(θ[1,S(h;ν)][θrθsθr])TWθ(θ[1,S(h;ν)][θrθsθr]) Q_\theta^*(\theta_r, \theta_s; \boldsymbol{\theta}, \boldsymbol{h}, \boldsymbol{\nu}) =\left( \boldsymbol{\theta} - [ \boldsymbol{1}, \boldsymbol{S}( \boldsymbol{h}; \boldsymbol{\nu} )] \, \left[ \begin{array}{c} \theta_r \\ \theta_s - \theta_r \end{array} \right] \right)^\mathrm{T} \boldsymbol{W}_\theta \left( \boldsymbol{\theta} - [ \boldsymbol{1}, \boldsymbol{S}( \boldsymbol{h}; \boldsymbol{\nu} )] \, \left[ \begin{array}{c} \theta_r \\ \theta_s - \theta_r \end{array} \right] \right)

with respect to θr\theta_r and θsθr\theta_s-\theta_r and/or

QK(K0;K,h,ν)=(log(K)log(K0k(h;ν)))TWK(log(K)log(K0k(h;ν))), Q_K^*(K_0; \boldsymbol{K}, \boldsymbol{h}, \boldsymbol{\nu})= \left( \log(\boldsymbol{K}) - \log(K_0 \, \boldsymbol{k}( \boldsymbol{h}; \boldsymbol{\nu} )) \right)^\mathrm{T} \boldsymbol{W}_K \left( \log(\boldsymbol{K}) - \log(K_0 \, \boldsymbol{k}( \boldsymbol{h}; \boldsymbol{\nu} )) \right),

with respect to log(K0)\log(K_0), where 1\boldsymbol{1}^\mathrm{} is a vector of ones, S(h;ν)T=(S(h1;ν),...,S(hnθ;ν))\boldsymbol{S}( \boldsymbol{h}; \boldsymbol{\nu} )^\mathrm{T} = ( S(h_1; \boldsymbol{\nu}), ..., S(h_{n_\theta}; \boldsymbol{\nu}) ) and k(h;ν)T=(k(h1;ν),...,k(hnK;ν))\boldsymbol{k}( \boldsymbol{h}; \boldsymbol{\nu} )^\mathrm{T} = ( k(h_1; \boldsymbol{\nu}), ..., k(h_{n_K}; \boldsymbol{\nu}) ) are vectors of modelled water saturation and modelled relative conductivity values, θr\theta_r and θs\theta_s are the residual and saturated water content, and K0K_0 is the saturated hydraulic conductivity.

Unconstrained conditional estimates, say θ^r(ν) \widehat{\theta}_r(\boldsymbol{\nu}) , θ^s(ν)θ^r(ν) \widehat{\theta}_s(\boldsymbol{\nu}) - \widehat{\theta}_r(\boldsymbol{\nu}) and log(K0)^(ν) \widehat{\log(K_0)}(\boldsymbol{\nu}) can be easily obtained from the normal equations of the respective (weighted) least squares problems, and quadratic programming yields conditional (weighted) least squares estimates that honour the inequality constraints 0θrθs10 \le \theta_r \le \theta_s \le 1.

Let μ^(ν)T=(θ^r(ν),θ^s(ν),log(K0)^(ν)) \widehat{\boldsymbol{\mu}}(\boldsymbol{\nu})^\mathrm{T} = ( \widehat{\theta}_r(\boldsymbol{\nu}), \widehat{\theta}_s(\boldsymbol{\nu}), \widehat{\log(K_0)}(\boldsymbol{\nu}) ) be the conditional estimates of the linear parameters obtained by minimising Qθ(θr,θs;θ,h,ν) Q_\theta^*(\theta_r, \theta_s; \boldsymbol{\theta}, \boldsymbol{h}, \boldsymbol{\nu}) , and QK(K0;K,h,ν) Q_K^*(K_0; \boldsymbol{K}, \boldsymbol{h}, \boldsymbol{\nu}) , respectively. fit_wrc_hcc then estimates the nonlinear parameters by minimising Q(μ^(ν),ν;θ,K,h) Q( \widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}), \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h}) with respect to ν\boldsymbol{\nu}^\mathrm{} by a nonlinear optimisation algorithm.

For mpd and ml estimation the variances of the model errors are estimated by (Stewart et al., 1992, eq. 16)

σ^θ2=Qθ(μ^(ν^),ν^;θ,h)κθ, \widehat{\sigma}_\theta^2 = \frac{Q_{\theta}( \widehat{\boldsymbol{\mu}}(\widehat{\boldsymbol{\nu}}), \widehat{\boldsymbol{\nu}}; \boldsymbol{\theta}, \boldsymbol{h})}{\kappa_\theta},

and

σ^K2=QK(μ^(ν^),ν^;K,h)κK. \widehat{\sigma}_K^2 = \frac{Q_{K}( \widehat{\boldsymbol{\mu}}(\widehat{\boldsymbol{\nu}}), \widehat{\boldsymbol{\nu}}; \boldsymbol{K}, \boldsymbol{h})}{\kappa_K}.

Furthermore, for mpd and ml estimation, the covariance matrix of the estimated nonlinear parameters may be approximated by the inverse Hessian matrix of Q(μ^(ν),ν;θ,K,h) Q(\widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}), \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h}) at the solution ν^ \widehat{\boldsymbol{\nu}} (Stewart and Sørensen, 1981), i.e.

Cov[ν^,ν^T]A1, \mathrm{Cov}[ \widehat{\boldsymbol{\nu}}, \widehat{\boldsymbol{\nu}}^T] \approx \boldsymbol{A}^{-1},

where

[A]kl=2νkνlQ(μ^(ν),ν;θ,K,h)ν=ν^. [\boldsymbol{A}]_{kl} = \frac{\partial^2}{\partial\nu_k\, \partial\nu_l} \left. Q(\widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}), \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h})\right|_{\nu=\hat{\nu}}.

Details on parameter estimation

Models for water retention curves and hydraulic conductivity functions

Currently, fit_wrc_hcc allows to estimate the parameters of the simplified form of the Van Genuchten-Mualem (VGM) model (Van Genuchten, 1980) with the restriction m=11nm = 1 - \frac{1}{n}, see wc_model and hc_model. This model has the following parameters:

Any of these parameters can be either estimated from data or kept fixed at the specified initial values (see arguments param and fit_param of fit_wrc_hcc).

Imposing physical constraints on the estimated parameters

Parameters of models for the water retention curve and the hydraulic conductivity function may vary only within certain bounds (see wc_model, hc_model and param_boundf for allowed ranges). fit_wrc_hcc either estimates transformed parameters that vary over the whole real line and can therefore be estimated without constraints (see param_transf), or it uses algorithms (quadratic programming for estimating μ\boldsymbol{\mu}^\mathrm{} , nonlinear optimisation algorithms with box constraints for estimating ν\boldsymbol{\nu}^\mathrm{} ) that restrict estimates to permissible ranges, see Details section of control_fit_wrc_hcc.

In addition, for natural soils, the parameters of the VGM model cannot vary independently from each other over the allowed ranges. Sets of fitted parameters should always result in soil hydraulic quantities that are physically meaningful. One of these quantities is the characteristic length LcL_\mathrm{c} of stage-I evaporation from a soil (Lehmann et al., 2008). LcL_\mathrm{c} can be related to the parameters of the VGM model, see Lehmann et al. (2008, 2020) and evaporative-length.

Using several soil hydrological databases, Lehmann et al. (2020) analysed the mutual dependence of VGM parameters and proposed regression equations to relate the inverse air entry pressure α\alpha and the saturated hydraulic K0K_0 to the shape parameter nn, which characterises the width of the pore size distribution of a soil. Using these relations, Lehmann et al. (2020) then computed the expected value (“target”) LtL_\mathrm{t} of LcL_\mathrm{c} for given nn and tortuosity parameter τ\tau, see evaporative-length. fit_wrc_hcc allows to constrain estimates of the nonlinear parameters ν\boldsymbol{\nu}^\mathrm{} by defining permissible lower and upper bounds for the ratio Lc/LtL_\mathrm{c}/L_\mathrm{t}, see arguments ratio_lc_lt_bound of fit_wrc_hcc and settings of control_fit_wrc_hcc.

Choice of optimisation algorithm for estimating the nonlinear parameters

To estimate ν\boldsymbol{\nu}^\mathrm{} , fit_wrc_hcc minimises Q(μ^(ν),ν;θ,K,h) Q( \widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}), \boldsymbol{\nu}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h}) either by a nonlinear optimisation algorithm available in the library NLopt (Johnson, see nloptr) or by the Shuffled Complex Evolution (SCE) optimisation algorithm (Duan et al., 1994, see SCEoptim). The choice of the algorithm is controlled by the argument settings of the function control_fit_wrc_hcc:

  1. global optimisation without constraints for the ratio Lc/LtL_\mathrm{c}/L_\mathrm{t}
    (settings = "uglobal" or settings = "sce"),

  2. global optimisation with inequality constraints for the ratio Lc/LtL_\mathrm{c}/L_\mathrm{t}
    (settings = "cglobal"),

  3. local optimisation without constraints for the ratio Lc/LtL_\mathrm{c}/L_\mathrm{t}
    (settings = "ulocal"),

  4. local optimisation with inequality constraints for the ratio Lc/LtL_\mathrm{c}/L_\mathrm{t}
    (settings = "clocal").

The settings argument also sets reasonable default values for the termination (= convergence) criteria for the various algorithms, see NLopt documentation, section Termination conditions. The NLopt documentation contains a very useful discussion of (constrained) optimisation problems in general, global vs. local optimisation and gradient-based vs. derivative-free algorithms.

Note that besides the settings argument of control_fit_wrc_hcc, the arguments nloptr and sce along with the functions control_nloptr and control_sce allow to fully control the nonlinear optimisation algorithms, see control_fit_wrc_hcc for details.

Computing initial values of parameters

For local optimisation algorithms “good” initial values of ν\boldsymbol{\nu}^\mathrm{} are indispensable for successful estimation. fit_wrc_hcc allows to compute initial values of α\alpha and nn for the Van Genuchten model from water retention data by the following procedure:

  1. Smooth the water retention data, (θi,yi=log(hi)),i=1,2,...nθ, (\theta_i, y_i= \log(h_i)), i=1,2, ... n_\theta, by an additive model.

  2. Determine the saturation, SS^*, and the logarithm of capillary pressure head, y=log(h)y^* = \log(h^*), at the inflection point of the additive model fit.

  3. Find the root, say m^\widehat{m}, of S=(1+1/m)mS^* = (1 + 1/m)^{-m}. One obtains the right-hand side of this equation by solving 2y2[SVG(exp(y);ν)]=0\frac{\partial^2}{\partial y^2} \left[ S_\mathrm{VG}(\exp(y); \boldsymbol{\nu}) \right] = 0 for yy and plugging the result into the expression for SVG(exp(y);ν), S_\mathrm{VG}(\exp(y); \boldsymbol{\nu}), see wc_model.

  4. Compute n^=1/(1m^) \widehat{n} = 1 / (1 - \widehat{m}) and α^=1/exp(y)(1/m^)1m^. \widehat{\alpha} = 1 / \exp(y^*) \,(1/\widehat{m})^{1-\widehat{m}}. The second expression is again a result of solving 2y2[SVG(exp(y);ν)]=0.\frac{\partial^2}{\partial y^2} \left[ S_\mathrm{VG}(\exp(y); \boldsymbol{\nu}) \right] = 0.

Initial values for local optimisation algorithms can of course also be obtained by first estimating the parameters by a global algorithm. These estimates can be “refined” in a second step by a local unconstrained algorithm, possibly followed by a third call of fit_wrc_hcc to constrain the estimated parameters by the ratio Lc/LtL_\mathrm{c}/L_\mathrm{t}. The method coef.fit_wrc_hcc can be used to extract the estimated parameters from an object of class fit_wrc_hcc and to pass them as initial values to fit_wrc_hcc, see fit_wrc_hcc for examples.

Author(s)

Andreas Papritz papritz@retired.ethz.ch.

References

Bates, D. M., and Watts, D. G. (1988) Nonlinear Regression Analysis and its Applications. John Wiley & Sons, New York doi:10.1002/9780470316757.

Duan, Q., Sorooshian, S., and Gupta, V. K. (1994) Optimal use of the SCE-UA global optimisation method for calibrating watershed models, Journal of Hydrology 158, 265–284, doi:10.1016/0022-1694(94)90057-4.

Johnson, S.G. The NLopt nonlinear-optimisation package. https://github.com/stevengj/nlopt.

Lehmann, P., Assouline, S., Or, D. (2008) Characteristic lengths affecting evaporative drying of porous media. Physical Review E, 77, 056309, doi:10.1103/PhysRevE.77.056309.

Lehmann, P., Bickel, S., Wei, Z., Or, D. (2020) Physical Constraints for Improved Soil Hydraulic Parameter Estimation by Pedotransfer Functions. Water Resources Research 56, e2019WR025963, doi:10.1029/2019WR025963.

Stewart, W.E., Caracotsios, M. Sørensen, J.P. 1992. Parameter estimation from multiresponse data. AIChE Journal, 38, 641–650, doi:10.1002/aic.690380502.

Stewart, W.E. and Sørensen, J.P. (1981) Bayesian estimation of common parameters from multiresponse data with missing observations. Technometrics, 23, 131–141,
doi:10.1080/00401706.1981.10486255.

Van Genuchten, M. Th. (1980) A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Science Society of America Journal, 44, 892–898, doi:10.2136/sssaj1980.03615995004400050002x.

See Also

fit_wrc_hcc for (constrained) estimation of parameters of models for soil water retention and hydraulic conductivity data;

control_fit_wrc_hcc for options to control fit_wrc_hcc;

soilhypfitmethods for common S3 methods for class fit_wrc_hcc;

vcov for computing (co-)variances of the estimated nonlinear parameters;

prfloglik_sample for profile loglikelihood computations;

wc_model and hc_model for currently implemented models for soil water retention curves and hydraulic conductivity functions;

evaporative-length for physically constraining parameter estimates of soil hydraulic material functions.


[Package soilhypfit version 0.1-7 Index]