soilhypfit-package {soilhypfit} | R Documentation |
The soilhypfit Package
Description
This is a summary of the features and functionality of soilhypfit, a package in R for parameteric modelling of soil water retention and hydraulic conductivity data.
Details
Estimation approach
The main function, fit_wrc_hcc
, estimates parameters of
models for soil water retention and hydraulic conductivity by
maximum likelihood (ml, default), maximum posterior
density (mpd) estimation (Stewart et al., 1992) or
nonlinear weighted least squares (wls) from data on volumetric
soil water content, \boldsymbol{\theta}^\mathrm{T} =(\theta_1, \theta_2,
..., \theta_{n_\theta})
, and/or soil hydraulic conductivity,
\boldsymbol{K}^\mathrm{T} =(K_1, K_2, ..., K_{n_K})
, both measured at given
capillary pressure head, \boldsymbol{h}^\mathrm{T} =(h_1, h_2, ...)
.
For mpd and ml estimation, the models for the measurements are
\theta_i = \theta(h_i;\boldsymbol{\mu}, \boldsymbol{\nu})
+ \varepsilon_{\theta,i}, \ \ i=1, 2, ..., n_\theta,
\log(K_j) = \log(K(h_j;\boldsymbol{\mu}, \boldsymbol{\nu}))
+ \varepsilon_{K,j}, \ \ j=1, 2, ..., n_K,
where
\theta(h_i; \boldsymbol{\mu}, \boldsymbol{\nu})
and
K(h_j; \boldsymbol{\mu}, \boldsymbol{\nu})
denote modelled water content and conductivity,
\boldsymbol{\mu}^\mathrm{}
and \boldsymbol{\nu}^\mathrm{}
are the
conditionally linear and nonlinear
model parameters (see below and Bates and Watts, 1988, sec. 3.3.5),
and
\varepsilon_{\theta,i}
and
\varepsilon_{K,j}
are independent,
normally distributed errors
with zero means and variances \sigma^2_\theta
and \sigma^2_K
, respectively.
Let
Q_{\theta}(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{\theta},
\boldsymbol{h}) =\left(
\boldsymbol{\theta} -
\boldsymbol{\theta}(
\boldsymbol{h};
\boldsymbol{\mu}, \boldsymbol{\nu}
)
\right)^\mathrm{T}
\boldsymbol{W}_\theta
\left(
\boldsymbol{\theta} -
\boldsymbol{\theta}(
\boldsymbol{h};
\boldsymbol{\mu}, \boldsymbol{\nu}
)
\right),
and
Q_{K}(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{K}, \boldsymbol{h}) = \left(
\log(\boldsymbol{K}) -
\log(\boldsymbol{K}(
\boldsymbol{h};
\boldsymbol{\mu}, \boldsymbol{\nu}
))
\right)^\mathrm{T}
\boldsymbol{W}_K
\left(
\log(\boldsymbol{K}) -
\log(\boldsymbol{K}(
\boldsymbol{h};
\boldsymbol{\mu}, \boldsymbol{\nu}
))
\right),
denote the (possibly weighted) residual sums of squares between
measurements and modelled values.
\boldsymbol{\theta}(
\boldsymbol{h};
\boldsymbol{\mu}, \boldsymbol{\nu}
) )
and
\boldsymbol{K}(
\boldsymbol{h};
\boldsymbol{\mu}, \boldsymbol{\nu}
)
are vectors with modelled values of water content and hydraulic
conductivity, and
\boldsymbol{W}_\theta
and
\boldsymbol{W}_K
are optional diagonal matrices
with weights w_{\theta,i}
and w_{K,j}
, respectively.
The weights are products of case weights w^\prime_{\theta,i}
and w^\prime_{K,j}
and variable weights w_\theta
, w_K
, hence
w_{\theta,i} = w_\theta\, w^\prime_{\theta,i}
and
w_{K,j} = w_K\, w^\prime_{K,j}
.
The objective function for ml and mpd estimation is equal to (Stewart et al., 1992, eqs 14 and 15, respectively)
Q(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h}
) = \frac{\kappa_\theta}{2}
\log( Q_{\theta}(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{h})) +
\frac{\kappa_K}{2}
\log( Q_{K}(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{K}, \boldsymbol{h})),
where \kappa_v = n_v + 2
for mpd and \kappa_v = n_v
for ml
estimation, v \in (\theta, K)
,
and weights w_{\theta,i} = w_{K,j} = 1
.
Note that
Q(\boldsymbol{\mu},
\boldsymbol{\nu}; \boldsymbol{\theta},
\boldsymbol{K}, \boldsymbol{h})
is equal to the negative logarithm of the
concentrated loglikelihood or the concentrated posterior density,
obtained by replacing
\sigma^2_\theta
and \sigma^2_K
by their conditional maximum
likelihood or maximum density estimates
\widehat{\sigma}^2_\theta(\boldsymbol{\mu}),
\boldsymbol{\nu})
and
\widehat{\sigma}^2_K(\boldsymbol{\mu})
respectively (Stewart et al., 1992, p. 642).
For wls the objective function is equal to
Q(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h}
) = Q_{\theta}(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{h}) +
Q_{K}(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{K}, \boldsymbol{h}).
If either water content or conductivity data are not available, then the respective
terms are omitted from
Q(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h}
)
.
The function fit_wrc_hcc
does not attempt to estimate the parameters
by minimising
Q(\boldsymbol{\mu}, \boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h})
directly with respect to
\boldsymbol{\mu}^\mathrm{}
and \boldsymbol{\nu}^\mathrm{}
.
Rather, it exploits the fact that for given nonlinear parameters
\boldsymbol{\nu}^\mathrm{}
, the conditionally linear parameters
\boldsymbol{\mu}^T = (\theta_r, \theta_s, \log(K_0))
can be estimated straightforwardly
by minimising the conditional residual sums of squares
Q_\theta^*(\theta_r, \theta_s;
\boldsymbol{\theta},
\boldsymbol{h}, \boldsymbol{\nu})
=\left(
\boldsymbol{\theta} - [
\boldsymbol{1},
\boldsymbol{S}(
\boldsymbol{h};
\boldsymbol{\nu}
)] \, \left[
\begin{array}{c} \theta_r \\ \theta_s - \theta_r \end{array}
\right]
\right)^\mathrm{T}
\boldsymbol{W}_\theta
\left(
\boldsymbol{\theta} - [
\boldsymbol{1},
\boldsymbol{S}(
\boldsymbol{h};
\boldsymbol{\nu}
)] \, \left[
\begin{array}{c} \theta_r \\ \theta_s - \theta_r \end{array}
\right]
\right)
with respect to \theta_r
and \theta_s-\theta_r
and/or
Q_K^*(K_0; \boldsymbol{K},
\boldsymbol{h}, \boldsymbol{\nu})=
\left(
\log(\boldsymbol{K}) - \log(K_0 \,
\boldsymbol{k}(
\boldsymbol{h};
\boldsymbol{\nu}
))
\right)^\mathrm{T}
\boldsymbol{W}_K
\left(
\log(\boldsymbol{K}) - \log(K_0 \,
\boldsymbol{k}(
\boldsymbol{h};
\boldsymbol{\nu}
))
\right),
with respect to \log(K_0)
, where
\boldsymbol{1}^\mathrm{}
is a vector of ones,
\boldsymbol{S}(
\boldsymbol{h};
\boldsymbol{\nu}
)^\mathrm{T} = (
S(h_1; \boldsymbol{\nu}),
...,
S(h_{n_\theta}; \boldsymbol{\nu})
)
and
\boldsymbol{k}(
\boldsymbol{h};
\boldsymbol{\nu}
)^\mathrm{T} = (
k(h_1; \boldsymbol{\nu}),
...,
k(h_{n_K}; \boldsymbol{\nu})
)
are vectors of modelled water saturation
and modelled relative conductivity values,
\theta_r
and \theta_s
are the residual
and saturated water content, and
K_0
is the saturated hydraulic conductivity.
Unconstrained conditional estimates, say
\widehat{\theta}_r(\boldsymbol{\nu})
,
\widehat{\theta}_s(\boldsymbol{\nu}) -
\widehat{\theta}_r(\boldsymbol{\nu})
and
\widehat{\log(K_0)}(\boldsymbol{\nu})
can be
easily obtained from the normal equations of the respective (weighted)
least squares problems, and quadratic programming yields conditional (weighted)
least squares estimates that honour the inequality constraints 0
\le \theta_r \le \theta_s \le 1
.
Let
\widehat{\boldsymbol{\mu}}(\boldsymbol{\nu})^\mathrm{T} =
(
\widehat{\theta}_r(\boldsymbol{\nu}),
\widehat{\theta}_s(\boldsymbol{\nu}),
\widehat{\log(K_0)}(\boldsymbol{\nu})
)
be the conditional estimates of the linear parameters
obtained by minimising
Q_\theta^*(\theta_r, \theta_s;
\boldsymbol{\theta},
\boldsymbol{h}, \boldsymbol{\nu})
,
and
Q_K^*(K_0;
\boldsymbol{K},
\boldsymbol{h}, \boldsymbol{\nu})
, respectively.
fit_wrc_hcc
then estimates the
nonlinear parameters by minimising
Q(
\widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}),
\boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h})
with respect to \boldsymbol{\nu}^\mathrm{}
by a nonlinear optimisation algorithm.
For mpd and ml estimation the variances of the model errors are estimated by (Stewart et al., 1992, eq. 16)
\widehat{\sigma}_\theta^2 = \frac{Q_{\theta}(
\widehat{\boldsymbol{\mu}}(\widehat{\boldsymbol{\nu}}),
\widehat{\boldsymbol{\nu}};
\boldsymbol{\theta},
\boldsymbol{h})}{\kappa_\theta},
and
\widehat{\sigma}_K^2 = \frac{Q_{K}(
\widehat{\boldsymbol{\mu}}(\widehat{\boldsymbol{\nu}}),
\widehat{\boldsymbol{\nu}};
\boldsymbol{K},
\boldsymbol{h})}{\kappa_K}.
Furthermore, for mpd and ml estimation, the covariance matrix of the estimated
nonlinear parameters may be approximated by the inverse Hessian matrix
of
Q(\widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}),
\boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h})
at the solution
\widehat{\boldsymbol{\nu}}
(Stewart and Sørensen, 1981), i.e.
\mathrm{Cov}[
\widehat{\boldsymbol{\nu}},
\widehat{\boldsymbol{\nu}}^T] \approx
\boldsymbol{A}^{-1},
where
[\boldsymbol{A}]_{kl} =
\frac{\partial^2}{\partial\nu_k\, \partial\nu_l} \left.
Q(\widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}),
\boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h})\right|_{\nu=\hat{\nu}}.
Details on parameter estimation
Models for water retention curves and hydraulic conductivity functions
Currently, fit_wrc_hcc
allows to estimate the parameters of the
simplified form of the Van Genuchten-Mualem (VGM) model (Van
Genuchten, 1980) with the restriction m = 1 - \frac{1}{n}
, see wc_model
and hc_model
.
This model has the following parameters:
-
\boldsymbol{\mu}^\mathrm{T} = (\theta_r, \theta_s, K_0)
(see above) and -
\boldsymbol{\nu}^\mathrm{T} = (\alpha, n, \tau)
where\alpha
is the inverse air entry pressure,n
the shape and\tau
the tortuosity parameter.
Any of these parameters can be either estimated from data or kept
fixed at the specified initial values (see arguments param
and
fit_param
of fit_wrc_hcc
).
Imposing physical constraints on the estimated parameters
Parameters of models for the water retention curve and the hydraulic
conductivity function may vary only within certain bounds (see
wc_model
, hc_model
and
param_boundf
for allowed ranges).
fit_wrc_hcc
either estimates transformed
parameters that vary over the whole real line and can therefore be
estimated without constraints (see param_transf
), or it
uses algorithms (quadratic programming for estimating
\boldsymbol{\mu}^\mathrm{}
, nonlinear optimisation algorithms with box
constraints for estimating \boldsymbol{\nu}^\mathrm{}
) that restrict estimates
to permissible ranges, see Details section of
control_fit_wrc_hcc
.
In addition, for natural soils, the parameters of the VGM model
cannot vary independently from each other over the allowed ranges.
Sets of fitted parameters should always result in soil hydraulic
quantities that are physically meaningful. One of these quantities
is the characteristic length L_\mathrm{c}
of
stage-I evaporation from a soil (Lehmann et al., 2008).
L_\mathrm{c}
can be related to the parameters of the VGM
model, see Lehmann et al. (2008, 2020) and
evaporative-length
.
Using several soil hydrological databases, Lehmann et al. (2020)
analysed the mutual dependence of VGM parameters and proposed
regression equations to relate the inverse air entry pressure
\alpha
and the saturated hydraulic K_0
to the shape
parameter n
, which characterises the width of the pore size
distribution of a soil. Using these relations, Lehmann et al.
(2020) then computed the expected value (“target”)
L_\mathrm{t}
of L_\mathrm{c}
for given n
and tortuosity parameter \tau
, see
evaporative-length
. fit_wrc_hcc
allows
to constrain estimates of the nonlinear parameters \boldsymbol{\nu}^\mathrm{}
by defining permissible lower and upper bounds for the ratio
L_\mathrm{c}/L_\mathrm{t}
, see arguments
ratio_lc_lt_bound
of fit_wrc_hcc
and
settings
of control_fit_wrc_hcc
.
Choice of optimisation algorithm for estimating the nonlinear parameters
To estimate \boldsymbol{\nu}^\mathrm{}
, fit_wrc_hcc
minimises
Q(
\widehat{\boldsymbol{\mu}}(\boldsymbol{\nu}),
\boldsymbol{\nu};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h})
either by a nonlinear optimisation algorithm available in the library
NLopt (Johnson, see nloptr
) or by the Shuffled
Complex Evolution (SCE) optimisation algorithm (Duan et al., 1994,
see SCEoptim
). The choice of the algorithm
is controlled by the argument settings
of the function
control_fit_wrc_hcc
:
global optimisation without constraints for the ratio
L_\mathrm{c}/L_\mathrm{t}
(settings = "uglobal"
orsettings = "sce"
),global optimisation with inequality constraints for the ratio
L_\mathrm{c}/L_\mathrm{t}
(settings = "cglobal"
),local optimisation without constraints for the ratio
L_\mathrm{c}/L_\mathrm{t}
(settings = "ulocal"
),local optimisation with inequality constraints for the ratio
L_\mathrm{c}/L_\mathrm{t}
(settings = "clocal"
).
The settings
argument also sets reasonable default values
for the termination (= convergence) criteria for the various
algorithms, see
NLopt
documentation, section Termination conditions. The NLopt
documentation contains a very useful discussion of
(constrained)
optimisation problems in general,
global
vs. local optimisation and
gradient-based
vs. derivative-free algorithms.
Note that besides the settings
argument of
control_fit_wrc_hcc
, the arguments nloptr
and
sce
along with the functions control_nloptr
and
control_sce
allow to fully control the nonlinear
optimisation algorithms, see control_fit_wrc_hcc
for
details.
Computing initial values of parameters
For local optimisation algorithms “good” initial values of
\boldsymbol{\nu}^\mathrm{}
are indispensable for successful estimation.
fit_wrc_hcc
allows to compute initial values of
\alpha
and n
for the Van Genuchten model from water
retention data by the following procedure:
Smooth the water retention data,
(\theta_i, y_i= \log(h_i)), i=1,2, ... n_\theta,
by an additive model.Determine the saturation,
S^*
, and the logarithm of capillary pressure head,y^* = \log(h^*)
, at the inflection point of the additive model fit.Find the root, say
\widehat{m}
, ofS^* = (1 + 1/m)^{-m}
. One obtains the right-hand side of this equation by solving\frac{\partial^2}{\partial y^2} \left[ S_\mathrm{VG}(\exp(y); \boldsymbol{\nu}) \right] = 0
fory
and plugging the result into the expression forS_\mathrm{VG}(\exp(y); \boldsymbol{\nu}),
seewc_model
.Compute
\widehat{n} = 1 / (1 - \widehat{m})
and\widehat{\alpha} = 1 / \exp(y^*) \,(1/\widehat{m})^{1-\widehat{m}}.
The second expression is again a result of solving\frac{\partial^2}{\partial y^2} \left[ S_\mathrm{VG}(\exp(y); \boldsymbol{\nu}) \right] = 0.
Initial values for local optimisation algorithms can of course also
be obtained by first estimating the parameters by a global
algorithm. These estimates can be “refined” in a second
step by a local unconstrained algorithm, possibly
followed by a third call of fit_wrc_hcc
to constrain
the estimated parameters by the ratio
L_\mathrm{c}/L_\mathrm{t}
. The method
coef.fit_wrc_hcc
can be used to extract the estimated
parameters from an object of class fit_wrc_hcc
and to pass
them as initial values to fit_wrc_hcc
, see
fit_wrc_hcc
for examples.
Author(s)
Andreas Papritz papritz@retired.ethz.ch.
References
Bates, D. M., and Watts, D. G. (1988) Nonlinear Regression Analysis and its Applications. John Wiley & Sons, New York doi:10.1002/9780470316757.
Duan, Q., Sorooshian, S., and Gupta, V. K. (1994) Optimal use of the SCE-UA global optimisation method for calibrating watershed models, Journal of Hydrology 158, 265–284, doi:10.1016/0022-1694(94)90057-4.
Johnson, S.G. The NLopt nonlinear-optimisation package. https://github.com/stevengj/nlopt.
Lehmann, P., Assouline, S., Or, D. (2008) Characteristic lengths affecting evaporative drying of porous media. Physical Review E, 77, 056309, doi:10.1103/PhysRevE.77.056309.
Lehmann, P., Bickel, S., Wei, Z., Or, D. (2020) Physical Constraints for Improved Soil Hydraulic Parameter Estimation by Pedotransfer Functions. Water Resources Research 56, e2019WR025963, doi:10.1029/2019WR025963.
Stewart, W.E., Caracotsios, M. Sørensen, J.P. 1992. Parameter estimation from multiresponse data. AIChE Journal, 38, 641–650, doi:10.1002/aic.690380502.
Stewart, W.E. and Sørensen, J.P. (1981)
Bayesian estimation of common
parameters from multiresponse data with missing observations.
Technometrics, 23, 131–141,
doi:10.1080/00401706.1981.10486255.
Van Genuchten, M. Th. (1980) A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Science Society of America Journal, 44, 892–898, doi:10.2136/sssaj1980.03615995004400050002x.
See Also
fit_wrc_hcc
for (constrained) estimation of parameters of
models for soil water retention and hydraulic conductivity data;
control_fit_wrc_hcc
for options to control
fit_wrc_hcc
;
soilhypfitmethods
for common S3 methods for class
fit_wrc_hcc
;
vcov
for computing (co-)variances of the estimated
nonlinear parameters;
prfloglik_sample
for profile loglikelihood
computations;
wc_model
and hc_model
for currently
implemented models for soil water retention curves and hydraulic
conductivity functions;
evaporative-length
for physically constraining parameter
estimates of soil hydraulic material functions.