profile_loglikelihood {soilhypfit} | R Documentation |
Profile Loglikelihoods and Confidence Intervals for Parametric Modelling of Soil Hydraulic Properties
Description
The function prfloglik_sample
computes for a single soil sample a
loglikelihood profile as a function of the specified values for subsets
of model parameters of soil water retention and/or soil hydraulic
conductivity functions. The function confint_prfloglik_sample
computes a confidence interval of one model parameter based on the
likelihood ratio test for a single soil sample, and the S3 method
confint
computes confidence intervals of nonlinear
model parameters for multiple soil samples.
Usage
prfloglik_sample(object, values, soil_sample,
ncores = min(detectCores() - 1L, NROW(values)), verbose = 0)
confint_prfloglik_sample(object, parm = names(default_fit_param()),
soil_sample, level = 0.95, test = c("F", "Chisq"),
denominator_df = c("nonlinear", "all"), param_bound = NULL,
root_tol = .Machine$double.eps^0.25, froot_tol = sqrt(root_tol),
verbose = 0)
## S3 method for class 'fit_wrc_hcc'
confint(object,
parm = names(object[["control"]][["initial_param"]]), level = 0.95,
subset = NULL, type = c("loglik", "normal"), test = c("F", "Chisq"),
denominator_df = c("nonlinear", "all"),
root_tol = .Machine$double.eps^0.25, froot_tol = sqrt(root_tol),
ncores = detectCores() - 1L, verbose = 0, ...)
Arguments
object |
an object of class |
values |
a |
soil_sample |
a character scalar with the label of the soil sample
for which the loglikelihood profile or the confidence interval should be
computed. If |
ncores |
an integer defining the number of cores for parallel
computations. |
verbose |
positive integer controlling logging of diagnostic
messages to the console and plotting of data and model curves
during fitting, see |
parm |
character scalar ( |
level |
numeric scalar with the confidence level required to compute the confidence interval. |
test |
character keyword specifying whether to use the asymptotic
|
denominator_df |
character keyword specifying whether the
denominator degrees of freedom for the F-distribution of the test
statistic is equal to the number of estimated nonlinear parameters
( |
param_bound |
a numeric vector of length 2 with the allowed range of
the parameter for which the confidence interval is computed. The limits
of the confidence interval are searched within this range, see
Details. When equal to |
root_tol |
a numeric scalar defining the desired accuracy (convergence
tolerance) for root finding by |
froot_tol |
a numeric scalar defining the desired accuracy (function value tolerance) for deciding whether a root has been found. |
subset |
an integer, character or logical vector to the choose the
soil samples for which confidence intervals should be computed. Defaults
to |
type |
character keyword specifying whether to compute confidence
intervals based on the likelihood ratio test ( |
... |
additional arguments passed to methods, currently not used. |
Details
Computing likelihood profiles
The function prfloglik_sample
computes the loglikelihood profile
for a subset of the nonlinear (\boldsymbol{\nu}^\mathrm{}
) (and linear
\boldsymbol{\mu}^\mathrm{}
) model parameters. We denote the profiling
parameters of interest by \boldsymbol{\phi}
and the nuisance parameters that are estimated when computing the
profile by \boldsymbol{\psi}
, so that
(
\boldsymbol{\phi}^\mathrm{T},
\boldsymbol{\psi}^\mathrm{T}
)^\mathrm{T}
is a rearranged version of the parameter vector
(
\boldsymbol{\mu}^\mathrm{T},
\boldsymbol{\nu}^\mathrm{T}
)^\mathrm{T}
, see soilhypfitIntro
.
prfloglik_sample
computes the estimates
\widehat{\boldsymbol{\psi}}(\boldsymbol{\phi})
of \boldsymbol{\psi}
and the profile loglikelihood
Q(\boldsymbol{\phi},
\widehat{\boldsymbol{\psi}}(\boldsymbol{\phi});
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h}
)
as a function of \boldsymbol{\phi}
.
To compute the estimates,
\boldsymbol{\psi}
is partitioned into the nonlinear and conditionally linear parameters, see
soilhypfitIntro
for details.
prfloglik_sample
uses the packages parallel and
snowfall for parallelized computation of loglikelihood
profiles.
Computing likelihood based confidence intervals
The function confint_prfloglik_sample
computes the
confidence interval of a single model parameter
\phi \in (
\boldsymbol{\mu}^\mathrm{T},
\boldsymbol{\nu}^\mathrm{T}
)^\mathrm{T}
based on the likelihood ratio test.
The interval is computed by assuming either
that the test statistic
T(\phi) = \Delta Q(\phi) = 2( Q(\widehat{\phi}, \widehat{\boldsymbol{\psi}}; \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h} ) - Q(\phi, \widehat{\boldsymbol{\psi}}(\phi); \boldsymbol{\theta}, \boldsymbol{K}, \boldsymbol{h} ))
follows a\chi^2_1
-distribution with 1 degrees of freedom (possible choice when both water retention and/or hydraulic conductivity data were used to estimate the parameters), orthat the transformed test statistic
T(\phi) = \left(\exp(\Delta Q(\phi) / n) - 1 \right) (n - p)
follows anF(1,n-p)
-distribution wheren
is the number of water content or hydraulic conductivity measurements, andp
is the number of estimated parameters (see Uusipaikka, 2008, p. 115, for details).denominator_df
allows to control howp
is chosen. Note that this test distribution can only be chosen (and is then the default) when only water retention or only hydraulic conductivty data were used to estimate the parameters.
confint_prfloglik_sample
computes profile loglikelihoods
Q(\phi,
\widehat{\boldsymbol{\psi}}(\phi);
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h}
)
by
prfloglik_sample
and then uses uniroot
to search the roots
of the equation
f(\phi) = q_T(\gamma)- T(\phi)
in the interval defined by param_bound
. q_T(\gamma)
is
the \gamma
-quantile of the chosen test distribution for T
.
Computing confidence intervals for several soil samples
The confint
method computes 2 types of confidence intervals (in
dependence of type
) of only the nonlinear parameters
\boldsymbol{\nu}^\mathrm{}
, possibly for
multiple soil samples at the same time:
intervals based on the asymptotic normal distribution of maximum likelihood estimates with standard errors computed by the
vcov
method for classfit_wrc_hcc
(seevcov.fit_wrc_hcc
,intervals based on the likelihood ratio test by
confint_prfloglik_sample
. The intervals for several soil samples are computed in parallel.
Requirements for computing loglikelihood profiles
The parameters contained in object
must be estimated by maximum
likelihood (method = "ml"
, see soilhypfitIntro
and
control_fit_wrc_hcc
. Use of other estimation methods
results an error.
Preferably an unconstrained local algorithm (settings =
"ulocal"
, see soilhypfitIntro
and
control_fit_wrc_hcc
)) is used for minimizing the negative
loglikelihood when generating object
. Use of other algorithms
generates a warning message.
Value
The function prfloglik_sample
returns a data.frame
with the columns of values
(parameters of interest
\boldsymbol{\phi}
),
a column loglik
with the
maximized profile loglikelihood
Q(\phi,
\widehat{\boldsymbol{\psi}}(\phi);
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h}
)
, columns with the estimated parameters
\widehat{\boldsymbol{\psi}}(\boldsymbol{\phi})
,
columns with the gradient of the loglikelihood with respect to the
estimated nonlinear parameters (missing if all nonlinear parameters were
fixed) and a column converged
, indicating whether convergence has
occurred (see convergence_message
) when estimating the
nonlinear parameters (equal to NA
when all nonlinear parameters
are fixed).
The function confint_prfloglik_sample
returns a numeric
vector of length 2 with the lower and upper limits of the confidence
interval. If no roots were found then NA
is returned. The
returned result further contains as attribute list prfloglik
the parameter estimate \widehat{\phi}
(param_estimate
), the maximized loglikelihood
Q(\widehat{\phi},
\widehat{\boldsymbol{\psi}};
\boldsymbol{\theta}, \boldsymbol{K},
\boldsymbol{h}
)
(loglik
), the quantile of the test distribution
q_{\mathrm{test}}(\gamma)
(qtest
),
the type of test distribution used (test
),
the significance level
, the number
of water content (nobs_wrc
) and conductivity measurements
(nobs_hcc
) and the function values of f(\phi)
evaluated at the roots returned by uniroot
.
The method confint
returns a dataframe with the lower and upper
limits of the confidence intervals for the estimated nonlinear
parameters.
Author(s)
Andreas Papritz papritz@retired.ethz.ch.
References
Uusipaikka, E. (2008) Confidence Intervals in Generalized Regression Models. Chapman & Hall/CRC Press, Boca Raton doi:10.1201/9781420060386.
See Also
soilhypfitIntro
for a description of the models and a brief
summary of the parameter estimation approach;
fit_wrc_hcc
for (constrained) estimation of parameters of
models for soil water retention and hydraulic conductivity data;
control_fit_wrc_hcc
for options to control
fit_wrc_hcc
;
soilhypfitmethods
for common S3 methods for class
fit_wrc_hcc
;
vcov
for computing (co-)variances of the estimated
nonlinear parameters;
wc_model
and hc_model
for currently
implemented models for soil water retention curves and hydraulic
conductivity functions;
evaporative-length
for physically constraining parameter
estimates of soil hydraulic material functions.
Examples
# use of \donttest{} because execution time exceeds 5 seconds
library(lattice)
data(sim_wrc_hcc)
# define number of cores for parallel computations
if(interactive()) ncpu <- parallel::detectCores() - 1L else ncpu <- 1L
# estimate parameters for 3 samples simultaneously by ...
# ... unconstrained, global optimisation algorithm NLOPT_GN_MLSL (default)
rfit_uglob <- fit_wrc_hcc(
wrc_formula = wc ~ head | id, hcc_formula = hc ~ head | id,
data = sim_wrc_hcc,
control = control_fit_wrc_hcc(param_bound = param_boundf(
alpha = c(0.00001, 50), n = c(1.0001, 7), tau = c(-1, 5)
), pcmp = control_pcmp(ncores = ncpu)))
# ... unconstrained, local optimisation algorithm NLOPT_LN_BOBYQA,
rfit_uloc <- update(
rfit_uglob,
param = as.matrix(coef(rfit_uglob, what = "nonlinear")),
control = control_fit_wrc_hcc(
settings = "ulocal", param_tf = param_transf(
alpha = "identity", n = "identity", tau = "identity"
), param_bound = param_boundf(
alpha = c(0.00001, 50), n = c(1.0001, 7), tau = c(-1, 5)
), pcmp = control_pcmp(ncores = ncpu)))
# extract estimated parameters for sample id == "1"
coef_id_1 <- unlist(coef(rfit_uloc, gof = TRUE, se = TRUE, subset = "1"))
# compute loglikelihood profile of parameter alpha for sample id == "1"
rprfloglik_alpha <- prfloglik_sample(
rfit_uloc, values = data.frame(alpha = seq(1.5, 3.0, length = 40L)),
soil_sample = "1", ncores = ncpu)
# plot loglikelihood profile along with 95% confidence intervals
plot(loglik ~ alpha, rprfloglik_alpha, type = "l")
abline(v = coef_id_1["alpha"])
# 95% confidence intervall based on likelihood ratio test
abline(h = -coef_id_1["obj"] - 0.5 * qchisq(0.95, 1), lty = "dashed")
# 95% confidence intervall based on asymptotic normal distribution
segments(
x0 = coef_id_1["alpha"] + qnorm(0.025) * coef_id_1["se.alpha"],
x1 = coef_id_1["alpha"] + qnorm(0.975) * coef_id_1["se.alpha"],
y0 = min(rprfloglik_alpha$loglik)
)
# compute loglikelihood profile of parameter n for sample id == "1"
rprfloglik_n <- prfloglik_sample(
rfit_uloc, values = data.frame(n = seq(1.7, 2.25, length = 40L)),
soil_sample = "1", ncores = ncpu
)
# plot loglikelihood profile along with 95% confidence intervals
plot(loglik ~ n, rprfloglik_n, type = "l")
abline(v = coef_id_1["n"])
# 95% confidence intervall based on likelihood ratio test
abline(h = -coef_id_1["obj"] - 0.5 * qchisq(0.95, 1), lty = "dashed")
# 95% confidence intervall based on asymptotic normal distribution
segments(
x0 = coef_id_1["n"] + qnorm(0.025) * coef_id_1["se.n"],
x1 = coef_id_1["n"] + qnorm(0.975) * coef_id_1["se.n"],
y0 = min(rprfloglik_n$loglik)
)
# compute loglikelihood profile of parameters alpha and n for sample id == "1"
rprfloglik_alpha_n <- prfloglik_sample(
rfit_uloc, values = expand.grid(
alpha = seq(1.5, 3.0, length = 40L), n = seq(1.7, 2.25, length = 40L)),
soil_sample = "1", ncores = ncpu
)
# joint confidence region of alpha and n based on likelihood ratio test
levelplot(loglik ~ alpha + n, rprfloglik_alpha_n,
panel = function(x, y, z, subscripts, ...){
panel.levelplot(x, y, z, subscripts, ...)
panel.levelplot(x, y, z, subscripts, region = FALSE, contour = TRUE,
at = -coef_id_1["obj"] - 0.5 * qchisq(0.95, 2),
lty = "solid"
)
panel.levelplot(x, y, z, subscripts, region = FALSE, contour = TRUE,
at = -coef_id_1["obj"] - 0.5 * qchisq(0.9, 2),
lty = "dashed"
)
panel.levelplot(x, y, z, subscripts, region = FALSE, contour = TRUE,
at = -coef_id_1["obj"] - 0.5 * qchisq(0.8, 2),
lty = "dotted"
)
panel.points(coef_id_1["alpha"], coef_id_1["n"], pch = 16)
panel.lines(
x = rprfloglik_alpha[, c("alpha", "n")], col = "blue"
)
panel.lines(
x = rprfloglik_n[, c("alpha", "n")], col = "magenta"
)
}, key = list(
corner = c(1, 0.97),
lines = list(
lty = c(rep("solid", 3), "dashed", "dotted"),
col = c("blue", "magenta", rep("black", 3))
),
text = list(c(
"estimate of n as a function of fixed alpha",
"estimate of alpha as a function of fixed n",
"95% joint confidence region",
"90% joint confidence region",
"80% joint confidence region"
))
))
# compute 95%-confidence interval
(ci.alpha <- confint_prfloglik_sample(
rfit_uloc, parm = "alpha", soil_sample = "1"
))
# use limits to draw loglikelihood profile for alpha
rprfloglik_alpha <- prfloglik_sample(
rfit_uloc, values = data.frame(
alpha = seq(0.9 * ci.alpha[1], 1.1 * ci.alpha[2], length = 40L)),
soil_sample = "1", ncores = ncpu)
plot(loglik ~ alpha, rprfloglik_alpha, type = "l")
lines(
ci.alpha,
rep(diff(unlist(attr(ci.alpha, "prfloglik")[c("qtest", "loglik")])) , 2)
)
abline(v = attr(ci.alpha, "prfloglik")["estimate"], lty = "dashed")
# 95%-confidence intervals of all nonlinear parameters based for all
# soil samples asymptotic normal distribution of maximum likelihood estimates
confint(rfit_uloc, type = "normal")
# 95%-confidence intervals of all nonlinear parameters based for all
# soil samples based on likelihood ratio test
confint(rfit_uloc, ncores = ncpu)