R: Fit non-linear growth curves

drda {drda}

R Documentation

Fit non-linear growth curves

Description

Use Newton's method with a trust region to fit non-linear growth curves to observed data.

Usage

drda(
  formula,
  data,
  subset,
  weights,
  na.action,
  mean_function = "logistic4",
  lower_bound = NULL,
  upper_bound = NULL,
  start = NULL,
  max_iter = 1000
)

Arguments

`formula`	an object of class `formula` (or one that can be coerced to that class): a symbolic description of the model to be fitted. Currently supports only formulas of the type `y ~ x`.
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `drda` is called.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`weights`	an optional vector of weights to be used in the fitting process. If provided, weighted least squares is used with weights `weights` (that is, minimizing `sum(weights * residuals^2)`), otherwise ordinary least squares is used.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The 'factory-fresh' default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`mean_function`	the model to be fitted. See the Details section for the available models.
`lower_bound`	numeric vector with the minimum admissible values of the parameters. Use `-Inf` to specify an unbounded parameter.
`upper_bound`	numeric vector with the maximum admissible values of the parameters. Use `Inf` to specify an unbounded parameter.
`start`	starting values for the parameters.
`max_iter`	maximum number of iterations in the optimization algorithm.
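
The objective described for the `weights` argument can be written out directly. The following is a minimal sketch in base R; `weighted_rss` is a hypothetical helper introduced only for illustration and is not part of drda.

```r
# Hypothetical helper (not part of drda): weighted residual sum of squares,
# i.e. sum(weights * residuals^2) as described for the `weights` argument.
weighted_rss <- function(y, fitted, weights = rep(1, length(y))) {
  sum(weights * (y - fitted)^2)
}

y <- c(1.0, 0.8, 0.3)
fitted <- c(0.9, 0.7, 0.4)

# With unit weights this reduces to the ordinary residual sum of squares.
rss_ols <- weighted_rss(y, fitted)

# Doubling the weight of the first observation doubles its contribution.
rss_wls <- weighted_rss(y, fitted, weights = c(2, 1, 1))
```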

Details

Available models

Generalized (5-parameter) logistic function

The 5-parameter logistic function can be selected by choosing mean_function = "logistic5" or mean_function = "l5". The function is defined here as

alpha + delta / (1 + nu * exp(-eta * (x - phi)))^(1 / nu)

where eta > 0 and nu > 0. When delta is positive (negative) the curve is monotonically increasing (decreasing).

Parameter alpha is the value of the function when x -> -Inf. Parameter delta is the (signed) height of the curve. Parameter eta represents the steepness (growth rate) of the curve. Parameter phi is related to the mid-value of the function. Parameter nu affects near which asymptote maximum growth occurs.

The value of the function when x -> Inf is alpha + delta. In dose-response studies delta can be interpreted as the maximum theoretical achievable effect.
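
As an illustration, the formula above can be transcribed into base R to check the two asymptotes numerically. This is only a sketch: `logistic5` is a hypothetical helper, not drda's internal implementation, and the parameter values are arbitrary.

```r
# Hypothetical helper that transcribes the 5-parameter logistic formula.
logistic5 <- function(x, alpha, delta, eta, phi, nu) {
  alpha + delta / (1 + nu * exp(-eta * (x - phi)))^(1 / nu)
}

# Arbitrary illustrative parameter values.
theta <- list(alpha = 0.2, delta = 0.7, eta = 1.5, phi = 0, nu = 0.5)
x <- seq(-10, 10, length.out = 201)
y <- do.call(logistic5, c(list(x = x), theta))

# The curve moves from roughly alpha to roughly alpha + delta.
range(y)

# With delta > 0 the curve is monotonically increasing.
all(diff(y) > 0)
```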

4-parameter logistic function

The 4-parameter logistic function is the default model of drda. It can be explicitly selected by choosing mean_function = "logistic4" or mean_function = "l4". The function is obtained by setting nu = 1 in the generalized logistic function, that is

alpha + delta / (1 + exp(-eta * (x - phi)))

where eta > 0. When delta is positive (negative) the curve is monotonically increasing (decreasing).

Parameter alpha is the value of the function when x -> -Inf. Parameter delta is the (signed) height of the curve. Parameter eta represents the steepness (growth rate) of the curve. Parameter phi represents the x value at which the curve is equal to its mid-point, i.e. f(phi; alpha, delta, eta, phi) = alpha + delta / 2.

The value of the function when x -> Inf is alpha + delta. In dose-response studies delta can be interpreted as the maximum theoretical achievable effect.
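
The mid-point property of phi can be checked with a short base R sketch. `logistic4` is a hypothetical helper written for this illustration, with arbitrary parameter values; a negative delta is used to show a decreasing curve.

```r
# Hypothetical helper that transcribes the 4-parameter logistic formula.
logistic4 <- function(x, alpha, delta, eta, phi) {
  alpha + delta / (1 + exp(-eta * (x - phi)))
}

alpha <- 1
delta <- -0.8
eta <- 2
phi <- 0.5

# At x = phi the curve equals alpha + delta / 2.
mid <- logistic4(phi, alpha, delta, eta, phi)

# With delta < 0 the curve is monotonically decreasing.
x <- seq(-5, 5, by = 0.25)
is_decreasing <- all(diff(logistic4(x, alpha, delta, eta, phi)) < 0)
```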

2-parameter logistic function

The 2-parameter logistic function can be selected by choosing mean_function = "logistic2" or mean_function = "l2". For a monotonically increasing curve set nu = 1, alpha = 0, and delta = 1:

1 / (1 + exp(-eta * (x - phi)))

For a monotonically decreasing curve set nu = 1, alpha = 1, and delta = -1:

1 - 1 / (1 + exp(-eta * (x - phi)))

where eta > 0. The lower bound of the curve is zero while the upper bound of the curve is one.

Parameter eta represents the steepness (growth rate) of the curve. Parameter phi represents the x value at which the curve is equal to its mid-point, i.e. f(phi; eta, phi) = 1 / 2.
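
The increasing and decreasing forms are mirror images of each other, which a small base R sketch makes explicit. The helper names `logistic2_up` and `logistic2_down` are hypothetical and chosen only for this example.

```r
# Hypothetical helpers for the two forms of the 2-parameter logistic curve.
logistic2_up <- function(x, eta, phi) 1 / (1 + exp(-eta * (x - phi)))
logistic2_down <- function(x, eta, phi) 1 - logistic2_up(x, eta, phi)

eta <- 3
phi <- -1
x <- seq(-4, 2, by = 0.5)

up <- logistic2_up(x, eta, phi)
down <- logistic2_down(x, eta, phi)

# The two curves always sum to one and both stay between zero and one.
sums_to_one <- isTRUE(all.equal(up + down, rep(1, length(x))))

# Both curves cross 1 / 2 at x = phi.
mid_up <- logistic2_up(phi, eta, phi)
mid_down <- logistic2_down(phi, eta, phi)
```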

Gompertz function

The Gompertz function is the limit for nu -> 0 of the 5-parameter logistic function. It can be selected by choosing mean_function = "gompertz" or mean_function = "gz". The function is defined in this package as

alpha + delta * exp(-exp(-eta * (x - phi)))

where eta > 0.

The value of the function when x -> Inf is alpha + delta. In dose-response studies delta can be interpreted as the maximum theoretical achievable effect.

The mid-point of the function, that is alpha + delta / 2, is achieved at x = phi - log(log(2)) / eta.
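
Both the mid-point formula and the limiting relationship with the 5-parameter logistic function can be checked numerically. The sketch below uses hypothetical helpers `gompertz` and `logistic5` that simply transcribe the formulas given in this section; the parameter values and the choice nu = 1e-6 are arbitrary.

```r
# Hypothetical helpers that transcribe the formulas in this section.
gompertz <- function(x, alpha, delta, eta, phi) {
  alpha + delta * exp(-exp(-eta * (x - phi)))
}
logistic5 <- function(x, alpha, delta, eta, phi, nu) {
  alpha + delta / (1 + nu * exp(-eta * (x - phi)))^(1 / nu)
}

alpha <- 0.1
delta <- 0.9
eta <- 1.2
phi <- 2

# Mid-point: the curve equals alpha + delta / 2 at this x value.
x_mid <- phi - log(log(2)) / eta
y_mid <- gompertz(x_mid, alpha, delta, eta, phi)

# Limit: a 5-parameter logistic curve with a very small nu is close to
# the Gompertz curve with the same remaining parameters.
x <- seq(-2, 6, by = 0.5)
max_gap <- max(abs(
  logistic5(x, alpha, delta, eta, phi, nu = 1e-6) -
    gompertz(x, alpha, delta, eta, phi)
))
```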

Generalized (5-parameter) log-logistic function

The 5-parameter log-logistic function is selected by setting mean_function = "loglogistic5" or mean_function = "ll5". The function is defined here as

alpha + delta * (x^eta / (x^eta + nu * phi^eta))^(1 / nu)

where x >= 0, eta > 0, phi > 0, and nu > 0. When delta is positive (negative) the curve is monotonically increasing (decreasing). The function is defined only for non-negative values of the predictor variable x.

Parameter alpha is the value of the function at x = 0. Parameter delta is the (signed) height of the curve. Parameter eta represents the steepness (growth rate) of the curve. Parameter phi is related to the mid-value of the function. Parameter nu affects near which asymptote maximum growth occurs.

The value of the function when x -> Inf is alpha + delta. In dose-response studies delta can be interpreted as the maximum theoretical achievable effect.
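
A base R transcription of the formula shows the behaviour at x = 0 and for large x. `loglogistic5` is a hypothetical helper, not drda's internal code, and the parameter values are arbitrary.

```r
# Hypothetical helper that transcribes the 5-parameter log-logistic formula.
loglogistic5 <- function(x, alpha, delta, eta, phi, nu) {
  alpha + delta * (x^eta / (x^eta + nu * phi^eta))^(1 / nu)
}

alpha <- 1
delta <- -0.8
eta <- 2
phi <- 10
nu <- 2

# At x = 0 the curve equals alpha.
at_zero <- loglogistic5(0, alpha, delta, eta, phi, nu)

# For very large x the curve approaches alpha + delta.
far_right <- loglogistic5(1e6, alpha, delta, eta, phi, nu)

# With delta < 0 the curve is monotonically decreasing in the dose.
dose <- c(0, 1, 5, 10, 50, 100)
y <- loglogistic5(dose, alpha, delta, eta, phi, nu)
is_decreasing <- all(diff(y) < 0)
```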

4-parameter log-logistic function

The 4-parameter log-logistic function is selected by setting mean_function = "loglogistic4" or mean_function = "ll4". The function is obtained by setting nu = 1 in the generalized log-logistic function, that is

alpha + delta * x^eta / (x^eta + phi^eta)

where x >= 0, eta > 0, and phi > 0. When delta is positive (negative) the curve is monotonically increasing (decreasing). The function is defined only for non-negative values of the predictor variable x.

Parameter alpha is the value of the function at x = 0. Parameter delta is the (signed) height of the curve. Parameter eta represents the steepness (growth rate) of the curve. Parameter phi represents the x value at which the curve is equal to its mid-point, i.e. f(phi; alpha, delta, eta, phi) = alpha + delta / 2.

The value of the function when x -> Inf is alpha + delta. In dose-response studies delta can be interpreted as the maximum theoretical achievable effect.
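
The log-logistic curve on the dose scale is the same curve as the logistic one on the log-dose scale, with phi replaced by log(phi). The following base R sketch checks this identity using hypothetical helpers `loglogistic4` and `logistic4` that transcribe the formulas of this page; the parameter values are arbitrary.

```r
# Hypothetical helpers that transcribe the two 4-parameter formulas.
loglogistic4 <- function(x, alpha, delta, eta, phi) {
  alpha + delta * x^eta / (x^eta + phi^eta)
}
logistic4 <- function(x, alpha, delta, eta, phi) {
  alpha + delta / (1 + exp(-eta * (x - phi)))
}

alpha <- 1
delta <- -0.9
eta <- 1.5
phi <- 3
dose <- c(0.01, 0.1, 1, 10, 100)

# Log-logistic in the dose versus logistic in the log-dose.
on_dose <- loglogistic4(dose, alpha, delta, eta, phi)
on_log_dose <- logistic4(log(dose), alpha, delta, eta, log(phi))
same_curve <- isTRUE(all.equal(on_dose, on_log_dose))

# At x = phi the curve equals alpha + delta / 2.
mid <- loglogistic4(phi, alpha, delta, eta, phi)
```

This is why the logistic models are a natural choice when the predictor is already a log-dose, and the log-logistic models when it is the dose itself.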

2-parameter log-logistic function

The 2-parameter log-logistic function is selected by setting mean_function = "loglogistic2" or mean_function = "ll2". For a monotonically increasing curve set nu = 1, alpha = 0, and delta = 1:

x^eta / (x^eta + phi^eta)

For a monotonically decreasing curve set nu = 1, alpha = 1, and delta = -1:

1 - x^eta / (x^eta + phi^eta)

where x >= 0, eta > 0, and phi > 0. The lower bound of the curve is zero while the upper bound of the curve is one.

Parameter eta represents the steepness (growth rate) of the curve. Parameter phi represents the x value at which the curve is equal to its mid-point, i.e. f(phi; eta, phi) = 1 / 2.
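
The bounds and the mid-point can be verified at three reference doses: zero, phi, and a very large value. The helper names below are hypothetical and the parameter values arbitrary.

```r
# Hypothetical helpers for the two forms of the 2-parameter log-logistic curve.
loglogistic2_up <- function(x, eta, phi) x^eta / (x^eta + phi^eta)
loglogistic2_down <- function(x, eta, phi) 1 - loglogistic2_up(x, eta, phi)

eta <- 2
phi <- 4
dose <- c(0, phi, 1e8)

# Increasing form: starts at 0, equals 1 / 2 at phi, approaches 1.
up <- loglogistic2_up(dose, eta, phi)

# Decreasing form: starts at 1, equals 1 / 2 at phi, approaches 0.
down <- loglogistic2_down(dose, eta, phi)
```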

log-Gompertz function

The log-Gompertz function is the limit for nu -> 0 of the 5-parameter log-logistic function. It can be selected by choosing mean_function = "loggompertz" or mean_function = "lgz". The function is defined in this package as

alpha + delta * exp(-(phi / x)^eta)

where x > 0, eta > 0, and phi > 0. Note that the limit for x -> 0 is alpha. When delta is positive (negative) the curve is monotonically increasing (decreasing). The function is defined only for positive values of the predictor variable x.

The value of the function when x -> Inf is alpha + delta. In dose-response studies delta can be interpreted as the maximum theoretical achievable effect.
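
By analogy with the Gompertz function, the mid-point of the log-Gompertz curve can be located by solving exp(-(phi / x)^eta) = 1 / 2, which gives x = phi / log(2)^(1 / eta). This expression is derived here for illustration and is not stated in the text above. The base R sketch below checks it, together with the limit for x -> 0, using a hypothetical helper `log_gompertz` and arbitrary parameter values.

```r
# Hypothetical helper that transcribes the log-Gompertz formula.
log_gompertz <- function(x, alpha, delta, eta, phi) {
  alpha + delta * exp(-(phi / x)^eta)
}

alpha <- 0.2
delta <- 0.6
eta <- 2
phi <- 5

# For x very close to zero the curve is numerically equal to alpha.
near_zero <- log_gompertz(1e-8, alpha, delta, eta, phi)

# Mid-point derived by solving exp(-(phi / x)^eta) = 1 / 2 for x.
x_mid <- phi / log(2)^(1 / eta)
y_mid <- log_gompertz(x_mid, alpha, delta, eta, phi)
```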

Constrained optimization

It is possible to search for the maximum likelihood estimates within pre-specified intervals by setting the lower_bound and upper_bound arguments.

Note: Hypothesis testing is not available for constrained estimates because asymptotic approximations might not be valid.
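
The idea of a box-constrained least squares fit can be sketched in base R. Note that this example deliberately swaps in `optim` with the L-BFGS-B method, which is not the trust-region Newton method used by drda; the helper `logistic4`, the simulated data, the bounds, and the starting values are all assumptions made for illustration only.

```r
set.seed(1)

# Hypothetical helper: 4-parameter logistic curve with
# theta = (alpha, delta, eta, phi).
logistic4 <- function(x, theta) {
  theta[1] + theta[2] / (1 + exp(-theta[3] * (x - theta[4])))
}

# Simulated data from a decreasing curve with a small amount of noise.
x <- seq(-5, 5, length.out = 50)
y <- logistic4(x, c(1, -0.8, 1.5, 0)) + rnorm(length(x), sd = 0.02)

# Residual sum of squares as a function of the parameters.
rss <- function(theta) sum((y - logistic4(x, theta))^2)

# Box constraints; -Inf and Inf leave a parameter unbounded.
lower <- c(0.5, -1, 0, -Inf)
upper <- c(1.5, 0, 5, Inf)
start <- c(1, -0.5, 1, 0.5)

# Swapped-in optimizer: base R's L-BFGS-B, not drda's algorithm.
fit <- optim(start, rss, method = "L-BFGS-B", lower = lower, upper = upper)

# The solution always lies inside the box.
inside <- all(fit$par >= lower & fit$par <= upper)
```

When an estimate ends up exactly on one of the bounds, the usual asymptotic arguments are in doubt, which is the reason for the note above.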

Value

An object of class drda and model_fit, where model is the chosen mean function. It is a list containing the following components:

converged: boolean value indicating whether the optimization algorithm converged.
iterations: total number of iterations performed by the optimization algorithm.
constrained: boolean value set to TRUE if optimization was constrained.
estimated: boolean vector indicating which parameters were estimated from the data.
coefficients: maximum likelihood estimates of the model parameters.
rss: minimum value (found) of the residual sum of squares.
df.residuals: residual degrees of freedom.
fitted.values: fitted mean values.
residuals: residuals, that is response minus fitted values.
weights: (only for weighted fits) the specified weights.
mean_function: model that was used for fitting.
n: effective sample size.
sigma: corrected maximum likelihood estimate of the standard deviation.
loglik: maximum value (found) of the log-likelihood function.
fisher.info: observed Fisher information matrix evaluated at the maximum likelihood estimator.
vcov: approximate variance-covariance matrix of the model parameters.
call: the matched call.
terms: the terms object used.
model: the model frame used.
na.action: (where relevant) information returned by model.frame on the special handling of NAs.

Examples

# by default `drda` uses a 4-parameter logistic function for model fitting
fit_l4 <- drda(response ~ log_dose, data = voropm2)

# get a general overview of the results
summary(fit_l4)

# compare the model against a flat horizontal line and the full model
anova(fit_l4)

# 5-parameter logistic curve appears to be a better model
fit_l5 <- drda(response ~ log_dose, data = voropm2, mean_function = "l5")
plot(fit_l4, fit_l5)

# fit a 2-parameter logistic function
fit_l2 <- drda(response ~ log_dose, data = voropm2, mean_function = "l2")

# compare our models
anova(fit_l2, fit_l4)

# use log-logistic functions when utilizing doses (instead of log-doses)
# here we show the use of other arguments as well
fit_ll5 <- drda(
  response ~ dose, weights = weight, data = voropm2,
  mean_function = "loglogistic5", lower_bound = c(0.5, -1.5, 0, -Inf, 0.25),
  upper_bound = c(1.5, 0.5, 5, Inf, 3), start = c(1, -1, 3, 100, 1),
  max_iter = 10000
)

# note that the maximum likelihood estimate is outside the region of
# optimization: not only is the variance-covariance matrix now singular, but
# the asymptotic assumptions no longer hold.

[Package drda version 2.0.3]