fit_growth {biogrowth} | R Documentation |
Fitting microbial growth
Description
This function provides a top-level interface for fitting growth models to data describing the variation of the population size through time, under either constant or dynamic environmental conditions. See below for details on the calculations.
Usage
fit_growth(
fit_data,
model_keys,
start,
known,
environment = "constant",
algorithm = "regression",
approach = "single",
env_conditions = NULL,
niter = NULL,
...,
check = TRUE,
logbase_mu = logbase_logN,
logbase_logN = 10,
formula = logN ~ time
)
Arguments
fit_data |
observed microbial growth. The format varies depending on the type of model fit. See the relevant sections (and examples) below for details. |
model_keys |
a named list assigning equations for the primary and secondary models. See the relevant sections (and examples) below for details. |
start |
a named numeric vector assigning initial guesses to the model parameters to estimate from the data. See relevant section (and examples) below for details. |
known |
named numeric vector of fixed model parameters, using the same conventions as for "start". |
environment |
type of environment. Either "constant" (default) or "dynamic" (see below for details on the calculations for each condition) |
algorithm |
either "regression" (default; Levenberg-Marquardt algorithm) or "MCMC" (Adaptive Monte Carlo algorithm). |
approach |
approach for model fitting. Either "single" (the model is fitted to a unique experiment) or "global" (the model is fitted to several dynamic experiments). |
env_conditions |
Tibble describing the variation of the environmental conditions for dynamic experiments. See the relevant sections (and examples) below for details. Ignored for environment="constant". |
niter |
number of iterations of the MCMC algorithm. Ignored when algorithm!="MCMC". |
... |
Additional arguments passed to the fitting functions (modFit() or modMCMC()). |
check |
Whether to check the validity of the models. TRUE by default. |
logbase_mu |
Base of the logarithm in which the growth rate is expressed. By default, the same as logbase_logN. See vignette about units for details. |
logbase_logN |
Base of the logarithm for the population size. By default, 10 (i.e. log10). See vignette about units for details. |
formula |
An object of class "formula" defining the names of the x and y variables in the data. |
Value
If approach="single", an instance of GrowthFit. If approach="global", an instance of GlobalGrowthFit.
Please check the help pages of each class for additional information.
Fitting under constant conditions
When environment="constant", the function fits a primary growth model to the population size observed during an experiment. In this case, the data has to be a tibble (or data.frame) with two columns:
time: the elapsed time
logN: the logarithm of the observed population size
The names of the columns can be changed with the formula argument.
The model equation is defined through the model_keys argument. It must include an entry named "primary" assigned to a model. Valid model keys can be retrieved by calling primary_model_data().
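The input described above can be sketched as follows. This is only an illustration: the column names hours and log_count are hypothetical, the all.vars() check is a base R sanity check written for this sketch (not something the package is known to do), and the start and known values are copied from Example 1 below. The fit is attempted only if biogrowth is installed.

```r
# Hypothetical column names, chosen only for illustration
my_counts <- data.frame(hours = c(0, 25, 50, 75, 100),
                        log_count = c(2, 2.5, 7, 8, 8))

# The formula maps the response (left side) and the time variable (right side)
my_formula <- log_count ~ hours

# Sanity check (not part of biogrowth): every variable in the formula
# must be a column of the data
stopifnot(all(all.vars(my_formula) %in% names(my_counts)))

# The primary model is defined as a named list with a "primary" entry
my_models <- list(primary = "Baranyi")

# Only attempt the fit if biogrowth is available
if (requireNamespace("biogrowth", quietly = TRUE)) {
  renamed_fit <- biogrowth::fit_growth(
    my_counts,
    my_models,
    start = c(logNmax = 8, lambda = 25, logN0 = 2),  # values from Example 1
    known = c(mu = .2),                              # value from Example 1
    environment = "constant",
    formula = my_formula
  )
  print(coef(renamed_fit))
}
```

With the default column names (time and logN), the formula argument can be omitted.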
The model is fitted by non-linear regression (using modFit()). This algorithm needs initial guesses for every model parameter to estimate. These are defined as a named numeric vector. The names must be valid parameter keys for the chosen model, which can be retrieved using primary_model_data() (see example below). In addition, any model parameter can be fixed using the "known" argument. This is a named numeric vector, with the same conventions as "start".
Fitting under dynamic conditions to a single experiment
When environment="dynamic" and approach="single", a dynamic growth model combining the Baranyi primary growth model with the gamma approach for the effect of the environmental conditions on the growth rate is fitted to an experiment performed under dynamic conditions. In this case, the data is similar to fitting under constant conditions: a tibble (or data.frame) with two columns:
time: the elapsed time
logN: the logarithm of the observed population size
Note that these default names can be changed using the formula argument.
The values of the experimental conditions during the experiment are defined using the "env_conditions" argument. It is a tibble (or data.frame) with one column (named "time") defining the elapsed time. Note that this default name can be modified using the formula argument of the function. The tibble needs one additional column per environmental condition included in the model, providing the values of that condition.
The model equations are defined through the model_keys argument. It must be a named list where the names match the column names of "env_conditions" and the values are model keys. These can be retrieved using secondary_model_data().
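As a rough sketch of how "env_conditions" and model_keys relate, the snippet below builds a hypothetical temperature and water activity profile and checks that every factor has a matching column. The values are invented for illustration, and the consistency check is plain base R written for this sketch, not a function of the package.

```r
# Hypothetical environmental profile (illustrative values only)
my_conditions <- data.frame(
  time = c(0, 5, 10),
  temperature = c(20, 30, 35),
  aw = c(0.99, 0.95, 0.90)
)

# One secondary model per environmental factor
# ("CPM" is the model key used in the examples of this page)
sec_models <- list(temperature = "CPM", aw = "CPM")

# Every column other than "time" is treated as an environmental factor
factor_cols <- setdiff(names(my_conditions), "time")

# Factors with a model but no column, and columns without a model
missing_factors <- setdiff(names(sec_models), factor_cols)
extra_cols <- setdiff(factor_cols, names(sec_models))

stopifnot(length(missing_factors) == 0, length(extra_cols) == 0)
```

Running a check like this before calling fit_growth makes a mismatch between the two arguments easier to spot.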
The model can be fitted using regression (modFit()) or an adaptive Monte Carlo algorithm (modMCMC()). Both algorithms require initial guesses for every model parameter to fit. These are defined through the named numeric vector "start". Each parameter must be named as factor+"_"+parameter, where factor is the name of the environmental factor defined in "model_keys". The parameter is a valid key that can be retrieved from secondary_model_data(). For instance, parameter Xmin for the factor temperature would be defined as "temperature_xmin".
Note that the argument ... allows passing additional arguments to the fitting functions.
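The naming convention can be illustrated with base R alone. The sketch below assumes the parameter keys xmin, xopt, xmax and n for the "CPM" model, as they appear in the examples of this page, and derives which secondary-model parameters still need an initial guess once some are fixed. The fixed values are those of Example 2. Parameters of the primary model (such as Nmax, N0, Q0 or mu_opt in the examples) carry no factor prefix and are left out here.

```r
# Environmental factors, as named in model_keys
factors <- c("temperature", "aw")

# Parameter keys of the "CPM" secondary model (taken from the examples)
cpm_pars <- c("xmin", "xopt", "xmax", "n")

# Every factor + "_" + parameter combination
all_names <- as.vector(outer(factors, cpm_pars, paste, sep = "_"))

# Secondary-model parameters fixed in Example 2
known_secondary <- c(temperature_n = 1, aw_xmax = 1, aw_xmin = .9, aw_n = 1)

# Whatever is not fixed needs an initial guess in "start"
start_names <- setdiff(all_names, names(known_secondary))
print(start_names)
```

The resulting names are the ones given initial guesses in my_start in Example 2.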
Fitting under dynamic conditions to multiple experiments (global fitting)
When environment="dynamic" and approach="global", fit_growth tries to find the vector of model parameters that best describes the observations of several growth experiments.
The input requirements are very similar to the case when approach="single". The models (equations, initial guesses, known parameters, algorithms...) are identical. The only difference is that "fit_data" must be a list, where each element describes the results of an experiment (using the same conventions as when approach="single"). In a similar fashion, "env_conditions" must be a list describing the values of the environmental factors during each experiment. Although it is not mandatory, it is recommended that the elements of both lists are named. Otherwise, the function assigns automatically-generated names, and matches them by order.
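A minimal sketch of the two named lists, using invented data for two hypothetical experiments (exp1 and exp2). The name check and reordering are base R conveniences written for this sketch, not functions of the package.

```r
# Observed counts for two hypothetical experiments
my_counts <- list(
  exp1 = data.frame(time = c(0, 10, 20), logN = c(2, 3.5, 6)),
  exp2 = data.frame(time = c(0, 10, 20), logN = c(2, 2.8, 4.5))
)

# Environmental conditions, deliberately listed in a different order
my_conditions <- list(
  exp2 = data.frame(time = c(0, 20), temperature = c(20, 25)),
  exp1 = data.frame(time = c(0, 20), temperature = c(30, 30))
)

# Both lists should describe the same set of experiments
stopifnot(setequal(names(my_counts), names(my_conditions)))

# Reorder the conditions so that both lists follow the same order
my_conditions <- my_conditions[names(my_counts)]
```

Naming both lists avoids relying on element order to pair each experiment with its conditions.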
Examples
## Example 1 - Fitting a primary model --------------------------------------
## A dummy dataset describing the variation of the population size
my_data <- data.frame(time = c(0, 25, 50, 75, 100),
                      logN = c(2, 2.5, 7, 8, 8))
## A list of model keys can be gathered from
primary_model_data()
## The primary model is defined as a list
models <- list(primary = "Baranyi")
## The keys of the model parameters can also be gathered from primary_model_data
primary_model_data("Baranyi")$pars
## Any model parameter can be fixed
known <- c(mu = .2)
## The remaining parameters need initial guesses
start <- c(logNmax = 8, lambda = 25, logN0 = 2)
primary_fit <- fit_growth(my_data, models, start, known,
                          environment = "constant")
## The fitted object (an instance of GrowthFit) includes several useful methods
print(primary_fit)
plot(primary_fit)
coef(primary_fit)
summary(primary_fit)
## time_to_size can be used to calculate the time for some concentration
time_to_size(primary_fit, 4)
## Example 2 - Fitting under dynamic conditions------------------------------
## We will use the example data included in the package
data("example_dynamic_growth")
## And the example environmental conditions (temperature & aw)
data("example_env_conditions")
## Valid keys for secondary models can be retrieved from
secondary_model_data()
## We need to assign a model equation (secondary model) to each environmental factor
sec_models <- list(temperature = "CPM", aw = "CPM")
## The keys of the model parameters can be gathered from the same function
secondary_model_data("CPM")$pars
## Any model parameter (of the primary or secondary models) can be fixed
known_pars <- list(Nmax = 1e4,  # Primary model
                   N0 = 1e0, Q0 = 1e-3,  # Initial values of the primary model
                   mu_opt = 4,  # mu_opt of the gamma model
                   temperature_n = 1,  # Secondary model for temperature
                   aw_xmax = 1, aw_xmin = .9, aw_n = 1  # Secondary model for water activity
                   )
## The remaining parameters need initial guesses (as required by regression)
my_start <- list(temperature_xmin = 25, temperature_xopt = 35,
                 temperature_xmax = 40, aw_xopt = .95)
## We can now fit the model
dynamic_fit <- fit_growth(example_dynamic_growth,
                          sec_models,
                          my_start, known_pars,
                          environment = "dynamic",
                          env_conditions = example_env_conditions
                          )
## The fitted object (an instance of GrowthFit) has several S3 methods
plot(dynamic_fit, add_factor = "temperature")
summary(dynamic_fit)
## We can use time_to_size to calculate the time required to reach a given size
time_to_size(dynamic_fit, 3)
## Example 3 - Fitting under dynamic conditions using MCMC ------------------
## We can reuse most of the arguments from the previous example
## We just need to define the algorithm and the number of iterations
set.seed(12421)
MCMC_fit <- fit_growth(example_dynamic_growth,
                       sec_models,
                       my_start, known_pars,
                       environment = "dynamic",
                       env_conditions = example_env_conditions,
                       algorithm = "MCMC",
                       niter = 1000
                       )
## The fitted object (an instance of GrowthFit) has several S3 methods
plot(MCMC_fit, add_factor = "aw")
summary(MCMC_fit)
## We can use time_to_size to calculate the time required to reach a given size
time_to_size(MCMC_fit, 3)
## It can also make growth predictions including uncertainty
uncertain_growth <- predictMCMC(MCMC_fit,
                                seq(0, 10, length.out = 1000),
                                example_env_conditions,
                                niter = 1000)
## The instance of MCMCgrowth includes several nice S3 methods
plot(uncertain_growth)
print(uncertain_growth)
## time_to_size can calculate the time to reach some count
time_to_size(uncertain_growth, 2)
time_to_size(uncertain_growth, 2, type = "distribution")
## Example 4 - Fitting a unique model to several dynamic experiments --------
## We will use the data included in the package
data("multiple_counts")
data("multiple_conditions")
## We need to assign a model equation for each environmental factor
sec_models <- list(temperature = "CPM", pH = "CPM")
## Any model parameter (of the primary or secondary models) can be fixed
known_pars <- list(Nmax = 1e8, N0 = 1e0, Q0 = 1e-3,
                   temperature_n = 2, temperature_xmin = 20,
                   temperature_xmax = 35,
                   pH_n = 2, pH_xmin = 5.5, pH_xmax = 7.5, pH_xopt = 6.5)
## The remaining parameters need initial guesses
my_start <- list(mu_opt = .8, temperature_xopt = 30)
## We can now fit the model
global_fit <- fit_growth(multiple_counts,
                         sec_models,
                         my_start,
                         known_pars,
                         environment = "dynamic",
                         algorithm = "regression",
                         approach = "global",
                         env_conditions = multiple_conditions
                         )
## The fitted object (an instance of GlobalGrowthFit) has nice S3 methods
plot(global_fit)
summary(global_fit)
print(global_fit)
## We can use time_to_size to calculate the time to reach a given size
time_to_size(global_fit, 4.5)
## Example 5 - MCMC fitting a unique model to several dynamic experiments ---
## Again, we can re-use all the arguments from the previous example
## We just need to define the right algorithm and the number of iterations
## On top of that, we will also pass upper and lower bounds to modMCMC
set.seed(12421)
global_MCMC <- fit_growth(multiple_counts,
                          sec_models,
                          my_start,
                          known_pars,
                          environment = "dynamic",
                          algorithm = "MCMC",
                          approach = "global",
                          env_conditions = multiple_conditions,
                          niter = 1000,
                          lower = c(.2, 29),  # lower limits of the model parameters
                          upper = c(.8, 34)  # upper limits of the model parameters
                          )
## The fitted object (an instance of GlobalGrowthFit) has nice S3 methods
plot(global_MCMC)
summary(global_MCMC)
print(global_MCMC)
## We can use time_to_size to calculate the time to reach a given size
time_to_size(global_MCMC, 3)
## It can also be used to make model predictions with parameter uncertainty
uncertain_prediction <- predictMCMC(global_MCMC,
                                    seq(0, 50, length.out = 1000),
                                    multiple_conditions[[1]],
                                    niter = 100
                                    )
## The instance of MCMCgrowth includes several nice S3 methods
plot(uncertain_prediction)
print(uncertain_prediction)
## time_to_size can calculate the time to reach some count
time_to_size(uncertain_prediction, 2)
time_to_size(uncertain_prediction, 2, type = "distribution")