R: Fit a kinetic model using a linear model.

FitKineticsGeneLogSpaceLinear {grandR}

R Documentation

Fit a kinetic model using a linear model.

Description

Fit the standard mass action kinetics model of gene expression using a linear model after log-transforming the observed values (i.e. assuming gaussian homoscedastic errors of the logarithmized values) for the given gene. The fit takes only old RNA into account and requires proper normalization, but can be performed without assuming steady state for the degradation rate. The parameters are fit per Condition.

Usage

FitKineticsGeneLogSpaceLinear(
  data,
  gene,
  slot = DefaultSlot(data),
  time = Design$dur.4sU,
  CI.size = 0.95
)

Arguments

`data`	A grandR object
`gene`	The gene for which to fit the model
`slot`	The data slot to take expression values from
`time`	The column in the column annotation table representing the labeling duration
`CI.size`	A number between 0 and 1 representing the size of the confidence interval

Details

The start of labeling for all samples should be the same experimental time point. The fit gets more precise with multiple samples from multiple labeling durations. Also a sample without 4sU (representing time 0) is useful.

The standard mass action kinetics model of gene expression arises from the following differential equation:

df/dt = s - d f(t)

This model assumes constant synthesis and degradation rates (but not necessarily that the system is in steady state at time 0). From the solution of this differential equation, it is straight forward to derive the expected abundance of old and new RNA at time t for given parameters s (synthesis rate), d (degradation rate) and f0=f(0) (the abundance at time 0). These equations are implemented in f.old.equi (old RNA assuming steady state gene expression, i.e. f0=s/d), f.old.nonequi (old RNA without assuming steady state gene expression) and f.new (new RNA; whether or not it is steady state does not matter).

This function primarily finds d such that the squared error between the observed values of old and new RNA and their corresponding functions is minimized in log space. For that to work, data has to be properly normalized, but this is independent on any steady state assumptions. The synthesis rate is computed (under the assumption of steady state) as s=f0 \cdot d

Value

A named list containing the model fit:

data: a data frame containing the observed value used for fitting
Synthesis: the synthesis rate (in U/h, where U is the unit of the slot)
Degradation: the degradation rate (in 1/h)
Half-life: the RNA half-life (in h, always equal to log(2)/degradation-rate
conf.lower: a vector containing the lower confidence bounds for Synthesis, Degradation and Half-life
conf.upper: a vector containing the lower confidence bounds for Synthesis, Degradation and Half-life
f0: The abundance at time 0 (in U)
logLik: the log likelihood of the model
rmse: the total root mean square error
adj.r.squared: adjusted R^2 of the linear model fit
total: the total sum of all new and old RNA values used for fitting
type: always "lm"

If Condition(data) is not NULL, the return value is a named list (named according to the levels of Condition(data)), each element containing such a structure.

Examples

sars <- ReadGRAND(system.file("extdata", "sars.tsv.gz", package = "grandR"),
                  design=c("Condition",Design$dur.4sU,Design$Replicate))
sars <- Normalize(sars)
FitKineticsGeneLogSpaceLinear(sars,"SRSF6")   # fit per condition

[Package grandR version 0.2.5 Index]