R: Estimate regulation from snapshot experiments

EstimateRegulation {grandR}

R Documentation

Estimate regulation from snapshot experiments

Description

Compute the posterior log2 fold change distributions of RNA synthesis and degradation.

Usage

EstimateRegulation(
  data,
  name.prefix = "Regulation",
  contrasts,
  reference.columns = NULL,
  slot = DefaultSlot(data),
  time.labeling = Design$dur.4sU,
  time.experiment = NULL,
  ROPE.max.log2FC = 0.25,
  sample.f0.in.ss = TRUE,
  N = 10000,
  N.max = N * 10,
  CI.size = 0.95,
  seed = 1337,
  dispersion = NULL,
  sample.level = 2,
  correct.labeling = FALSE,
  verbose = FALSE
)

Arguments

`data`	the grandR object
`name.prefix`	the prefix for the new analysis name; a dot and the column names of the contrast matrix are appended; can be NULL (then only the contrast matrix names are used)
`contrasts`	contrast matrix that defines all pairwise comparisons, generated using GetContrasts
`reference.columns`	a reference matrix usually generated by FindReferences to define reference samples for each sample; can be NULL if all conditions are at steady state (see details)
`slot`	the data slot to take f0 and totals from
`time.labeling`	the column in the Coldata table denoting the labeling duration, or the numeric labeling duration itself
`time.experiment`	the column in the Coldata table denoting the experimental time point (can be NULL, see details)
`ROPE.max.log2FC`	the region of practical equivalence is [-ROPE.max.log2FC,ROPE.max.log2FC] in log2 fold change space
`sample.f0.in.ss`	whether or not to sample f0 under steady state conditions
`N`	the sample size
`N.max`	the maximal number of samples (necessary if old RNA > f0); if more are necessary, a warning is generated
`CI.size`	A number between 0 and 1 representing the size of the credible interval
`seed`	Seed for the random number generator
`dispersion`	overdispersion parameter for each gene; if NULL this is estimated from data
`sample.level`	Define how the NTR is sampled from the hierarchical Bayesian model (must be 0,1, or 2; see details)
`correct.labeling`	Labeling times have to be unique; by default, execution is aborted if this is not the case; if this is set to TRUE, the median labeling time is assumed instead
`verbose`	Print status messages

Details

The kinetic parameters s and d are computed using TransformSnapshot. For that, the sample must either be in steady state (this is the case if so defined in the reference.columns matrix), or the levels at an earlier time point must be known from separate samples, so-called temporal reference samples. Thus, if s and d are estimated for a set of samples x_1,...,x_k (that must be from the same time point t), we need to find (i) the corresponding temporal reference samples from time t0, and (ii) the time difference between t and t0.
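
As a rough illustration of the steady-state case, the sketch below assumes standard first-order kinetics, where the new-to-total ratio after labeling time t is NTR = 1 - exp(-d * t) and the steady-state level equals s / d. The helper `steady_state_kinetics` and its inputs are hypothetical and written in base R; this is not the code of TransformSnapshot.

```r
# Hypothetical helper (not part of grandR): kinetic parameters at steady
# state, assuming first-order degradation.
steady_state_kinetics <- function(ntr, total, t) {
  stopifnot(all(ntr > 0 & ntr < 1), all(total > 0), t > 0)
  d  <- -log(1 - ntr) / t   # degradation rate from NTR = 1 - exp(-d * t)
  s  <- total * d           # synthesis rate from steady state: total = s / d
  hl <- log(2) / d          # half-life
  data.frame(s = s, d = d, HL = hl)
}

# Made-up numbers: NTR of 0.5 after 2 h of labeling, total level of 100
kin <- steady_state_kinetics(ntr = 0.5, total = 100, t = 2)
```

With these made-up numbers, half of the RNA is new after 2 h, so the half-life equals the labeling time.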

The temporal reference samples are identified by the reference.columns matrix. This is a square matrix of logicals whose rows and columns correspond to all samples; TRUE indicates that the row sample is a temporal reference of the column sample. The time point of each sample is defined by time.experiment. If time.experiment is NULL, then the labeling time of the A or B samples is used (e.g. useful if labeling was started concomitantly with the perturbation, and the steady state samples are unperturbed samples).
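
To make the layout of such a matrix concrete, here is a minimal base R sketch. The sample names and time points are invented for illustration, and the matrix is built by hand rather than with FindReferences, so the result only mirrors the structure described above.

```r
# Hypothetical sample names: two time points with two replicates each
samples <- c("ctrl.0h.A", "ctrl.0h.B", "treat.2h.A", "treat.2h.B")
time <- c(0, 0, 2, 2)  # experimental time point of each sample (assumed)

# Square logical matrix: rows = candidate references, columns = samples
ref <- matrix(FALSE, nrow = length(samples), ncol = length(samples),
              dimnames = list(samples, samples))

# Mark every 0h sample (row) as a temporal reference of every sample (column)
ref[time == 0, ] <- TRUE

# Temporal references of one sample: the rows that are TRUE in its column
refs_of_treat <- rownames(ref)[ref[, "treat.2h.A"]]

# Time difference between the sample (t) and its references (t0)
dt <- time[samples == "treat.2h.A"] - unique(time[ref[, "treat.2h.A"]])
```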

By default (sample.level = 2), the full hierarchical Bayesian model is estimated. If sample.level = 0, the NTRs are sampled from a beta distribution that approximates the mixture of betas from the replicate samples. If sample.level = 1, only the first level from the hierarchical model is sampled (corresponding to the uncertainty of estimating the biological variability). If sample.level = 2, the first and second levels are estimated (corresponding to the full hierarchical model).
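
One way to picture the sample.level = 0 case is moment matching: a single beta distribution is chosen so that its mean and variance equal those of an equal-weight mixture of per-replicate betas. The sketch below does this in base R. The helper `approx_beta_mixture`, the equal weights, and the use of moment matching are assumptions made for illustration; the approximation used internally may differ.

```r
# Hypothetical helper: approximate an equal-weight mixture of
# Beta(alpha[i], beta[i]) distributions by a single beta distribution
# with the same mean and variance (moment matching).
approx_beta_mixture <- function(alpha, beta) {
  stopifnot(length(alpha) == length(beta), all(alpha > 0), all(beta > 0))
  n  <- alpha + beta
  mu <- alpha / n                        # component means
  v  <- alpha * beta / (n^2 * (n + 1))   # component variances
  m  <- mean(mu)                         # mixture mean
  s2 <- mean(v + mu^2) - m^2             # mixture variance
  k  <- m * (1 - m) / s2 - 1             # implied alpha + beta
  c(alpha = m * k, beta = (1 - m) * k)
}

# Made-up posterior parameters for three replicates
set.seed(1337)
fit <- approx_beta_mixture(alpha = c(20, 30, 25), beta = c(80, 70, 75))
ntr_samples <- rbeta(1000, fit[["alpha"]], fit[["beta"]])
```

The fitted beta is wider than each component because the spread between replicate means adds to the mixture variance.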

If N is set to 0, no sampling from the posterior is performed; instead, the transformed MAP estimates are returned.

Value

A new grandR object including a new analysis table. The columns of the new analysis table are:

`"s.A"`	the posterior mean synthesis rate for sample A from the comparison
`"s.B"`	the posterior mean synthesis rate for sample B from the comparison
`"HL.A"`	the posterior mean RNA half-life for sample A from the comparison
`"HL.B"`	the posterior mean RNA half-life for sample B from the comparison
`"s.log2FC"`	the posterior mean synthesis rate log2 fold change
`"s.cred.lower"`	the lower CI boundary of the synthesis rate log2 fold change
`"s.cred.upper"`	the upper CI boundary of the synthesis rate log2 fold change
`"s.ROPE"`	the signed ROPE probability (negative means downregulation) for the synthesis rate fold change
`"HL.log2FC"`	the posterior mean half-life log2 fold change
`"HL.cred.lower"`	the lower CI boundary of the half-life log2 fold change
`"HL.cred.upper"`	the upper CI boundary of the half-life log2 fold change
`"HL.ROPE"`	the signed ROPE probability (negative means downregulation) for the half-life fold change
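
For orientation, the sketch below shows how summaries of this kind could be derived from posterior samples of a log2 fold change in base R. The helper `summarize_lfc` is hypothetical, and the signed ROPE formula used here (posterior probability above the ROPE minus the probability below it) is an assumption chosen for illustration; it may differ from what EstimateRegulation computes.

```r
# Hypothetical helper: summarize posterior samples of a log2 fold change
summarize_lfc <- function(lfc, CI.size = 0.95, ROPE.max.log2FC = 0.25) {
  a  <- (1 - CI.size) / 2
  ci <- quantile(lfc, c(a, 1 - a), names = FALSE)  # equal-tailed interval
  p_up   <- mean(lfc >  ROPE.max.log2FC)   # probability of upregulation
  p_down <- mean(lfc < -ROPE.max.log2FC)   # probability of downregulation
  c(log2FC = mean(lfc), cred.lower = ci[1], cred.upper = ci[2],
    ROPE = p_up - p_down)                  # assumed signed ROPE definition
}

# Simulated posterior samples standing in for real output
set.seed(1337)
lfc <- rnorm(10000, mean = 1, sd = 0.3)
res <- summarize_lfc(lfc)
```

Under this assumed definition, values near 1 or -1 indicate strong evidence for up- or downregulation beyond the ROPE, and values near 0 indicate that the posterior mass lies mostly inside the ROPE or is split evenly between both sides.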

Examples

# Load the BANP example data shipped with grandR
banp <- ReadGRAND(system.file("extdata", "BANP.tsv.gz", package = "grandR"),
          design=c("Cell","Experimental.time","Genotype",
                       Design$dur.4sU,Design$has.4sU,Design$Replicate))
# Contrasts against the 0h level of Experimental.time.original
contrasts <- GetContrasts(banp,contrast=c("Experimental.time.original","0h"),name.format="$A")
# Use the samples at experimental time 0 as reference samples
reference.columns <- FindReferences(banp,reference= Experimental.time==0)
banp <- EstimateRegulation(banp,"Regulation",
                             contrasts=contrasts,
                             reference.columns=reference.columns,
                             verbose=TRUE,
                             time.experiment = "Experimental.time",
                             N=0,               # don't sample in the example
                             dispersion=0.1)    # don't estimate dispersion in the example
# Inspect the first rows of the new analysis table
head(GetAnalysisTable(banp))

[Package grandR version 0.2.5]