coverage_normalization {SIPmg}    R Documentation

Normalize feature coverages to estimate absolute abundance or relative coverage from MAG/contig coverage values, with or without multiplying by the total DNA concentration of the fraction

Description

Normalize feature coverages to estimate absolute abundance or relative coverage from MAG/contig coverage values, with or without multiplying by the total DNA concentration of the fraction.

Usage

coverage_normalization(
  f_tibble,
  contig_coverage,
  sequencing_yield,
  fractions_df,
  approach = "relative_coverage"
)

Arguments

f_tibble

Can be either (1) a tibble whose first column, "Feature", contains bin IDs and whose remaining columns each represent a sample and hold the bins' coverage values, or (2) a tibble as output by the "checkm coverage" program of the tool CheckM. See the CheckM documentation - https://github.com/Ecogenomics/CheckM for the usage of "checkm coverage".
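As a minimal sketch of format (1), the table below uses hypothetical bin IDs and sample names, and the coverage values are invented for illustration. It is built with base R so that it runs without extra packages; tibble::as_tibble() can convert it to a tibble before it is passed on.

```r
# Hypothetical bins and samples; coverage values are invented for illustration.
f_tibble_example <- data.frame(
  Feature = c("MAG_1", "MAG_2", "MAG_3"),
  `12C_rep_1_fraction_1` = c(10.5, 3.2, 0.8),
  `12C_rep_1_fraction_2` = c(12.1, 2.9, 1.4),
  check.names = FALSE,        # keep sample names that start with a digit
  stringsAsFactors = FALSE
)

# The first column must be "Feature"; every other column is one sample.
sample_cols <- setdiff(names(f_tibble_example), "Feature")
stopifnot(names(f_tibble_example)[1] == "Feature")
stopifnot(all(sapply(f_tibble_example[sample_cols], is.numeric)))
```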

contig_coverage

A tibble with contig IDs ("Feature" column), one coverage column per sample (using the same sample names as in f_tibble), the contig length in bp ("contig_length" column), and the MAG each contig is associated with ("MAG" column, using the same MAG IDs as the Feature column of f_tibble).
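The layout described above can be sketched as follows. All contig IDs, MAG IDs, sample names, and values are hypothetical, and the final check is only a suggested sanity test, not something the package is known to perform.

```r
# Hypothetical MAG IDs, as they would appear in the Feature column of f_tibble.
mag_ids <- c("MAG_1", "MAG_2")

# Hypothetical contig table: one row per contig, one coverage column per sample.
contig_coverage_example <- data.frame(
  Feature = c("contig_1", "contig_2", "contig_3"),
  `12C_rep_1_fraction_1` = c(9.8, 11.2, 3.2),
  `12C_rep_1_fraction_2` = c(12.0, 12.3, 2.9),
  contig_length = c(15000, 42000, 8000),   # length in bp
  MAG = c("MAG_1", "MAG_1", "MAG_2"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every contig should map to a MAG that is present in f_tibble.
unknown_mags <- setdiff(contig_coverage_example$MAG, mag_ids)
stopifnot(length(unknown_mags) == 0)
```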

sequencing_yield

A tibble containing the sample ID ("sample" column, using the same sample names as in f_tibble) and the sequencing yield of that sample, i.e. the amount of read data recovered in bp ("yield" column).
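A small sketch of this table, with hypothetical sample names and invented yields, plus a suggested check that every sample has a yield entry:

```r
# Hypothetical sample names, as they would appear in the columns of f_tibble.
sample_names <- c("12C_rep_1_fraction_1", "12C_rep_1_fraction_2")

sequencing_yield_example <- data.frame(
  sample = sample_names,
  yield = c(2.5e9, 3.1e9),   # invented yields in bp
  stringsAsFactors = FALSE
)

# Samples lacking a yield entry would show up here.
missing_yield <- setdiff(sample_names, sequencing_yield_example$sample)
stopifnot(length(missing_yield) == 0)
```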

fractions_df

Fractions data frame: a fractions file with the following columns

  • Replicate: Depends on how many replicates the study has

  • Fractions: Typically in the range of 2-24

  • Buoyant_density: As calculated from the refractometer for each fraction and replicate

  • Isotope: "12C", "13C", "14N", "15N" etc.

  • DNA_concentration

  • Sample: In the format "'isotope'_rep_#_fraction_#". For instance, "12C_rep_1_fraction_1"
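The columns listed above can be assembled as in the sketch below. Every value is invented for illustration, and the Sample ID is derived from the other columns so that it follows the pattern of the "12C_rep_1_fraction_1" example.

```r
# Invented fractions table with two fractions per isotope treatment.
fractions_example <- data.frame(
  Replicate = c(1, 1, 1, 1),
  Fractions = c(1, 2, 1, 2),
  Buoyant_density = c(1.690, 1.702, 1.695, 1.710),   # invented values
  Isotope = c("12C", "12C", "13C", "13C"),
  DNA_concentration = c(0.8, 2.4, 0.5, 3.1),         # invented values
  stringsAsFactors = FALSE
)

# Build the Sample ID as "<isotope>_rep_<replicate>_fraction_<fraction>".
fractions_example$Sample <- paste0(
  fractions_example$Isotope, "_rep_", fractions_example$Replicate,
  "_fraction_", fractions_example$Fractions
)
print(fractions_example$Sample)
```

Deriving Sample from the other columns keeps the IDs consistent with the sample column names used in the coverage tables.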

approach

The method for coverage normalization: "relative_coverage" (the default), "greenlon", or "starr". Choose "relative_coverage" to estimate only relative coverage without multiplying by the DNA concentration of the fraction, "greenlon" to follow the method in Greenlon et al. - https://journals.asm.org/doi/full/10.1128/msystems.00417-22, or "starr" to follow the method in Starr et al. - https://journals.asm.org/doi/10.1128/mSphere.00085-21
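To illustrate the difference between a relative coverage and a DNA-scaled estimate, the sketch below assumes that relative coverage is each MAG's share of the total coverage in a sample, and that the scaled estimate is that share multiplied by the fraction's DNA concentration. Both formulas and all values are illustrative assumptions; they are not taken from the SIPmg source or from the Greenlon et al. or Starr et al. methods, which may differ.

```r
# Invented coverages for three MAGs in two samples (rows = MAGs).
coverage <- matrix(
  c(10, 30, 60,
    5, 5, 10),
  nrow = 3,
  dimnames = list(c("MAG_1", "MAG_2", "MAG_3"), c("sample_A", "sample_B"))
)

# Invented DNA concentration of the fraction behind each sample.
dna_conc <- c(sample_A = 2.0, sample_B = 0.5)

# Assumed relative coverage: share of each MAG within its sample.
rel_cov <- sweep(coverage, 2, colSums(coverage), "/")

# Assumed DNA-scaled estimate: relative coverage times DNA concentration.
abs_est <- sweep(rel_cov, 2, dna_conc[colnames(rel_cov)], "*")
```

Under these assumptions each column of rel_cov sums to 1, while the columns of abs_est sum to the DNA concentration of the corresponding fraction.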

Value

A tibble containing the normalized coverage in the required format, with the MAG name as the first column and the normalized coverage values for each sample in the remaining columns.

Examples


# Load the example feature table
data(f_tibble)

# Normalize with the default approach ("relative_coverage")
rel.cov <- coverage_normalization(f_tibble)



[Package SIPmg version 1.4.1 Index]