R: Recluster sequences of an object of class 'physeq' or a list...

asv2otu {MiscMetabar}

R Documentation

Recluster sequences of an object of class `physeq` or a list of DNA sequences

Description

Usage

asv2otu(
  physeq = NULL,
  dna_seq = NULL,
  nproc = 1,
  method = "clusterize",
  id = 0.97,
  vsearchpath = "vsearch",
  tax_adjust = 0,
  vsearch_cluster_method = "--cluster_size",
  vsearch_args = "--strand both",
  keep_temporary_files = FALSE,
  swarmpath = "swarm",
  d = 1,
  swarm_args = "--fastidious",
  method_clusterize = "overlap",
  ...
)

Arguments

`physeq`	(required): a `phyloseq-class` object obtained using the `phyloseq` package.
`dna_seq`	A character vector of DNA sequences that may be used directly in place of the `physeq` argument. When `physeq` is set, the DNA sequences are taken from `physeq@refseq`.
`nproc`	(default: 1) Number of CPUs/processors to use for the clustering.
`method`	(default: clusterize) Set the clustering method. `clusterize` uses the `DECIPHER::Clusterize()` function; `vsearch` uses the vsearch software (https://github.com/torognes/vsearch) with the argument `--cluster_size` by default (see `vsearch_cluster_method`) and `--strand both` (see `vsearch_args`); `swarm` uses the swarm software.
`id`	(default: 0.97) level of identity to cluster
`vsearchpath`	(default: vsearch) path to vsearch
`tax_adjust`	(default: 0) See the man page of `merge_taxa_vec()` for more details. To conserve the taxonomic rank of the most abundant ASV, set `tax_adjust` to 0 (default). For the moment, only `tax_adjust = 0` is robust.
`vsearch_cluster_method`	(default: "--cluster_size") See other possible methods in the vsearch manual (e.g. `--cluster_size` or `--cluster_smallmem`). `--cluster_fast`: Clusterize the fasta sequences in filename, automatically sort by decreasing sequence length beforehand. `--cluster_size`: Clusterize the fasta sequences in filename, automatically sort by decreasing sequence abundance beforehand. `--cluster_smallmem`: Clusterize the fasta sequences in filename without automatically modifying their order beforehand. Sequences are expected to be sorted by decreasing sequence length, unless `--usersort` is used. In that case you may set `vsearch_args = "--strand both --usersort"`.
`vsearch_args`	(default: "--strand both") A character vector of length one defining other parameters to pass on to vsearch.
`keep_temporary_files`	(logical, default: FALSE) Whether to keep the temporary files: temp.fasta (refseq in fasta or dna_seq sequences), cluster.fasta (centroids if method = "vsearch") and temp.uc (clusters if method = "vsearch").
`swarmpath`	(default: swarm) path to swarm
`d`	(default: 1) Maximum number of differences allowed between two amplicons, meaning that two amplicons will be grouped if they have `d` (or fewer) differences.
`swarm_args`	(default: "--fastidious") A character vector of length one defining other parameters to pass on to swarm. See other possible methods in the swarm PDF manual.
`method_clusterize`	(default: "overlap") The clustering method used by `DECIPHER::Clusterize()`.
`...`	Other arguments passed on to `DECIPHER::Clusterize()`.
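
As an illustration of how `vsearchpath`, `vsearch_cluster_method`, `id` and `vsearch_args` might combine into a single vsearch call, here is a minimal sketch in base R. The helper `build_vsearch_cmd()` is hypothetical and is not part of MiscMetabar; the ordering of the options and the use of `--id`, `--centroids` and `--uc` are assumptions based on the vsearch command line and on the temporary file names listed under `keep_temporary_files`. The actual command issued by `asv2otu()` may differ.

```r
# Hypothetical helper (NOT part of MiscMetabar): assemble a vsearch
# clustering command line from asv2otu()-style arguments.
build_vsearch_cmd <- function(vsearchpath = "vsearch",
                              vsearch_cluster_method = "--cluster_size",
                              id = 0.97,
                              vsearch_args = "--strand both",
                              fasta_in = "temp.fasta",
                              centroids_out = "cluster.fasta",
                              uc_out = "temp.uc") {
  # vsearch_args is documented as a character vector of length one
  stopifnot(is.character(vsearch_args), length(vsearch_args) == 1)
  paste(
    vsearchpath,
    vsearch_cluster_method, fasta_in, # clustering command and its input
    "--id", id,                       # identity threshold
    vsearch_args,                     # extra options, passed verbatim
    "--centroids", centroids_out,     # assumed output: cluster centroids
    "--uc", uc_out                    # assumed output: cluster membership
  )
}

# Default settings
cmd_default <- build_vsearch_cmd()

# Pre-sorted input clustered in its given order (see vsearch_cluster_method)
cmd_usersort <- build_vsearch_cmd(
  vsearch_cluster_method = "--cluster_smallmem",
  vsearch_args = "--strand both --usersort"
)

print(cmd_default)
print(cmd_usersort)
```

The second call mirrors the `--cluster_smallmem` case described above, where `--usersort` is added through `vsearch_args` rather than through a dedicated argument.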

Details

This function uses the `merge_taxa_vec()` function to merge taxa into clusters. By default, `tax_adjust = 0`. See the man page of `merge_taxa_vec()`.
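
To make the idea of merging taxa into clusters concrete, the following toy sketch uses base R only. It is not the implementation of `merge_taxa_vec()`; the abundance table, taxonomy and cluster vector are invented for illustration. It assumes that merging sums the counts of all ASVs in a cluster and, in the spirit of `tax_adjust = 0`, keeps the name and taxonomy of the most abundant ASV.

```r
# Toy abundance table (invented data): 4 ASVs (rows) x 2 samples (columns)
otu <- matrix(
  c(10, 0,
     5, 2,
     1, 8,
     0, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ASV", 1:4), c("S1", "S2"))
)

# Toy taxonomy and a hypothetical cluster assignment per ASV
tax <- c(ASV1 = "Genus_A", ASV2 = "Genus_A", ASV3 = "Genus_B", ASV4 = NA)
clusters <- c(ASV1 = 1, ASV2 = 1, ASV3 = 2, ASV4 = 2)

# Sum the counts of all ASVs belonging to the same cluster
merged <- rowsum(otu, group = clusters)

# Pick the most abundant ASV of each cluster as its representative
total <- rowSums(otu)
representative <- vapply(
  split(names(total), clusters),
  function(ids) ids[which.max(total[ids])],
  character(1)
)

# Name each cluster after its representative and keep that ASV's taxonomy
rownames(merged) <- unname(representative[rownames(merged)])
merged_tax <- tax[representative]

print(merged)
print(merged_tax)
```

Here ASV1 and ASV2 collapse into one cluster named after ASV1, and ASV3 and ASV4 into one named after ASV3, so the merged table has two rows instead of four.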

Value

A new object of class `physeq`, or a list of clusters if the `dna_seq` argument was used.

Author(s)

Adrien Taudière

References

VSEARCH can be downloaded from https://github.com/torognes/vsearch. More information in the associated publication https://pubmed.ncbi.nlm.nih.gov/27781170.

Examples

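# Default clustering with DECIPHER::Clusterize()
if (requireNamespace("DECIPHER")) {
  asv2otu(data_fungi_mini)
}

if (requireNamespace("DECIPHER")) {
  # Use another method of DECIPHER::Clusterize()
  asv2otu(data_fungi_mini, method_clusterize = "longest")

  # Cluster with swarm, only if the swarm software is installed
  if (MiscMetabar::is_swarm_installed()) {
    d_swarm <- asv2otu(data_fungi_mini, method = "swarm")
  }
  # Cluster with vsearch, only if the vsearch software is installed
  if (MiscMetabar::is_vsearch_installed()) {
    d_vs <- asv2otu(data_fungi_mini, method = "vsearch")
  }
}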

[Package MiscMetabar version 0.9.1 Index]
