sortTrees {PhySortR}    R Documentation

Sorts Phylogenetic Trees using Taxa Identifiers

Description

Reads phylogenetic trees from a directory and sorts them based on the presence of Exclusive and Non-Exclusive clades containing a set of given target leaves at a desired support value. Can interpret trees in both Newick and extended Newick format.

Usage

sortTrees(target.groups, min.support = 0, min.prop.target = 0.7,
  in.dir = ".", out.dir = "Sorted_Trees", mode = "l",
  clades.sorted = "E,NE", extension = ".tre", clade.exclusivity = 0.9)

Arguments

target.groups

a set of one or more terms that represent the target leaves whose membership will be tested in each clade during sorting. Separate multiple terms with commas ("Taxon1,Taxon2"). Matching is case sensitive and uses strict string matching, so each taxon identifier must be unique. For example, "plantae" and "Viridiplantae" might not be appropriate together, because the first is a substring of the second.
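
The substring problem described above can be illustrated with base R alone. This is a minimal sketch that assumes the matching behaves like a literal, case-sensitive substring search; it is not the package's own code, and the leaf labels are hypothetical.

```r
# Hypothetical leaf labels; not taken from the package's example data.
leaves <- c("Viridiplantae_Chlamydomonas", "Rhodophyta_Porphyra", "plantae_sp1")

# A comma-separated term string split into individual terms.
terms <- strsplit("Rhodophyta,Viridiplantae", ",", fixed = TRUE)[[1]]

# Literal, case-sensitive substring search (assumed matching behaviour).
hits_plantae <- grepl("plantae", leaves, fixed = TRUE)
hits_viridi  <- grepl("Viridiplantae", leaves, fixed = TRUE)

# "plantae" also matches the Viridiplantae leaf, so the two terms overlap.
overlap <- leaves[hits_plantae & hits_viridi]
```

Here `overlap` holds the one leaf that both terms would claim, which is why identifiers that contain one another are best avoided.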

min.support

the minimum support of a clade, given on either a 0-1 or a 0-100 scale (Default = 0). Support values missing from phylogenetic trees are interpreted as zero.

min.prop.target

the minimum proportion (between 0.0-1.0) of target leaves to be present in a clade out of the total target leaves in the tree (Default = 0.7).
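
As a worked illustration of this threshold, the sketch below compares the share of a tree's target leaves that fall inside a candidate clade against min.prop.target. The counts are hypothetical and the comparison is an assumption about how the threshold is applied, not the package's implementation.

```r
# Hypothetical counts for one tree.
targets_in_tree <- 10    # target leaves anywhere in the tree
min.prop.target <- 0.7   # default threshold

# Candidate clade A holds 8 of the 10 target leaves.
prop_a   <- 8 / targets_in_tree
passes_a <- prop_a >= min.prop.target

# Candidate clade B holds only 6 of the 10 target leaves.
prop_b   <- 6 / targets_in_tree
passes_b <- prop_b >= min.prop.target
```

Under these assumptions clade A (0.8) meets the default threshold and clade B (0.6) does not.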

in.dir

directory containing the phylogenetic trees to be sorted (Default = current working directory).

out.dir

directory to be created within in.dir for the trees identified during sorting. If out.dir is omitted the default of Sorted_Trees/ will be used.

mode

option to "m" (move), "c" (copy) or "l" (list) trees identified during sorting. In "l" mode (default) a list of the sorted trees is returned; in "m" and "c" modes the list is returned and the identified trees are also moved or copied to out.dir.

clades.sorted

option to control if the function will sort for Exclusive ("E") and/or Non-Exclusive ("NE") clades. Specify both options by comma separation "E,NE" (Default). Exclusive clades are also sorted into a sub-group of All Exclusive trees.

extension

the file extension of the tree files to be analyzed (Default = ".tre").

clade.exclusivity

the minimum proportion (0.0 <= x < 1.0) of target leaves, relative to all leaves in the clade including interrupting (non-target) leaves, required in each non-exclusive clade (Default = 0.9).
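
A small arithmetic sketch may help here. It assumes, following the wording of example (3) in the Examples section, that the proportion is the number of target leaves divided by all leaves in the clade. The counts are hypothetical and this is not the package's own code.

```r
# Hypothetical non-exclusive clade: 18 target leaves, 1 interrupting leaf.
n_target     <- 18
n_non_target <- 1

# Assumed definition: target leaves as a share of all leaves in the clade.
prop_in_clade <- n_target / (n_target + n_non_target)   # about 0.947

passes_default <- prop_in_clade >= 0.9    # default clade.exclusivity
passes_strict  <- prop_in_clade >= 0.95   # stricter setting from example (3)
```

With these counts the clade would pass at the default of 0.9 but fail at 0.95.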

Value

Will always return a list containing the names of the trees identified during sorting, irrespective of the mode argument.

Examples

 ### Load data ###
 extdata <- system.file("extdata", package="PhySortR")
 file.copy(dir(extdata, full.names = TRUE), ".")
 dir.create("Algae_Trees/")
 file.copy(dir(extdata, full.names = TRUE), "Algae_Trees/")
 
 ### Examples ###
 # (1) Sorting using 2 target terms, all other parameters default. 
 sortTrees(target.groups = "Rhodophyta,Viridiplantae")
 
 # The function will search in the user's current working directory for files 
 # with the extension ".tre" and check them (using default min.support, 
 # min.prop.target and clade.exclusivity) for Exclusive, All Exclusive or 
 # Non-Exclusive clades. A list will be returned with the names of the trees 
 # identified during sorting. 
 
 
 
 # (2) Sorting with a target directory and an out directory specified.
 sortTrees(target.groups = "Rhodophyta,Viridiplantae",
   in.dir= "Algae_Trees/", 
   out.dir="Sorted_Trees_RVG/", 
   mode = "c")
   
 # The function will search in "Algae_Trees/" for files with the extension
 # ".tre" and check them (using default min.support, min.prop.target, 
 # clade.exclusivity) for Exclusive, All Exclusive or Non-Exclusive clades. 
 # The function will both (a) return a list of the trees identified during 
 # sorting and (b) copy the files into their respective subdirectories of
 # "Algae_Trees/Sorted_Trees_RVG/Exclusive/", 
 # "Algae_Trees/Sorted_Trees_RVG/Exclusive/All_Exclusive/" and 
 # "Algae_Trees/Sorted_Trees_RVG/Non_Exclusive/".
 
 
 
 # (3) Sorting with in/out directories, min.prop.target, min.support,
 # clades.sorted and clade.exclusivity specified.
 sortTrees(target.groups = "Rhodophyta,Viridiplantae",
   min.prop.target = 0.8,
   min.support = 90,
   in.dir= "Algae_Trees/",
   out.dir="Sorted_Trees_RVG_95/",
   mode = "c",
   clades.sorted = "NE",
   clade.exclusivity = 0.95)
   
 # The function will search in "Algae_Trees/" for files with the 
 # extension ".tre" and check them for only Non-Exclusive clades. 
 # A clade will only be defined if it has support >= 90 and contains at least
 # 80% of the total target leaves in the tree. A Non-Exclusive clade must also
 # be composed of >= 95% target taxa (i.e. < 5% non-target taxa).
 # The function will (a) return a list of the trees identified during 
 # sorting and (b) copy the trees identified during sorting to the out 
 # directory "Algae_Trees/Sorted_Trees_RVG_95/Non_Exclusive/".
 
 ### Clean up ###
 unlink("Algae_Trees", recursive=TRUE)
 unlink("Sorted_Trees.log")
 unlink(dir(".", pattern = "\\.tre$"))

[Package PhySortR version 1.0.8 Index]