subsetContigs {SQMtools} | R Documentation
Select contigs
Description
Create an SQM object containing only the requested contigs, the ORFs they contain, and the bins they belong to.
Usage
subsetContigs(
SQM,
contigs,
trusted_functions_only = FALSE,
ignore_unclassified_functions = FALSE,
rescale_tpm = FALSE,
rescale_copy_number = FALSE
)
Arguments
SQM: SQM object to be subset.
contigs: character. Vector of contigs to be selected.
trusted_functions_only: logical. If TRUE, only highly trusted functional annotations (best hit + best average) will be considered when generating aggregated function tables. If FALSE, best hit annotations will be used (default FALSE).
ignore_unclassified_functions: logical. If FALSE, ORFs with no functional classification will be aggregated together into an "Unclassified" category. If TRUE, they will be ignored (default FALSE).
rescale_tpm: logical. If TRUE, TPMs for KEGGs, COGs, and PFAMs will be recalculated so that the TPMs in the subset add up to 1 million. Otherwise, per-function TPMs will be calculated by aggregating the TPMs of the ORFs annotated with that function, and will thus keep the scaling present in the parent object (default FALSE).
rescale_copy_number: logical. If TRUE, copy numbers will be recalculated using the RecA/RadA coverages in the subset. Otherwise, RecA/RadA coverages will be taken from the parent object, so the returned copy number of each function represents the average copy number of that function per genome in the parent object (default FALSE).
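The effect of the two rescaling flags can be pictured with a small base R sketch. It uses made-up numbers and hypothetical object names (orfs, selected, recA_parent, recA_subset) and does not call SQMtools; it only illustrates the arithmetic described above, assuming that a copy number is a function's coverage divided by the RecA/RadA coverage. It is not the package's internal implementation.

```r
# Toy ORF table with TPMs as computed in the parent object (made-up values).
orfs <- data.frame(
  orf    = c("orf1", "orf2", "orf3", "orf4"),
  contig = c("c1", "c1", "c2", "c3"),
  KEGG   = c("K00001", "K00002", "K00001", "K00003"),
  tpm    = c(200000, 300000, 100000, 400000),
  stringsAsFactors = FALSE
)

# Keep only the ORFs located in the selected contigs.
selected <- c("c1", "c2")
sub <- orfs[orfs$contig %in% selected, ]

# rescale_tpm = FALSE analogue: aggregate the parent ORF TPMs per function.
# The values keep the parent scaling, so they need not sum to 1 million.
tpm_parent_scale <- tapply(sub$tpm, sub$KEGG, sum)

# rescale_tpm = TRUE analogue: renormalise so the subset sums to 1 million.
tpm_rescaled <- tpm_parent_scale / sum(tpm_parent_scale) * 1e6

# Copy numbers (assumed here to be function coverage / RecA coverage).
cov_fun_subset <- c(K00001 = 30, K00002 = 10)  # made-up coverages
recA_parent <- 20  # hypothetical RecA/RadA coverage in the parent object
recA_subset <- 5   # hypothetical RecA/RadA coverage within the subset

copy_parent_scale <- cov_fun_subset / recA_parent  # rescale_copy_number = FALSE
copy_rescaled     <- cov_fun_subset / recA_subset  # rescale_copy_number = TRUE
```

With the defaults, the numbers stay comparable to those of the parent object; with the rescaling flags, the subset is treated as if it were a complete metagenome of its own.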
Value
SQM object containing only the selected contigs.
See Also
subsetORFs
Examples
library(SQMtools)
data(Hadza)
# Which contigs have a GC content below 40%?
contigTable = Hadza$contigs$table
lowGCcontigNames = rownames(contigTable[contigTable[,"GC perc"] < 40, ])
lowGCcontigs = subsetContigs(Hadza, lowGCcontigNames)
# Plot the GC distribution of the selected contigs.
hist(lowGCcontigs$contigs$table[,"GC perc"])
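The same selection can also be made with the rescaling arguments switched on. The following is a minimal sketch that assumes SQMtools is installed and that per-sample KEGG TPMs are stored in the functions$KEGG$tpm slot of the SQM object; the variable names are illustrative.

```r
library(SQMtools)
data(Hadza)

# Select contigs with a GC content below 40%, as in the example above.
contigTable <- Hadza$contigs$table
lowGCcontigNames <- rownames(contigTable)[contigTable[, "GC perc"] < 40]

# Default behaviour: keep the scaling of the parent object.
lowGC_parentScale <- subsetContigs(Hadza, lowGCcontigNames)

# Recalculate TPMs and copy numbers within the subset.
lowGC_rescaled <- subsetContigs(Hadza, lowGCcontigNames,
                                rescale_tpm = TRUE,
                                rescale_copy_number = TRUE)

# Assumption: KEGG TPMs live in $functions$KEGG$tpm (functions x samples).
# Compare the per-sample totals under both settings.
colSums(lowGC_parentScale$functions$KEGG$tpm)
colSums(lowGC_rescaled$functions$KEGG$tpm)
```

Comparing the column sums shows whether the per-function TPMs follow the parent scaling or have been renormalised within the subset.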
[Package SQMtools version 1.6.3 Index]