sample_n_obs {metacoder} | R Documentation |
Sample n observations from [taxmap()]
Description
Randomly sample some number of observations from a [taxmap()] object. Weights can be specified for observations or for the taxa they are classified by. Any variable name that appears in [all_names()] can be used as if it were a vector on its own. See [dplyr::sample_n()] for the inspiration for this function. Calling the function using the 'obj$sample_n_obs(...)' style edits 'obj' in place, unlike most R functions. However, calling the function using the 'sample_n_obs(obj, ...)' style imitates R's traditional copy-on-modify semantics, so 'obj' is not changed; instead, a modified copy is returned, like most R functions.
Usage
obj$sample_n_obs(data, size, replace = FALSE, taxon_weight = NULL, obs_weight = NULL, use_supertaxa = TRUE, collapse_func = mean, ...)
sample_n_obs(obj, data, size, replace = FALSE, taxon_weight = NULL, obs_weight = NULL, use_supertaxa = TRUE, collapse_func = mean, ...)
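The difference between the two calling styles can be illustrated with a toy stand-in written in base R. This is only a sketch of the semantics, not how metacoder implements them: a plain environment plays the role of the taxmap object, and `make_toy`, `sample_in_place`, and `sample_copy` are hypothetical helpers invented for this illustration.

```r
# Toy stand-in for a taxmap object: an environment holding one dataset.
# (Hypothetical helper; real taxmap objects are created by metacoder.)
make_toy <- function() {
  obj <- new.env()
  obj$data <- list(info = data.frame(id = 1:6))
  obj
}

# In-place style, analogous to obj$sample_n_obs(...):
# the environment passed in is itself modified.
sample_in_place <- function(obj, size) {
  keep <- sample(nrow(obj$data$info), size)
  obj$data$info <- obj$data$info[keep, , drop = FALSE]
  invisible(obj)
}

# Copy style, analogous to sample_n_obs(obj, ...):
# a copy is made first, so the caller's object is left alone.
sample_copy <- function(obj, size) {
  clone <- list2env(as.list(obj, all.names = TRUE))
  sample_in_place(clone, size)
}

toy <- make_toy()
sampled <- sample_copy(toy, 2)
nrow(toy$data$info)      # the original still has all of its rows
nrow(sampled$data$info)  # the returned copy holds the sampled rows

sample_in_place(toy, 2)
nrow(toy$data$info)      # the original itself has now been reduced
```

With a real taxmap object, the practical consequence is that the result of 'sample_n_obs(obj, ...)' must be assigned to a variable to be kept, while 'obj$sample_n_obs(...)' changes 'obj' directly.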
Arguments
obj |
([taxmap()]) The object to sample from. |
data |
Dataset names, indexes, or a logical vector that indicates which datasets in 'obj$data' to sample. If multiple datasets are sampled at once, then they must be the same length. |
size |
('numeric' of length 1) The number of observations to sample. |
replace |
('logical' of length 1) If 'TRUE', sample with replacement. |
taxon_weight |
('numeric') Non-negative sampling weights of each taxon. If 'use_supertaxa' is 'TRUE', the weights for each taxon in an observation's classification are supplied to 'collapse_func' to get the observation weight. If 'obs_weight' is also specified, the two weights are multiplied (after 'taxon_weight' for each observation is calculated). |
obs_weight |
('numeric') Sampling weights of each observation. If 'taxon_weight' is also specified, the two weights are multiplied (after 'taxon_weight' for each observation is calculated). |
use_supertaxa |
('logical' or 'numeric' of length 1) Affects how 'taxon_weight' is used. If 'TRUE', the weights of all taxa in an observation's classification (the taxon it is assigned to and all of its supertaxa) are supplied to 'collapse_func' to get the observation weight. If 'FALSE', only the taxon the observation is assigned to is considered. Positive numbers indicate the number of ranks above each taxon to use. '0' is equivalent to 'FALSE'. Negative numbers are equivalent to 'TRUE'. |
collapse_func |
('function' of length 1) If the 'taxon_weight' option is used and 'use_supertaxa' is 'TRUE', the weights for each taxon in an observation's classification are supplied to 'collapse_func' to get the observation weight. This function should take a numeric vector and return a single number. |
... |
Additional options are passed to [filter_obs()]. |
target |
DEPRECATED. Use 'data' instead. |
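How 'taxon_weight', 'use_supertaxa', 'collapse_func', and 'obs_weight' interact, as described in the arguments above, can be sketched in base R. This is an illustration of the documented behavior only, not metacoder's actual code. The taxon names, the weights, and the helper `combined_weight` are hypothetical, and only the logical form of 'use_supertaxa' is handled.

```r
# Hypothetical classifications: supertaxa first, the assigned taxon last.
classifications <- list(
  obs1 = c("Mammalia", "Felidae", "Panthera"),
  obs2 = c("Mammalia", "Hominidae", "Homo"),
  obs3 = c("Plantae", "Solanaceae", "Solanum")
)

# Hypothetical per-taxon and per-observation sampling weights.
taxon_weight <- c(Mammalia = 2, Felidae = 1, Panthera = 3, Hominidae = 1,
                  Homo = 5, Plantae = 1, Solanaceae = 1, Solanum = 1)
obs_weight <- c(obs1 = 1, obs2 = 2, obs3 = 4)

# Hypothetical helper: one final sampling weight per observation.
combined_weight <- function(classifications, taxon_weight, obs_weight,
                            use_supertaxa = TRUE, collapse_func = mean) {
  per_obs <- vapply(classifications, function(taxa) {
    if (use_supertaxa) {
      # Collapse the weights of the whole classification to one number.
      collapse_func(taxon_weight[taxa])
    } else {
      # Use only the taxon the observation is assigned to.
      unname(taxon_weight[taxa[length(taxa)]])
    }
  }, numeric(1))
  # The taxon-derived weight is computed first, then multiplied by obs_weight.
  per_obs * obs_weight[names(classifications)]
}

w_super <- combined_weight(classifications, taxon_weight, obs_weight)
w_own <- combined_weight(classifications, taxon_weight, obs_weight,
                         use_supertaxa = FALSE)

# Weighted sampling without replacement using the combined weights.
picked <- sample(names(w_super), size = 2, replace = FALSE, prob = w_super)
```

In this sketch, 'obs2' gets a combined weight of mean(2, 1, 5) * 2 with the default settings and 5 * 2 when 'use_supertaxa' is 'FALSE'. Passing a different 'collapse_func', such as 'prod' or 'max', changes only the first factor.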
Value
An object of type [taxmap()]
See Also
Other taxmap manipulation functions: arrange_obs(), arrange_taxa(), filter_obs(), filter_taxa(), mutate_obs(), sample_frac_obs(), sample_frac_taxa(), sample_n_taxa(), select_obs(), transmute_obs()
Examples
# Sample 2 rows without replacement
sample_n_obs(ex_taxmap, "info", 2)
sample_n_obs(ex_taxmap, "foods", 2)
# Sample with replacement
sample_n_obs(ex_taxmap, "info", 10, replace = TRUE)
# Sample some rows more often than others
sample_n_obs(ex_taxmap, "info", 3, obs_weight = n_legs)
# Sample multiple datasets at once
sample_n_obs(ex_taxmap, c("info", "phylopic_ids", "foods"), 3)