downsample {rliger} | R Documentation |
Downsample datasets
Description
This function mainly aims at downsampling datasets to a size suitable for plotting or expensive in-memory calculation.
Users can balance the sample size across categories of interest with balance. Specifying multiple variables in balance is supported, so that at most maxCells cells will be sampled from each combination of categories from the variables. For example, when two datasets are present and three clusters are labeled across them, at most 2 × 3 × maxCells cells would be selected. Note that "dataset" is automatically added as one variable when balancing the downsampling. However, to balance the downsampling based solely on dataset origin, users have to explicitly set balance = "dataset".
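The balancing rule described above can be illustrated with a minimal base R sketch. This is not rliger's implementation: the data frame `meta`, its columns, and the helper `balancedIndex()` are hypothetical names introduced only for illustration. The sketch assumes that balancing means drawing at most `maxCells` cells, without replacement, from each combination of the balancing variables.

```r
# Hypothetical cell metadata: 2 datasets x 3 clusters, 50 cells per combination
meta <- data.frame(
  dataset = rep(c("ctrl", "stim"), each = 150),
  cluster = rep(rep(c("c1", "c2", "c3"), each = 50), times = 2)
)

# Hypothetical helper (not part of rliger): sample at most maxCells row
# indices from every combination of the balancing variables
balancedIndex <- function(meta, vars, maxCells, seed = 1) {
  set.seed(seed)
  # One group of row indices per observed combination of categories
  groups <- split(seq_len(nrow(meta)), interaction(meta[vars], drop = TRUE))
  picked <- lapply(groups, function(idx) {
    # Indexing via sample.int() avoids sample()'s special case for length-1 input
    idx[sample.int(length(idx), min(length(idx), maxCells))]
  })
  sort(unlist(picked, use.names = FALSE))
}

idx <- balancedIndex(meta, c("dataset", "cluster"), maxCells = 10)
length(idx)  # 2 * 3 * 10 = 60
table(meta$dataset[idx], meta$cluster[idx])
```

Groups smaller than `maxCells` are kept whole in this sketch, which is why the product is an upper bound rather than an exact count.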
Usage
downsample(
object,
balance = NULL,
maxCells = 1000,
useDatasets = NULL,
seed = 1,
returnIndex = FALSE,
...
)
Arguments
object |
liger object |
balance |
Character vector of categorical variable names in
|
maxCells |
Max number of cells to sample from the grouping based on
|
useDatasets |
Index selection of datasets to include. Default NULL.
|
seed |
Random seed for reproducibility. Default 1. |
returnIndex |
Logical, whether to only return the numeric index that can
subset the original object instead of a subset object. Default FALSE. |
... |
Arguments passed to |
Value
By default, a subset of the liger object. Alternatively, when returnIndex = TRUE, a numeric vector to be used with the original object.
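The difference between the two return types can be sketched in base R, using a plain matrix as a stand-in for a dataset. The matrix `counts` and the variable names below are hypothetical and do not involve any rliger object. The sketch only shows how a numeric index relates to the subset it produces, and that fixing the seed makes the index reproducible.

```r
# Hypothetical stand-in for a dataset: a gene x cell matrix
counts <- matrix(
  seq_len(40), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:10))
)

# A numeric index of sampled cells (columns), analogous to returnIndex = TRUE
set.seed(1)
idx <- sort(sample.int(ncol(counts), 5))

# Applying the index to the original data gives the corresponding subset
subsetCounts <- counts[, idx, drop = FALSE]
dim(subsetCounts)

# Resetting the same seed reproduces the same index
set.seed(1)
idx2 <- sort(sample.int(ncol(counts), 5))
identical(idx, idx2)
```

Returning only the index leaves the original object unchanged, so the same index can be reused across several plotting or analysis calls.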
Examples
# Subsetting an object
pbmc <- downsample(pbmc)
# Creating a subsetting index
sampleIdx <- downsample(pbmcPlot, balance = "leiden_cluster",
                        maxCells = 10, returnIndex = TRUE)
plotClusterDimRed(pbmcPlot, cellIdx = sampleIdx)