reclassify {FateID} | R Documentation |
Reclassification of cells
Description
This function attempts to reassign additional cells in the dataset to one of the target clusters.
Usage
reclassify(
x,
y,
tar,
z = NULL,
clthr = 0.75,
nbfactor = 5,
use.dist = FALSE,
seed = NULL,
nbtree = NULL,
q = 0.9,
...
)
Arguments
x |
expression data frame with genes as rows and cells as columns. Gene IDs should be given as row names and cell IDs should be given as column names. This can be a reduced expression table only including the features (genes) to be used in the analysis. |
y |
clustering partition. A vector with an integer cluster number for each cell. The order of the cells has to be the same as for the columns of x. |
tar |
vector of integers representing target cluster numbers. Each element of |
z |
Matrix containing cell-to-cell distances to be used in the fate bias computation. Default is NULL. |
clthr |
real number between zero and one. This is the threshold for the fraction of random forest votes required to assign a cell not contained within the target clusters to one of these clusters. The value should be high enough that only cells with a high-confidence assignment are reclassified. Default value is 0.75. |
nbfactor |
positive integer number. Determines the number of trees grown for each random forest. The number of trees is given by the number of columns of the training set multiplied by nbfactor. Default value is 5. |
use.dist |
logical value. If |
seed |
integer seed for initialization. If equal to |
nbtree |
integer value. If given, it specifies the number of trees for each random forest explicitly. Default is NULL. |
q |
real value between zero and one. This number specifies a threshold used for feature selection based on importance sampling. A reduced expression table is generated containing only features with an importance larger than the q-quantile for at least one of the classes (i. e. target clusters). Default value is 0.9. |
... |
additional arguments to be passed to the low level function randomForest. |
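The interplay of nbfactor and nbtree described above can be sketched as follows. This is an illustration of the documented rule only, not FateID's internal code; the helper name n_trees and the toy training matrix are hypothetical.

```r
# Hypothetical helper: number of trees per random forest, following the
# argument descriptions above (an assumption, not the package source).
n_trees <- function(train, nbfactor = 5, nbtree = NULL) {
  # An explicit nbtree takes precedence over the nbfactor rule.
  if (!is.null(nbtree)) {
    return(nbtree)
  }
  # Otherwise: number of columns of the training set times nbfactor.
  ncol(train) * nbfactor
}

train <- matrix(0, nrow = 3, ncol = 4)  # toy training set with 4 columns

n_trees(train)                 # 4 columns * 5
n_trees(train, nbtree = 100)   # explicit value overrides nbfactor
```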
Details
The function uses random forest based supervised learning to assign cells not contained in the target clusters to one of these clusters. All cells not within any of the target clusters which receive a fraction of votes larger than clthr for one of the target clusters will be reassigned to this cluster. Since this function is designed to reclassify cells only if they can be assigned with high confidence, a high value of clthr (e. g. > 0.75) should be applied.
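The vote threshold rule can be illustrated with a minimal base R sketch. The vote fractions and cluster labels below are made up for illustration, and the code shows only the thresholding step as described in this section, not the way FateID computes votes internally.

```r
# Hypothetical vote fractions for four non-target cells over the
# target clusters 6, 9 and 13 (each row sums to one).
votes <- matrix(
  c(0.80, 0.15, 0.05,
    0.40, 0.35, 0.25,
    0.10, 0.05, 0.85,
    0.75, 0.20, 0.05),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("6", "9", "13"))
)

# Current (non-target) cluster labels of the same cells, also hypothetical.
part  <- c(cell1 = 2, cell2 = 2, cell3 = 4, cell4 = 7)
clthr <- 0.75

# Target cluster with the most votes and the corresponding vote fraction.
best      <- colnames(votes)[max.col(votes, ties.method = "first")]
best_frac <- apply(votes, 1, max)

# Reassign only cells whose top vote fraction is strictly larger than clthr.
reassign       <- best_frac > clthr
part[reassign] <- as.integer(best[reassign])

part
```

In this toy input, cell1 and cell3 exceed the threshold and move to clusters 6 and 13. cell2 has no clear majority, and cell4 sits exactly at the threshold, so both keep their original labels.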
Value
A list with the following three components:
part |
A vector with the revised cluster assignment for each cell in the same order as in the input argument y. |
rf |
The random forest object generated for the reclassification, with importance sampling enabled (see randomForest). |
xf |
A filtered expression table with features selected based on importance sampling. Only features with an importance larger than the q-quantile for at least one of the classes are retained. |
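The quantile-based feature filter behind xf can be sketched in base R. The importance scores and expression table below are hypothetical, and the sketch assumes that the q-quantile is taken separately for each class; the package may compute the threshold differently.

```r
# Hypothetical per-class importance scores: 5 genes x 2 target clusters.
imp <- matrix(
  c(0.90, 0.10,
    0.20, 0.80,
    0.10, 0.10,
    0.30, 0.20,
    0.05, 0.30),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), c("6", "9"))
)
q <- 0.75

# Assumption: one q-quantile threshold per class (column).
thr <- apply(imp, 2, quantile, probs = q)

# Keep a gene if its importance exceeds the threshold in at least one class.
keep <- rowSums(sweep(imp, 2, thr, ">")) > 0

# Toy expression table with genes as rows and cells as columns.
x <- as.data.frame(matrix(
  1:15, nrow = 5,
  dimnames = list(rownames(imp), paste0("cell", 1:3))
))

# Reduced expression table containing only the retained genes.
xf <- x[names(keep)[keep], , drop = FALSE]
rownames(xf)
```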
Examples
x <- intestine$x
y <- intestine$y
tar <- c(6,9,13)
rc <- reclassify(x,y,tar,z=NULL,nbfactor=5,use.dist=FALSE,seed=NULL,nbtree=NULL,q=.9)
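To see which cells changed cluster, the original partition can be compared with the revised one. The sketch below uses two short made-up vectors in place of y and rc$part so that it runs without the package; with real output, the same calls would apply to y and rc$part.

```r
# Hypothetical stand-ins for the input partition y and the revised
# partition rc$part (same cell order in both vectors).
y_old <- c(1, 1, 2, 6, 6, 3)
y_new <- c(1, 6, 2, 6, 6, 3)

# Indices of cells whose cluster assignment changed.
moved <- which(y_old != y_new)
moved

# Cross-tabulation of original versus revised cluster labels.
table(original = y_old, revised = y_new)
```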