PlotLossvsLatentFactors {jrSiCKLSNMF} | R Documentation |
Create plots to help determine the number of latent factors
Description
Generate plots of the lowest achieved loss after a pre-specified number of iterations (default 100) for each candidate number of latent factors (defaults to 2:20). The plot is read like a scree plot: select the number of latent factors that corresponds to the elbow of the plot. This method is not appropriate for larger datasets (more than 1000 cells).
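As a rough illustration of what "the elbow" means, the sketch below locates the point of sharpest flattening in a loss curve using a discrete second difference. This is only a heuristic and is not what the package does; the function expects the elbow to be chosen by inspecting the plot. The data frame `elbow_df` and its columns `d` and `loss` are hypothetical, and the loss values are made up for illustration.

```r
# Illustrative values only: made-up losses for d = 2, ..., 8.
# These are NOT output from jrSiCKLSNMF.
elbow_df <- data.frame(
  d    = 2:8,
  loss = c(1000, 700, 520, 480, 465, 455, 450)
)

# Discrete second difference of the loss. A large positive value means the
# curve drops steeply before that point and flattens after it.
second_diff <- diff(elbow_df$loss, differences = 2)

# second_diff[i] is centred on d[i + 1], so shift the index by one.
elbow_d <- elbow_df$d[which.max(second_diff) + 1]
elbow_d
```

A numeric rule like this can disagree with the visual impression when the curve is noisy, so it is best treated as a cross-check on the plot rather than a replacement for it.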
Usage
PlotLossvsLatentFactors(
SickleJr,
rounds = 100,
differr = 1e-04,
d_vector = c(2:20),
parallel = FALSE,
nCores = detectCores() - 1,
subsampsize = NULL,
minibatch = FALSE,
random = FALSE,
random_W_updates = FALSE,
seed = NULL,
batchsize = -1,
lossonsubset = FALSE,
losssubsetsize = dim(SickleJr@count.matrices[[1]])[2]
)
Arguments
SickleJr |
An object of class SickleJr |
rounds |
Number of rounds to use: defaults to 100; this process is time consuming, so a high number of rounds is not recommended |
differr |
Tolerance for the percentage update in the likelihood: for these plots,
this defaults to 1e-04 |
d_vector |
Vector of numbers of latent factors d to test; defaults to 2:20 |
parallel |
Boolean indicating whether to use parallel computation |
nCores |
Number of desired cores; defaults to the number of cores of the current machine minus 1 for convenience |
subsampsize |
Size of the random subsample (defaults to NULL) |
minibatch |
Boolean indicating whether to use the mini-batch algorithm: default is FALSE |
random |
Boolean indicating whether to use random initialization to generate the \mathbf{W}^v matrices and \mathbf{H} matrix; defaults to FALSE |
random_W_updates |
Boolean parameter for mini-batch algorithm; if |
seed |
Number representing the random seed |
batchsize |
Desired batch size; do not use if using a subsample |
lossonsubset |
Boolean indicating whether to calculate the loss on a subset rather than the full dataset; speeds up computation for larger datasets |
losssubsetsize |
Number of cells to use for the loss subset; default is total number of cells |
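The default for nCores is `detectCores() - 1`, which evaluates to 0 on a single-core machine, and `detectCores()` can return `NA` on some platforms. A minimal sketch of a guarded core count is shown here; the variable names `available` and `n_cores` are illustrative and not part of the package.

```r
library(parallel)

# detectCores() may return NA when the core count cannot be determined.
available <- detectCores()

# Leave one core free when possible, but never go below one core.
n_cores <- if (is.na(available)) 1L else max(1L, available - 1L)
n_cores
```

The result can then be passed as `nCores = n_cores` together with `parallel = TRUE`.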
Value
An object of class SickleJr with a list of initialized \mathbf{W}^v matrices and an \mathbf{H} matrix for each latent factor d\in\{1,...,D\} added to the WHinitials slot, a data frame holding relevant values for plotting the elbow plot added to the latent.factor.elbow.values slot, diagnostic plots of the loss vs. the number of latent factors added to the plots slot, and the cell indices used to calculate the loss on the subsample added to the lossCalcSubSample slot
References
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis, 2nd edition. Springer International Publishing, Cham, Switzerland. ISBN 978-3-319-24277-4, doi:10.1007/978-3-319-24277-4, https://ggplot2.tidyverse.org/.
Examples
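# Assign an empty data frame to the slot that stores the elbow plot values.
SimSickleJrSmall@latent.factor.elbow.values <- data.frame(NULL, NULL)
# Compute the loss for d = 2, ..., 5 with 5 rounds each, without parallel computation.
SimSickleJrSmall <- PlotLossvsLatentFactors(SimSickleJrSmall, d_vector = c(2:5),
  rounds = 5, parallel = FALSE)
# Next, we compute 2 of these in parallel.
## Not run:
SimSickleJrSmall <- PlotLossvsLatentFactors(SimSickleJrSmall,
  d_vector = c(6:7), rounds = 5, parallel = TRUE, nCores = 2)
## End(Not run)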