sig_estimate {sigminer} | R Documentation |
Estimate Signature Number
Description
Use the NMF package to evaluate the optimal number of signatures.
This is used along with sig_extract.
Users should call library(NMF) first. If NMF objects are returned,
the result can be further visualized by NMF plot methods such as
NMF::consensusmap() and NMF::basismap().
sig_estimate() shows the comprehensive rank survey generated by the
NMF package, but it is sometimes hard to consider all measures at once.
show_sig_number_survey() provides a visualization with one or two y-axes
to help users determine the optimal signature number (by default it shows
both stability ("cophenetic") and error (RSS)).
Users can also set custom measures to show.
show_sig_number_survey2() is modified from the NMF package to
better help users explore the survey of signature numbers.
Usage
sig_estimate(
  nmf_matrix,
  range = 2:5,
  nrun = 10,
  use_random = FALSE,
  method = "brunet",
  seed = 123456,
  cores = 1,
  keep_nmfObj = FALSE,
  save_plots = FALSE,
  plot_basename = file.path(tempdir(), "nmf"),
  what = "all",
  verbose = FALSE
)

show_sig_number_survey(
  object,
  x = "rank",
  left_y = "cophenetic",
  right_y = "rss",
  left_name = left_y,
  right_name = toupper(right_y),
  left_color = "black",
  right_color = "red",
  left_shape = 16,
  right_shape = 18,
  shape_size = 4,
  highlight = NULL
)

show_sig_number_survey2(
  x,
  y = NULL,
  what = c("all", "cophenetic", "rss", "residuals", "dispersion", "evar", "sparseness",
    "sparseness.basis", "sparseness.coef", "silhouette", "silhouette.coef",
    "silhouette.basis", "silhouette.consensus"),
  na.rm = FALSE,
  xlab = "Total signatures",
  ylab = "",
  main = "Signature number survey using NMF package"
)
Arguments
nmf_matrix |
a |
range |
a |
nrun |
a |
use_random |
Should generate random data from input to test measurements. Default is |
method |
specification of the NMF algorithm. Use 'brunet' as default. Available methods for NMF decompositions are 'brunet', 'lee', 'ls-nmf', 'nsNMF', 'offset'. |
seed |
specification of the starting point or seeding method, which will compute a starting point, usually using data from the target matrix in order to provide a good guess. |
cores |
number of CPU cores used to run NMF. |
keep_nmfObj |
default is |
save_plots |
if |
plot_basename |
when saving plots, set a custom basename for the file path. |
what |
a character vector whose elements partially match one of the following items,
which correspond to the measures computed by |
verbose |
if |
object |
a |
x |
a |
left_y |
column name for left y axis. |
right_y |
column name for right y axis. |
left_name |
label name for left y axis. |
right_name |
label name for right y axis. |
left_color |
color for left axis. |
right_color |
color for right axis. |
left_shape , right_shape , shape_size |
shape setting. |
highlight |
a |
y |
for random simulation,
a |
na.rm |
single logical that specifies if the rank
for which the measures are NA values should be removed
from the graph or not (default to |
xlab |
x-axis label |
ylab |
y-axis label |
main |
main title |
Details
The most common approach is to choose the smallest rank at which the cophenetic correlation coefficient starts decreasing (this is the approach used by this function). Another approach is to choose the rank at which the plot of the residual sum of squares (RSS) between the input matrix and its estimate shows an inflection point. For more custom features, use NMF::nmfEstimateRank directly.
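The two selection rules described above can be sketched in base R as follows. This is only an illustration, not the code used by sig_estimate() or the NMF package: the survey values are made up, the helper names pick_rank_by_cophenetic() and pick_rank_by_rss_elbow() are hypothetical, and treating the largest second difference of RSS as the inflection point is an assumed heuristic. Only the column names rank, cophenetic and rss follow the defaults shown in Usage.

```r
# Hypothetical survey table; the numbers are invented for illustration only.
survey <- data.frame(
  rank       = 2:6,
  cophenetic = c(0.970, 0.985, 0.992, 0.960, 0.940),
  rss        = c(1200, 800, 520, 480, 455)
)

# Rule 1: smallest rank after which the cophenetic coefficient starts decreasing.
# (Hypothetical helper, not part of sigminer or NMF.)
pick_rank_by_cophenetic <- function(survey) {
  survey <- survey[order(survey$rank), ]
  # Positions i where cophenetic[i + 1] < cophenetic[i]
  drops <- which(diff(survey$cophenetic) < 0)
  if (length(drops) == 0) {
    return(NA_integer_) # the coefficient never decreases in the surveyed range
  }
  survey$rank[drops[1]]
}

# Rule 2: rank at the "elbow" of the RSS curve, taken here (as an assumption)
# to be the point with the largest second difference.
pick_rank_by_rss_elbow <- function(survey) {
  survey <- survey[order(survey$rank), ]
  if (nrow(survey) < 3) {
    return(NA_integer_) # a second difference needs at least three ranks
  }
  curvature <- diff(survey$rss, differences = 2)
  # curvature[i] is centred on row i + 1 of the survey
  survey$rank[which.max(curvature) + 1]
}

rank_cophenetic <- pick_rank_by_cophenetic(survey)
rank_rss <- pick_rank_by_rss_elbow(survey)
print(c(cophenetic = rank_cophenetic, rss = rank_rss))
```

On a real run, the same helpers could be applied to the survey data.frame returned by sig_estimate(). Because the measures are noisy, the result is better treated as a starting point for visual inspection than as a final answer.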
Value
sig_estimate: a list containing information about the NMF run and the rank survey.
show_sig_number_survey: a ggplot object.
show_sig_number_survey2: a ggplot object.
Author(s)
Shixiang Wang
References
Gaujoux, Renaud, and Cathal Seoighe. "A flexible R package for nonnegative matrix factorization." BMC bioinformatics 11.1 (2010): 367.
See Also
sig_extract for extracting signatures using the NMF package, sig_auto_extract for extracting signatures using the automatic relevance determination technique.
sig_estimate for estimating the signature number for sig_extract, show_sig_number_survey2 for more visualization methods.
Examples
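# Load the toy copy number tally data shipped with sigminer
load(system.file("extdata", "toy_copynumber_tally_W.RData",
  package = "sigminer", mustWork = TRUE
))
library(NMF)
# Survey the default range of signature numbers with 5 runs each
cn_estimate <- sig_estimate(cn_tally_W$nmf_matrix,
  cores = 1, nrun = 5,
  verbose = TRUE
)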
p <- show_sig_number_survey2(cn_estimate$survey)
p
# Show two measures
show_sig_number_survey(cn_estimate)
# Show one measure
p1 <- show_sig_number_survey(cn_estimate, right_y = NULL)
p1
p2 <- add_h_arrow(p, x = 4.1, y = 0.953, label = "selected number")
p2
# Show data from a data.frame
p3 <- show_sig_number_survey(cn_estimate$survey)
p3
# Show other measures
head(cn_estimate$survey)
p4 <- show_sig_number_survey(cn_estimate$survey,
  right_y = "dispersion",
  right_name = "dispersion"
)
p4
p5 <- show_sig_number_survey(cn_estimate$survey,
  right_y = "evar",
  right_name = "evar"
)
p5