wilma {supclust} | R Documentation |
Supervised Clustering of Predictor Variables
Description
Performs supervised clustering of predictor variables for large (microarray gene expression) datasets. Uses a greedy forward strategy and optimizes a combination of the Wilcoxon and margin statistics to find the clusters.
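The greedy forward idea can be illustrated with a small base-R sketch. It is not the supclust implementation: the helpers `score_sketch`, `margin_sketch` and `greedy_cluster_sketch` are hypothetical, the two statistics follow one plausible reading (a count of wrongly ordered class-0/class-1 pairs, and the gap between the classes), and the cluster representative is assumed to be the mean of the selected variables. The package functions `score` and `margin` remain the authoritative definitions.

```r
## Hypothetical helpers for illustration; not the supclust implementation.
## Assumed score: number of (class 0, class 1) pairs that are wrongly ordered.
score_sketch  <- function(v, y) sum(outer(v[y == 0], v[y == 1], ">="))
## Assumed margin: gap between the lowest class-1 and highest class-0 value.
margin_sketch <- function(v, y) min(v[y == 1]) - max(v[y == 0])

## Greedy forward search for one cluster: repeatedly add the variable whose
## inclusion in the cluster mean gives the lowest score (ties broken by the
## larger margin), and stop when neither statistic improves.
greedy_cluster_sketch <- function(x, y) {
  chosen <- integer(0)
  best <- c(score = Inf, margin = -Inf)
  repeat {
    cand <- setdiff(seq_len(ncol(x)), chosen)
    if (length(cand) == 0) break
    stats <- sapply(cand, function(j) {
      rep_j <- rowMeans(x[, c(chosen, j), drop = FALSE])
      c(score = score_sketch(rep_j, y), margin = margin_sketch(rep_j, y))
    })
    k <- order(stats["score", ], -stats["margin", ])[1]
    improved <- stats["score", k] < best["score"] ||
      (stats["score", k] == best["score"] &&
         stats["margin", k] > best["margin"])
    if (!improved) break
    chosen <- c(chosen, cand[k])
    best <- stats[, k]
  }
  list(genes = chosen, score = best[["score"]], margin = best[["margin"]])
}

## Toy data: 6 observations (3 per class), 3 variables
y <- c(0, 0, 0, 1, 1, 1)
x <- cbind(c(0.1, 1.0, 0.3, 0.8, 0.9, 1.5),
           c(0.5, 0.0, 0.4, 0.6, 1.4, 0.2),
           c(0.9, 0.2, 0.7, 0.0, 0.8, 0.3))
res <- greedy_cluster_sketch(x, y)
res$genes   # variables 1 and 2 are selected
res$score   # 0: their mean separates the two classes
```

In this toy example neither variable separates the classes on its own, but their mean does, which is the kind of group effect a supervised clustering is meant to pick up.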
Usage
wilma(x, y, noc, genes = NULL, flip = TRUE, once.per.clust = FALSE, trace = 0)
Arguments
x |
Numeric matrix of explanatory variables ( |
y |
Numeric vector of length |
noc |
Integer, the number of clusters to search for in the data. |
genes |
Defaults to |
flip |
Logical, defaults to |
once.per.clust |
Logical, defaults to |
trace |
Integer >= 0; when positive, the output of the internal loops is provided; |
Value
wilma returns an object of class "wilma". The functions print and summary give an overview of the clusters that have been found. The function plot yields a two-dimensional projection into the space of the first two clusters that wilma found. The generic function fitted returns the fitted values, which are the cluster representatives. Finally, predict classifies test data on the basis of Wilma's clusters, using either the nearest-neighbor rule, diagonal linear discriminant analysis, logistic regression or aggregated trees.
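As a rough illustration of the nearest-neighbor rule applied to cluster representatives, the base-R sketch below assigns each test observation the label of its closest training observation. It does not call predict and is not the package's code: the helper `nn_rule_sketch`, the use of a single neighbor with Euclidean distance, and the toy matrices are all assumptions made for the example.

```r
## Hypothetical helper for illustration; not the supclust implementation.
## train_rep: n x noc matrix of cluster representatives (training data)
## test_rep:  m x noc matrix of cluster representatives (test data)
nn_rule_sketch <- function(train_rep, y, test_rep) {
  apply(test_rep, 1, function(z) {
    ## squared Euclidean distance from z to every training observation
    d <- colSums((t(train_rep) - z)^2)
    y[which.min(d)]
  })
}

## Toy representatives for two clusters
train_rep <- rbind(c(0.0, 0.0), c(0.2, 0.1), c(1.0, 1.0), c(0.9, 1.2))
y_train   <- c(0, 0, 1, 1)
test_rep  <- rbind(c(0.1, 0.0), c(1.1, 0.9))

pred <- nn_rule_sketch(train_rep, y_train, test_rep)
pred   # 0 1
```

Working in the space of a few cluster representatives rather than all original variables keeps the distance computation low-dimensional.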
An object of class "wilma" is a list containing:
clist |
A list of length |
steps |
Numerical vector of length |
y |
Numeric vector of length |
x.means |
A list of length |
noc |
Integer, the number of clusters that were searched for in the data. |
signs |
Numerical vector of length |
Author(s)
Marcel Dettling, dettling@stat.math.ethz.ch
References
Marcel Dettling (2002). Supervised Clustering of Genes, see https://stat.ethz.ch/~dettling/supercluster.html
Marcel Dettling and Peter Bühlmann (2002). Supervised Clustering of Genes. Genome Biology, 3(12): research0069.1-0069.15, doi: 10.1186/gb-2002-3-12-research0069.
Marcel Dettling and Peter Bühlmann (2004). Finding Predictive Gene Groups from Microarray Data. Journal of Multivariate Analysis, 90, 106–131, doi: 10.1016/j.jmva.2004.02.012.
See Also
score, margin, and for a newer methodology, pelora.
Examples
## Working with a "real" microarray dataset
data(leukemia, package="supclust")
## Generating random test data: 3 observations and 250 variables (genes)
set.seed(724)
xN <- matrix(rnorm(750), nrow = 3, ncol = 250)
## Fitting Wilma
fit <- wilma(leukemia.x, leukemia.y, noc = 3, trace = 1)
## Working with the output
fit
summary(fit)
plot(fit)
fitted(fit)
## Class predictions and fitted values for the training data
predict(fit, type = "cla")
predict(fit, type = "fitt")
## Predicting fitted values and class labels for test data
predict(fit, newdata = xN)
predict(fit, newdata = xN, type = "cla", classifier = "nnr", noc = c(1,2,3))
predict(fit, newdata = xN, type = "cla", classifier = "dlda", noc = c(1,3))
predict(fit, newdata = xN, type = "cla", classifier = "logreg")
predict(fit, newdata = xN, type = "cla", classifier = "aggtrees")