pelora {supclust} | R Documentation |
Supervised Grouping of Predictor Variables
Description
Performs selection and supervised grouping of predictor variables in large (microarray gene expression) datasets, with an option for simultaneous classification. The algorithm follows a greedy forward strategy and optimizes the binomial log-likelihood, using conditional probabilities estimated by penalized logistic regression.
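The greedy forward idea can be illustrated with a minimal base R sketch of a single selection step. This is not the package's implementation: the helper names (softplus, binom_loglik, penalized_loglik, greedy_step) are hypothetical, a plain ridge penalty on the slope stands in for the rescaled penalty, the group is represented by the row mean of its members, and the toy data are simulated.

```r
## Numerically stable log(1 + exp(eta)).
softplus <- function(eta) pmax(eta, 0) + log1p(exp(-abs(eta)))

## Binomial log-likelihood of 0/1 labels y under the linear predictor eta.
binom_loglik <- function(eta, y) sum(y * eta - softplus(eta))

## Fit intercept + slope on one predictor z with a ridge penalty on the slope
## (assumed penalty form) and return the attained log-likelihood.
penalized_loglik <- function(z, y, lambda = 1/32) {
  objective <- function(b) -binom_loglik(b[1] + b[2] * z, y) + lambda * b[2]^2
  b <- optim(c(0, 0), objective, method = "BFGS")$par
  binom_loglik(b[1] + b[2] * z, y)
}

## One greedy forward step: try each gene not yet in the group, average it
## with the current members, and keep the gene with the best log-likelihood.
greedy_step <- function(x, y, group = integer(0), lambda = 1/32) {
  candidates <- setdiff(seq_len(ncol(x)), group)
  scores <- vapply(candidates, function(j) {
    centroid <- rowMeans(x[, c(group, j), drop = FALSE])
    penalized_loglik(centroid, y, lambda)
  }, numeric(1))
  candidates[which.max(scores)]
}

## Toy data: 40 samples, 6 genes, only gene 3 carries class signal.
set.seed(1)
y <- rep(0:1, each = 20)
x <- matrix(rnorm(40 * 6), nrow = 40, ncol = 6)
x[, 3] <- x[, 3] + 3 * y

first  <- greedy_step(x, y)                 # gene that starts the group
second <- greedy_step(x, y, group = first)  # best gene to add next
```

Repeating greedy_step until the log-likelihood stops improving would grow one group; the sketch leaves out any stopping rule, sign-flipping, and standardization.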
Usage
pelora(x, y, u = NULL, noc = 10, lambda = 1/32, flip = "pm",
standardize = TRUE, trace = 1)
Arguments
x |
Numeric matrix of explanatory variables ( |
y |
Numeric vector of length |
u |
Numeric matrix of additional (clinical) explanatory variables
( |
noc |
Integer, the number of clusters that should be searched for on the data. |
lambda |
Real, defaults to 1/32. Rescaled penalty parameter that
should be in |
flip |
Character string, describing a method how the |
standardize |
Logical, defaults to |
trace |
Integer >= 0; when positive, the output of the internal
loops is provided; |
Value
pelora returns an object of class "pelora". The functions print and summary give an overview of the variables (genes) that have been selected and the groups that have been formed. The function plot yields a two-dimensional projection into the space of the first two group centroids that pelora found. The generic function fitted returns the fitted values, which are the cluster representatives. coef returns the penalized logistic regression coefficients \theta_j for each of the predictors. Finally, predict classifies test data with Pelora's internal penalized logistic regression classifier on the basis of the (gene) groups that have been found.
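To make the relationship between cluster representatives, coefficients, and predictions concrete, here is a small base R sketch. Everything in it is hypothetical: the groups, the intercept theta0, and the coefficients theta are made-up values rather than output of a pelora fit, 0/1 class labels are assumed, and sign-flipping is ignored.

```r
## Hypothetical inputs: none of these values come from a real pelora fit.
set.seed(42)
x <- matrix(rnorm(5 * 6), nrow = 5, ncol = 6)   # 5 samples, 6 genes
groups <- list(c(1, 4), c(2, 3, 6))             # gene indices per group
theta0 <- 0.2                                   # assumed intercept
theta  <- c(1.5, -0.8)                          # assumed group coefficients

## Cluster representatives: one column per group, the mean over its genes.
centroids <- sapply(groups, function(g) rowMeans(x[, g, drop = FALSE]))

## The logistic link turns the linear predictor into class probabilities.
prob  <- plogis(theta0 + drop(centroids %*% theta))

## Assumed decision rule: predict class 1 when the probability exceeds 0.5.
label <- as.integer(prob > 0.5)
```

The two columns of centroids correspond to the kind of axes that plot uses for its two-dimensional projection.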
An object of class "pelora" is a list containing:
genes |
A list of length |
values |
A numerical matrix with dimension |
y |
Numeric vector of length |
steps |
Numerical vector of length |
lambda |
The rescaled penalty parameter. |
noc |
The number of clusters that has been searched for on the data. |
px |
The number of columns (genes) in the |
flip |
The method that has been chosen for sign-flipping the
|
var.type |
A factor with |
crit |
A list of length |
signs |
Numerical vector of length |
samp.names |
The names of the samples (rows) in the
|
gene.names |
The names of the variables (columns) in the
|
call |
The function call. |
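Because the returned object is a list, its components can be inspected directly. The following sketch assumes the supclust package is installed and skips itself otherwise; it relies only on the component names listed above.

```r
## Runs only when supclust is available; otherwise nothing is executed.
if (requireNamespace("supclust", quietly = TRUE)) {
  data(leukemia, package = "supclust")   # provides leukemia.x and leukemia.y
  fit <- supclust::pelora(leukemia.x, leukemia.y, noc = 3, trace = 0)

  names(fit)                                  # all components of the list
  str(fit[c("lambda", "noc", "px", "flip")])  # a few scalar settings
  fit$call                                    # the call that produced the fit
}
```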
Author(s)
Marcel Dettling, dettling@stat.math.ethz.ch
References
Marcel Dettling (2003). Finding Predictive Gene Groups from Microarray Data, see https://stat.ethz.ch/~dettling/supervised.html
Marcel Dettling and Peter Bühlmann (2002). Supervised Clustering of Genes. Genome Biology, 3(12): research0069.1-0069.15, doi: 10.1186/gb-2002-3-12-research0069.
Marcel Dettling and Peter Bühlmann (2004). Finding Predictive Gene Groups from Microarray Data. Journal of Multivariate Analysis 90, 106–131, doi: 10.1016/j.jmva.2004.02.012
See Also
wilma for another supervised clustering technique.
Examples
## Working with a "real" microarray dataset
data(leukemia, package="supclust")
## Generating random test data: 3 observations and 250 variables (genes)
set.seed(724)
xN <- matrix(rnorm(750), nrow = 3, ncol = 250)
## Fitting Pelora
fit <- pelora(leukemia.x, leukemia.y, noc = 3)
## Working with the output
fit
summary(fit)
plot(fit)
fitted(fit)
coef(fit)
## Class labels and class probabilities for the training data
predict(fit, type = "cla")
predict(fit, type = "prob")
## Predicting fitted values, class labels and class probabilities for the random test data
predict(fit, newdata = xN)
predict(fit, newdata = xN, type = "cla", noc = c(1,2,3))
predict(fit, newdata = xN, type = "pro", noc = c(1,3))
## Fitting Pelora such that the first 70 variables (genes) are not grouped
fit <- pelora(leukemia.x[, -(1:70)], leukemia.y, leukemia.x[,1:70])
## Working with the output
fit
summary(fit)
plot(fit)
fitted(fit)
coef(fit)
## Fitted values and class probabilities for the training data
predict(fit, type = "cla")
predict(fit, type = "prob")
## Predicting fitted values, class labels and class probabilities for the random test data
predict(fit, newdata = xN[, -(1:70)], newclin = xN[, 1:70])
predict(fit, newdata = xN[, -(1:70)], newclin = xN[, 1:70], type = "cla", noc = 1:10)
predict(fit, newdata = xN[, -(1:70)], newclin = xN[, 1:70], type = "pro")