FLLat.PVE {FLLat}    R Documentation

Choosing the Number of Features for the Fused Lasso Latent Feature Model

Description

Calculates the percentage of variation explained (PVE) for a range of values of J (the number of features) for the Fused Lasso Latent Feature (FLLat) model. Also plots the PVE against J, which can be used for choosing the value of J.

Usage

FLLat.PVE(Y, J.seq=seq(1,min(15,floor(ncol(Y)/2)),by=2), B=c("pc","rand"),
          lams=c("same","diff"), thresh=10^(-4), maxiter=100, maxiter.B=1,
          maxiter.T=1)

## S3 method for class 'PVE'
plot(x, xlab="Number of Features", ylab="PVE", ...)

Arguments

Y

A matrix of data from an aCGH experiment (usually in the form of log intensity ratios) or some other type of copy number data. Rows correspond to the probes and columns correspond to the samples.

J.seq

A vector of values of J (the number of features) for which to calculate the PVE. The default values are every second integer between 1 and the smaller of either 15 or the number of samples divided by 2 (rounded down).

B

The initial values for the features to use in the FLLat algorithm for each value of J. Can be one of "pc" (the first J principal components of Y) or "rand" (a random selection of J columns of Y). The default is "pc".

lams

The choice of whether to use the same values of the tuning parameters in the FLLat algorithm for each value of J ("same") or to calculate the optimal tuning parameters for each value of J ("diff"). When using the same values, the optimal tuning parameters are calculated once for the default value of J in the FLLat algorithm. The default is "same".

thresh

The threshold for determining when the solutions have converged in the FLLat algorithm. The default is 10^{-4}.

maxiter

The maximum number of iterations for the outer loop of the FLLat algorithm. The default is 100.

maxiter.B

The maximum number of iterations for the inner loop of the FLLat algorithm for estimating the features B. The default is 1. Increasing this may decrease the number of iterations for the outer loop but may still increase total run time.

maxiter.T

The maximum number of iterations for the inner loop of the FLLat algorithm for estimating the weights Θ. The default is 1. Increasing this may decrease the number of iterations for the outer loop but may still increase total run time.

x

An object of class PVE, as returned by FLLat.PVE.

xlab

The title for the x-axis of the PVE plot.

ylab

The title for the y-axis of the PVE plot.

...

Further graphical parameters.
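
As a quick check of the J.seq default described above, the expression from the Usage section can be evaluated in base R for a few sample counts. This is a minimal sketch: the helper name default.J.seq is hypothetical and is not a function in the FLLat package.

```r
# Hypothetical helper (not part of FLLat): reproduces the documented
# J.seq default for a data matrix with n.samples columns.
default.J.seq <- function(n.samples) {
  seq(1, min(15, floor(n.samples / 2)), by = 2)
}

default.J.seq(20)  # upper limit is min(15, 10) = 10
default.J.seq(40)  # upper limit is min(15, 20) = 15
```

With 20 samples the sequence stops below 10, while with 40 samples the cap of 15 applies.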

Details

This function calculates the PVE for each value of J as specified by J.seq. The PVE is defined to be:

PVE = 1 - \frac{RSS}{TSS}

where RSS and TSS denote the residual sum of squares and the total sum of squares, respectively. For each value of J, the PVE is calculated by fitting the FLLat model with that value of J.
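
The PVE formula can be illustrated in base R without fitting the FLLat model. The sketch below makes two assumptions that do not come from this page: the TSS is computed around each sample's (column's) mean, and a rank-J truncated SVD stands in for the FLLat fit. The data and the helper pve.rank.J are hypothetical; the point is only to show how RSS, TSS and the PVE relate for each J.

```r
# Simulated stand-in for aCGH data: 200 probes (rows) by 20 samples (columns).
set.seed(1)
Y <- matrix(rnorm(200 * 20), nrow = 200, ncol = 20)

# Assumption: TSS is measured around each sample's (column's) mean.
col.means <- colMeans(Y)
Y.centered <- sweep(Y, 2, col.means)
TSS <- sum(Y.centered^2)

# Stand-in fit: rank-J truncated SVD of the centered data (NOT the FLLat model).
sv <- svd(Y.centered)
pve.rank.J <- function(J) {
  fitted <- sv$u[, 1:J, drop = FALSE] %*%
    diag(sv$d[1:J], nrow = J) %*%
    t(sv$v[, 1:J, drop = FALSE])
  RSS <- sum((Y.centered - fitted)^2)
  1 - RSS / TSS
}

# One PVE per candidate number of features.
J.seq <- seq(1, 9, by = 2)
PVEs <- sapply(J.seq, pve.rank.J)
```

Because each additional SVD component can only reduce the RSS, the PVEs from this stand-in increase with J, which mirrors the shape of curve that FLLat.PVE is meant to produce.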

There are two choices for how the tuning parameters are chosen when fitting the FLLat model for each value of J. The first choice, given by lams="same", applies the FLLat.BIC function just once for the default value of J. The resulting optimal tuning parameters are then used for all values of J in J.seq. The second choice, given by lams="diff", applies the FLLat.BIC function for each value of J in J.seq. Although this second choice will give a more accurate measure of the PVE, it will take much longer to run than the first choice.

When the PVE is plotted against J, as J increases the PVE will begin to plateau after a certain point, indicating that additional features are not improving the model. Therefore, the value of J to use in the FLLat algorithm can be chosen as the point at which the PVE plot begins to plateau.
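
The plateau is normally judged by eye from the plot. As a rough numerical aid, one possible heuristic is to pick the first J after which the gain in PVE drops below a small tolerance. The helper choose.J, its tol argument and the PVE values below are all hypothetical and are not part of the FLLat package; the tolerance of 0.02 is arbitrary.

```r
# Hypothetical helper (not part of FLLat): return the first J in J.seq
# after which the gain in PVE from moving to the next J falls below tol.
# If no gain is that small, return the largest J considered.
choose.J <- function(J.seq, PVEs, tol = 0.02) {
  gains <- diff(PVEs)
  flat <- which(gains < tol)
  if (length(flat) == 0) return(J.seq[length(J.seq)])
  J.seq[flat[1]]
}

# Made-up PVE values, for illustration only.
J.seq <- 1:6
PVEs <- c(0.30, 0.60, 0.80, 0.85, 0.86, 0.865)
choose.J(J.seq, PVEs)
```

With a fitted object, the same helper could be called as choose.J(result.pve$J.seq, result.pve$PVEs), using the components listed under Value. The result is only a starting point and should be checked against the plot.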

For more details, please see Nowak and others (2011) and the package vignette.

Value

An object of class PVE with components:

PVEs

The PVE for each value of J in J.seq.

J.seq

The sequence of J values used.

There is a plot method for PVE objects.

Author(s)

Gen Nowak <gen.nowak@gmail.com>, Trevor Hastie, Jonathan R. Pollack, Robert Tibshirani and Nicholas Johnson.

References

G. Nowak, T. Hastie, J. R. Pollack and R. Tibshirani. A Fused Lasso Latent Feature Model for Analyzing Multi-Sample aCGH Data. Biostatistics, 2011, doi: 10.1093/biostatistics/kxr012

See Also

FLLat, FLLat.BIC

Examples

## Load simulated aCGH data.
data(simaCGH)

## Generate PVEs for J ranging from 1 to the number of samples divided by 2.
result.pve <- FLLat.PVE(simaCGH,J.seq=1:(ncol(simaCGH)/2))

## Generate PVE plot.
plot(result.pve)

[Package FLLat version 1.2-1 Index]