tandem {TANDEM}    R Documentation

Fits a TANDEM model by performing a two-stage regression

Description

Fits a TANDEM model by performing a two-stage regression. In the first stage, the response y is regressed on all upstream features (x[,upstream]). In the second stage, the residuals of the first stage are regressed on the downstream features (x[,!upstream]). Both stages use Elastic Net regression, as implemented in cv.glmnet() from the glmnet package.
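
The two stages described above can be sketched directly with cv.glmnet(). This is a minimal illustration under stated assumptions, not the package's internal code: the simulated data, the object names (fit_up, resid_up, fit_down), the choice alpha = 0.5, and the reuse of one fold assignment in both stages are all choices made for this example.

```r
library(glmnet)

# Simulated data (assumption: 20 features, the first 10 are upstream)
set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
upstream <- rep(c(TRUE, FALSE), each = p / 2)
y <- x[, 1] + 0.5 * x[, 11] + rnorm(n)

# One fold assignment, reused in both stages (an assumption of this sketch)
foldid <- sample(rep(1:10, length.out = n))

# Stage 1: regress the response on the upstream features
fit_up <- cv.glmnet(x[, upstream], y, alpha = 0.5, foldid = foldid)
pred_up <- as.numeric(predict(fit_up, newx = x[, upstream], s = "lambda.1se"))
resid_up <- y - pred_up

# Stage 2: regress the stage-1 residuals on the downstream features
fit_down <- cv.glmnet(x[, !upstream], resid_up, alpha = 0.5, foldid = foldid)
pred_down <- as.numeric(predict(fit_down, newx = x[, !upstream], s = "lambda.1se"))

# The combined prediction is the sum of the two stage predictions
y_hat <- pred_up + pred_down
```

Because the downstream features only see what the upstream features left unexplained, any signal shared between the two groups is attributed to the upstream features first.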

Usage

tandem(
  x,
  y,
  upstream,
  family = "gaussian",
  nfolds = 10,
  foldid = NULL,
  lambda_upstream = "lambda.1se",
  lambda_downstream = "lambda.1se",
  ...
)

Arguments

x

A feature matrix, where the rows correspond to samples and the columns to features.

y

A vector containing the response.

upstream

A logical vector, with one element per column of x, that indicates whether each feature is upstream (TRUE) or downstream (FALSE).

family

The family parameter that's passed to cv.glmnet(). Currently, only family='gaussian' is supported.

nfolds

Number of cross-validation folds (default is 10) used to determine the optimal lambda in cv.glmnet().

foldid

An optional vector assigning each sample to a cross-validation fold. Overrides nfolds when supplied.

lambda_upstream

Which lambda from cv.glmnet() to use in the first stage (upstream features): "lambda.min" or "lambda.1se". Default is "lambda.1se".

lambda_downstream

Which lambda from cv.glmnet() to use in the second stage (downstream features): "lambda.min" or "lambda.1se". Default is "lambda.1se".

...

Other parameters that are passed to cv.glmnet().
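
As an illustration of the foldid argument, a balanced fold assignment can be built in base R. This is a hedged sketch: the sample count, the number of folds, and the seed are assumptions chosen for the example, and the commented-out call only shows where such a vector would be passed.

```r
# Assumed sizes for illustration only
set.seed(42)
n_samples <- 25
nfolds <- 5

# Repeat the fold labels 1..nfolds to cover all samples, then shuffle
foldid <- sample(rep(seq_len(nfolds), length.out = n_samples))

# Each fold should receive the same number of samples here
table(foldid)

# The vector would then be supplied as, for example:
# fit <- tandem(x, y, upstream, foldid = foldid)
```

Fixing foldid makes the cross-validation split reproducible across repeated fits, which is useful when comparing lambda_upstream and lambda_downstream settings.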

Value

A tandem-object.

Examples

library(TANDEM)

# unpack the example data shipped with the package
x <- example_data$x
y <- example_data$y
upstream <- example_data$upstream

# fit a tandem model; alpha = 0.5 is passed on to cv.glmnet() via ...
fit <- tandem(x, y, upstream, alpha = 0.5)

# extract the coefficients and predict on the training data
beta <- coef(fit)
y_hat <- predict(fit, newx = x)

[Package TANDEM version 1.0.3 Index]