PreProcess {BVSNLP} | R Documentation |
Preprocessing the design matrix to prepare it for the variable selection procedure
Description
This function preprocesses the design matrix by removing columns that
contain NA's or are all zero. It also standardizes non-binary columns to
have mean zero and variance one. The user can choose to log transform
continuous covariates before scaling them.
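The steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the BVSNLP implementation: the name preprocess_sketch is hypothetical, and a column is assumed to be binary when it takes only the values 0 and 1.

```r
# Hypothetical helper: NOT the BVSNLP implementation, only a base-R sketch
# of the preprocessing steps described in this section.
preprocess_sketch <- function(X, logT = FALSE) {
  # Drop columns that contain any NA or are entirely zero.
  has_na   <- colSums(is.na(X)) > 0
  all_zero <- colSums(X != 0, na.rm = TRUE) == 0
  X <- X[, !(has_na | all_zero), drop = FALSE]

  # Assumption: a column is binary when it takes only the values 0 and 1.
  is_bin <- apply(X, 2, function(col) all(col %in% c(0, 1)))

  # Optionally log transform, then standardize the non-binary columns.
  Xc <- X[, !is_bin, drop = FALSE]
  if (logT) Xc <- log(Xc)  # requires strictly positive entries
  Xc <- scale(Xc)          # mean zero, variance one per column
  attr(Xc, "scaled:center") <- NULL
  attr(Xc, "scaled:scale")  <- NULL

  # Binary columns are moved to the end of the matrix.
  out <- cbind(Xc, X[, is_bin, drop = FALSE])
  list(X = out, gnames = colnames(out))
}

# Small usage example with one all-zero, one binary and one NA column.
set.seed(1)
X <- cbind(a = rnorm(10, 5, 2), b = rep(0, 10), c = rep(c(0, 1), 5),
           d = rnorm(10), e = c(NA, rnorm(9)))
res <- preprocess_sketch(X)
res$gnames
```

In this sketch a constant non-binary column would produce NaN values after scaling, so such columns would need separate handling.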
Usage
PreProcess(X, logT = FALSE)
Arguments
X |
The design matrix to be preprocessed. Its column names are used as gene names. |
logT |
A boolean variable determining whether continuous columns are log transformed before they are scaled. If TRUE, those columns must not contain any zeros or negative values. |
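Before setting logT = TRUE it may help to check which continuous columns would break the log transform. The helper below is a hypothetical base-R sketch, not a BVSNLP function; it assumes that X has column names and that binary columns are those containing only 0 and 1.

```r
# Hypothetical helper, not part of BVSNLP: names of non-binary columns that
# contain zeros or negative values and so cannot be log transformed.
non_positive_columns <- function(X) {
  is_bin <- apply(X, 2, function(col) all(col %in% c(0, 1)))
  bad    <- apply(X, 2, function(col) any(col <= 0, na.rm = TRUE))
  colnames(X)[bad & !is_bin]
}

X <- cbind(g1 = c(1.2, 3.4, 0.5),
           g2 = c(2.0, -1.0, 4.0),
           g3 = c(0, 1, 1))
bad_cols <- non_positive_columns(X)
bad_cols  # "g2": g3 is binary, so it is not flagged
```

An empty result suggests that the continuous columns are strictly positive and that the log transform is well defined for them.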
Value
It returns a list with the following components:
X |
The filtered design matrix, which can be used in the variable selection procedure. Binary columns are moved to the end of the design matrix. |
gnames |
Gene names read from the column names of the filtered design matrix. |
Author(s)
Amir Nikooienejad
Examples
### Constructing a synthetic design matrix for the purpose of preprocessing
### imposing columns with different scales
n <- 40
p1 <- 50
p2 <- 150
p <- p1 + p2
X1 <- matrix(rnorm(n*p1, 1, 2), ncol = p1)
X2 <- matrix(rnorm(n*p2), ncol = p2)
X <- cbind(X1, X2)
### putting NA elements in the matrix
X[3,85] <- NA
X[25,85] <- NA
X[35,43] <- NA
X[15,128] <- NA
colnames(X) <- paste("gene_",c(1:p),sep="")
### Running the function. Note the intercept column that is added as the
### first column in the "logistic" family
Xout <- PreProcess(X)
dim(Xout$X)[2] == (p + 1) ## 1 is added because intercept column is included
## This is FALSE because of the removal of columns with NA elements
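To see why the comparison above is FALSE, the columns that contain NA elements can be listed with base R alone. The sketch below rebuilds a matrix with the same dimensions and NA positions as the example; the seed and the simplified rnorm call are assumptions made for illustration, and BVSNLP is not needed to run it.

```r
set.seed(123)  # arbitrary seed, used only to make the sketch reproducible
n <- 40
p <- 200
X <- matrix(rnorm(n * p), ncol = p)

# Same NA positions as in the example above.
X[3, 85]   <- NA
X[25, 85]  <- NA
X[35, 43]  <- NA
X[15, 128] <- NA
colnames(X) <- paste0("gene_", seq_len(p))

# Columns with at least one NA are the ones the preprocessing step drops.
na_cols <- colnames(X)[colSums(is.na(X)) > 0]
na_cols          # "gene_43" "gene_85" "gene_128"
length(na_cols)  # 3
```

Three columns contain NA elements, so the filtered matrix has fewer columns than the original and cannot have p + 1 of them.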