PreProcess {BVSNLP} | R Documentation |
This function preprocesses the design matrix by removing
columns that contain NA's or are all zero. It also standardizes
non-binary columns to have mean zero and variance one. The user has the
choice of log transforming continuous covariates before scaling them.
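The steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the BVSNLP implementation: the helper name `preprocess_sketch` is hypothetical, and a column is assumed to be binary when it contains only the values 0 and 1.

```r
# Hypothetical helper, NOT the package's code: mirrors the documented steps only.
preprocess_sketch <- function(X, logT = FALSE) {
  # Drop columns that contain any NA or are entirely zero
  has_na   <- colSums(is.na(X)) > 0
  all_zero <- colSums(X != 0, na.rm = TRUE) == 0
  X <- X[, !(has_na | all_zero), drop = FALSE]

  # Assumption: a column is binary if it takes only the values 0 and 1
  is_bin <- apply(X, 2, function(col) all(col %in% c(0, 1)))

  # Optionally log-transform, then standardize the non-binary columns
  Xc <- X[, !is_bin, drop = FALSE]
  if (logT) Xc <- log(Xc)  # requires strictly positive entries
  Xc <- scale(Xc)          # mean zero, variance one per column

  # Binary columns are moved to the end of the matrix
  Xf <- cbind(Xc, X[, is_bin, drop = FALSE])
  list(X = Xf, gnames = colnames(Xf))
}
```

The sketch assumes at least one non-binary column with non-zero variance remains after filtering; a constant non-binary column would produce NaN values when scaled.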
PreProcess(X, logT = FALSE)
X |
The |
logT |
A logical value indicating whether continuous columns should be log-transformed before they are scaled. Those columns must not contain zeros or negative values. |
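Because the log transform requires strictly positive entries, it can help to check the continuous columns before setting logT = TRUE. The following is a small sketch; the helper name `can_log_transform` is hypothetical and, as an assumption, treats any column made up only of 0 and 1 as binary and therefore exempt from the check.

```r
# Hypothetical helper: TRUE if every non-binary column is strictly positive.
can_log_transform <- function(X) {
  # Assumption: binary columns contain only 0 and 1 (NAs ignored here)
  is_bin <- apply(X, 2, function(col) all(col[!is.na(col)] %in% c(0, 1)))
  cont <- X[, !is_bin, drop = FALSE]
  all(cont > 0, na.rm = TRUE)
}
```

If the check returns FALSE, leaving logT at its default of FALSE avoids producing NaN or -Inf values in the transformed columns.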
It returns a list having the following objects:
X |
The filtered design matrix, which can be used in a variable selection procedure. Binary columns are moved to the end of the design matrix. |
gnames |
Gene names read from the column names of the filtered design matrix. |
Amir Nikooienejad
### Constructing a synthetic design matrix for the purpose of preprocessing
### imposing columns with different scales
n <- 40
p1 <- 50
p2 <- 150
p <- p1 + p2
X1 <- matrix(rnorm(n*p1, 1, 2), ncol = p1)
X2 <- matrix(rnorm(n*p2), ncol = p2)
X <- cbind(X1, X2)

### putting NA elements in the matrix
X[3,85] <- NA
X[25,85] <- NA
X[35,43] <- NA
X[15,128] <- NA
colnames(X) <- paste("gene_",c(1:p),sep="")

### Running the function. Note the intercept column that is added as the
### first column in the "logistic" family
Xout <- PreProcess(X)
dim(Xout$X)[2] == (p + 1) ## 1 is added because intercept column is included
## This is FALSE because of the removal of columns with NA elements