PreProcess {BVSNLP} | R Documentation |

## Preprocessing the design matrix, preparing it for variable selection procedure

### Description

This function preprocesses the design matrix by removing
columns that contain `NA`'s or are all zero. It also standardizes
non-binary columns to have mean zero and variance one. The user has the
choice of log transforming continuous covariates before scaling them.
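The steps described above can be approximated in base R. The following is only an illustrative sketch, not the package's implementation: `sketch_preprocess` is a hypothetical helper, and it assumes that "binary" means a column taking values only in {0, 1}. It does not add an intercept column or return gene names.

```r
# Illustrative sketch only; sketch_preprocess() is a hypothetical helper,
# not the BVSNLP implementation.
sketch_preprocess <- function(X, logT = FALSE) {
  # Drop columns that contain any NA or are entirely zero
  has_na   <- colSums(is.na(X)) > 0
  all_zero <- colSums(X != 0, na.rm = TRUE) == 0
  X <- X[, !(has_na | all_zero), drop = FALSE]

  # Assumption: a column is binary if it takes values only in {0, 1}
  is_binary <- apply(X, 2, function(col) all(col %in% c(0, 1)))

  cont <- X[, !is_binary, drop = FALSE]
  if (logT) cont <- log(cont)  # only valid for strictly positive values
  cont <- scale(cont)          # center to mean zero, scale to variance one

  # Continuous columns first, binary columns moved to the end
  cbind(cont, X[, is_binary, drop = FALSE])
}

set.seed(1)
X <- cbind(b = c(0, 1, 0, 1, 1, 0),       # binary column
           a = rnorm(6, 5, 2),            # continuous column
           c = 0,                         # all-zero column (dropped)
           d = c(1, NA, 3, 4, 5, 6))      # column with an NA (dropped)
Z <- sketch_preprocess(X)
colnames(Z)
```

In this sketch a constant non-binary column would produce `NaN` after scaling, so such columns would need separate handling.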

### Usage

```
PreProcess(X, logT = FALSE)
```

### Arguments

`X` |
The design matrix to be preprocessed. |

`logT` |
A boolean variable determining if log transform should be done on continuous columns before scaling them. Note that those columns should not contain any zeros or negative values. |
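Because the log transform is undefined for zeros and negative values, it may help to check the continuous columns before calling the function with `logT = TRUE`. The sketch below uses a hypothetical helper, `log_unsafe_columns`, which is not part of BVSNLP, and again assumes that binary columns are those with values only in {0, 1}.

```r
# Hypothetical helper (not part of BVSNLP): returns the indices of
# non-binary columns that contain zeros or negative values.
log_unsafe_columns <- function(X) {
  is_binary   <- apply(X, 2, function(col) all(col %in% c(0, 1)))
  nonpositive <- apply(X, 2, function(col) any(col <= 0, na.rm = TRUE))
  which(!is_binary & nonpositive)
}

X <- cbind(g1 = c(1.2, 3.4, 0.5),   # strictly positive: safe to log
           g2 = c(2.0, -1.0, 4.0),  # contains a negative value
           g3 = c(0, 1, 1))         # binary: not log-transformed
bad <- log_unsafe_columns(X)
bad
```

If the result is non-empty, those columns could be shifted, removed, or left untransformed by keeping `logT = FALSE`.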

### Value

It returns a list having the following objects:

`X` |
The filtered design matrix which can be used in variable selection procedure. Binary columns are moved to the end of the design matrix. |

`gnames` |
Gene names read from the column names of the filtered design matrix. |

### Author(s)

Amir Nikooienejad

### Examples

```
### Constructing a synthetic design matrix for the purpose of preprocessing
### imposing columns with different scales
n <- 40
p1 <- 50
p2 <- 150
p <- p1 + p2
X1 <- matrix(rnorm(n*p1, 1, 2), ncol = p1)
X2 <- matrix(rnorm(n*p2), ncol = p2)
X <- cbind(X1, X2)
### putting NA elements in the matrix
X[3,85] <- NA
X[25,85] <- NA
X[35,43] <- NA
X[15,128] <- NA
colnames(X) <- paste("gene_",c(1:p),sep="")
### Running the function. Note the intercept column that is added as the
### first column in the "logistic" family
Xout <- PreProcess(X)
dim(Xout$X)[2] == (p + 1) ## 1 is added because intercept column is included
## This is FALSE because of the removal of columns with NA elements
```

[Package *BVSNLP* version 1.1.9]