standardizeData {DAP}    R Documentation

Divides the features matrix into two standardized submatrices

Description

Given a matrix X with corresponding class labels in Y, the function column-centers X (when center = TRUE), splits it into two submatrices, one per class, and scales the columns of each submatrix to have Euclidean norm equal to one.
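The steps described above can be sketched in base R as follows. This is a minimal illustration only and not the DAP source: the function name standardize_sketch and the list elements norm1 and norm2 are hypothetical, and the way the package itself defines its back-scaling coefficients or handles center = FALSE may differ.

```r
# Hypothetical re-implementation for illustration; not the DAP source code.
standardize_sketch <- function(X, Y, center = TRUE) {
  # Column means of the full matrix, computed before any centering
  Xmean <- colMeans(X)
  if (center) {
    # Subtract each column's mean from that column
    X <- sweep(X, 2, Xmean, "-")
  }
  # Split rows by class label; drop = FALSE keeps matrix shape for one-row groups
  X1 <- X[Y == 1, , drop = FALSE]
  X2 <- X[Y == 2, , drop = FALSE]
  # Euclidean norm of every column, computed separately within each class
  # (assumed non-zero here; a constant-zero column would divide by zero)
  norm1 <- sqrt(colSums(X1^2))
  norm2 <- sqrt(colSums(X2^2))
  list(
    X1 = sweep(X1, 2, norm1, "/"),
    X2 = sweep(X2, 2, norm2, "/"),
    norm1 = norm1,
    norm2 = norm2,
    Xmean = Xmean
  )
}

# Small usage example on simulated data
set.seed(1)
X <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)
Y <- rep(c(1, 2), each = 10)
s <- standardize_sketch(X, Y)
col_norms1 <- sqrt(colSums(s$X1^2))
col_norms2 <- sqrt(colSums(s$X2^2))
```

After scaling, every column of each submatrix should have unit Euclidean norm, so col_norms1 and col_norms2 are expected to equal one up to floating-point error.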

Usage

standardizeData(X, Y, center = TRUE)

Arguments

X

An n x p training dataset, with n observations in the rows and p features in the columns.

Y

A vector of length n containing the training group labels, each either 1 or 2.

center

A logical indicating whether X should be centered. The default is TRUE.

Value

A list with the following components:

X1

An n1 x p standardized matrix containing the observations from group 1, where n1 is the number of observations in group 1.

X2

An n2 x p standardized matrix containing the observations from group 2, where n2 is the number of observations in group 2.

coef1

Back-scaling coefficients for X1.

coef2

Back-scaling coefficients for X2.

Xmean

Column means of the matrix X before centering.

Examples

# An example for the function standardizeData

## Generate data
n_train = 50
n_test = 50
p = 100
mu1 = rep(0, p)
mu2 = rep(3, p)
Sigma1 = diag(p)
Sigma2 = 0.5 * diag(p)

## Build training data
x1 = MASS::mvrnorm(n = n_train, mu = mu1, Sigma = Sigma1)
x2 = MASS::mvrnorm(n = n_train, mu = mu2, Sigma = Sigma2)
xtrain = rbind(x1, x2)
ytrain = c(rep(1, n_train), rep(2, n_train))

## Standardize data
out_s = standardizeData(xtrain, ytrain, center = FALSE)


[Package DAP version 1.0 Index]