LDA.boost {GUEST}    R Documentation

Implementation of the linear discriminant function for multi-label classification.

Description

This function applies the linear discriminant function to classify multi-label responses. The precision matrix (the inverse of the covariance matrix) in the linear discriminant function can be estimated by the component w returned by the function boost.graph. In addition, error-prone covariates in the linear discriminant function are corrected by regression calibration.

Usage

LDA.boost(data, resp, theta, sigma_e = 0.6, q = 0.8, lambda = 1, pi = 0.5)

Arguments

data

An n (observations) times p (variables) matrix of random variables, whose distributions can be continuous, discrete, or mixed.

resp

An n-dimensional vector of categorical random variables, which is the response in the data.

theta

An estimate of the precision matrix, for example the component w returned by boost.graph.

sigma_e

The common value in the diagonal covariance matrix of the error for the classical measurement error model when data are continuous. The default value is 0.6.

q

The common value used to characterize misclassification for binary random variables. The default value is 0.8.

lambda

The parameter of the Poisson distribution, which is used to characterize error-prone count random variables. The default value is 1.

pi

The probability in the Binomial distribution, which is used to characterize error-prone count random variables. The default value is 0.5.
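As a rough illustration of what sigma_e controls, the sketch below simulates a classical additive measurement error model for a single continuous covariate and applies the textbook regression calibration correction. It is not the internal code of LDA.boost: the variable names are hypothetical, and it assumes that sigma_e is the error variance and that the error is normal and independent of the true covariate.

```r
# Hypothetical illustration; not the internal code of LDA.boost.
set.seed(1)
n <- 500
sigma_e <- 0.6                                   # assumed error variance (the default above)

x_true <- rnorm(n, mean = 2, sd = 1.5)           # unobserved true covariate
x_obs  <- x_true + rnorm(n, sd = sqrt(sigma_e))  # observed surrogate: X* = X + e

# Regression calibration: replace X* by an estimate of E[X | X*].
mu_hat      <- mean(x_obs)
var_obs     <- var(x_obs)
reliability <- (var_obs - sigma_e) / var_obs     # estimate of var(X) / var(X*)
x_cal       <- mu_hat + reliability * (x_obs - mu_hat)

# The calibrated values keep the mean of x_obs but are shrunk towards it.
c(var_observed = var_obs, var_calibrated = var(x_cal))
```

The correction shrinks each observed value towards the sample mean by the estimated reliability ratio, so a larger sigma_e leads to stronger shrinkage.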

Details

The linear discriminant function used is as follows:

\texttt{score}_{i,j} = \log(\pi_i) - 0.5\, \mu_i^\top\, \texttt{theta}\, \mu_i + \texttt{data}_j^\top\, \texttt{theta}\, \mu_i,

for class i = 1, \ldots, I and subject j = 1, \ldots, n, where I is the number of classes in the dataset, \pi_i is the proportion of subjects in class i, \texttt{data}_j is the vector of covariates for subject j, \texttt{theta} is the precision matrix of the covariates, and \mu_i is the empirical mean vector of the covariates in class i.
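The discriminant function above can be sketched in base R as follows. The helper name lda_score and the simulated data are hypothetical and only illustrate the formula; the sketch omits the measurement error correction that LDA.boost applies, and it returns the scores as an n by I matrix (one row per subject, one column per class).

```r
# Hypothetical helper; lda_score is not part of the GUEST package.
lda_score <- function(data, resp, theta) {
  classes <- sort(unique(resp))
  score <- sapply(classes, function(cl) {
    idx   <- resp == cl
    prior <- mean(idx)                            # pi_i: proportion of subjects in class i
    mu    <- colMeans(data[idx, , drop = FALSE])  # mu_i: empirical class mean
    quad  <- as.numeric(t(mu) %*% theta %*% mu)   # mu_i' theta mu_i
    log(prior) - 0.5 * quad + as.numeric(data %*% theta %*% mu)
  })
  colnames(score) <- classes
  # Assign each subject to the class with the largest score.
  list(score = score,
       class = classes[max.col(score, ties.method = "first")])
}

# Toy data: two well-separated classes with identity covariance.
set.seed(42)
X <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 6), ncol = 2))
Y <- rep(c(1, 2), each = 20)
theta <- diag(2)          # precision matrix used to simulate the toy covariates

out <- lda_score(X, Y, theta)
table(predicted = out$class, observed = Y)
```

Each subject is assigned to the class whose score is largest, which is how the class component described under Value can be read.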

Value

score

The value of the linear discriminant function (see Details), evaluated with the supplied estimate of the precision matrix.

class

The predicted class for each subject.

Author(s)

Hui-Shan Tsao and Li-Pang Chen
Maintainer: Hui-Shan Tsao n410412@gmail.com

References

Hui-Shan Tsao (2024). Estimation of Ultrahigh-Dimensional Graphical Models and Its Application to Discriminant Analysis. Master's thesis supervised by Li-Pang Chen, National Chengchi University.

Examples

data(MedulloblastomaData)

X <- t(MedulloblastomaData[2:655,]) #covariates
Y <- MedulloblastomaData[1,] #response

X <- matrix(as.numeric(X), nrow = 23) # coerce to a numeric matrix with one row per observation

p <- ncol(X)
n <- nrow(X)

# standardization: center each column and scale it to unit standard deviation
X_new <- matrix(0, nrow = n, ncol = p)
for (i in 1:p) {
  X_new[, i] <- (X[, i] - mean(X[, i])) / sd(X[, i])
}


#estimate graphical model
result <- boost.graph(data = X_new, thre = 0.2, ite1 = 3, ite2 = 0, ite3 = 0, rep = 1)
theta.hat <- result$w

theta.hat[which(theta.hat<0.8)]=0 #keep the highly dependent pairs

#predict
pre <- LDA.boost(data = X_new, resp = Y, theta = theta.hat)
estimated_Y <- pre$class

[Package GUEST version 0.2.0 Index]