LDA.boost {GUEST} | R Documentation |
Implementation of the linear discriminant function for multi-label classification.
Description
This function applies the linear discriminant function to classify multi-label responses. The precision matrix (the inverse of the covariance matrix) used in the linear discriminant function can be estimated by the component w returned by the function boost.graph. In addition, error-prone covariates in the linear discriminant function are handled by regression calibration.
Usage
LDA.boost(data, resp, theta, sigma_e = 0.6, q = 0.8, lambda = 1, pi = 0.5)
Arguments
data |
An n (observations) times p (variables) matrix of random variables, whose distributions can be continuous, discrete, or mixed. |
resp |
An n-dimensional vector of categorical random variables, which is the response in the data. |
theta |
The estimator of the precision matrix. |
sigma_e |
The common value in the diagonal covariance matrix of the error for the classical measurement error model when |
q |
The common value used to characterize misclassification for binary random variables. The default value is 0.8. |
lambda |
The parameter of the Poisson distribution, which is used to characterize error-prone count random variables. The default value is 1. |
pi |
The probability in the Binomial distribution, which is used to characterize error-prone count random variables. The default value is 0.5. |
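The sigma_e argument refers to the classical measurement error model, and the Description mentions regression calibration. As a rough illustration only, the sketch below shows the textbook regression calibration step for continuous covariates, assuming W = X + e with Cov(e) = sigma_e * I (that is, treating sigma_e as the common diagonal variance). The helper name calibrate_sketch and the toy data are hypothetical and not part of GUEST; the package's own correction, in particular for binary and count covariates, may differ.

```r
# Hypothetical helper (not part of GUEST): textbook regression calibration
# for continuous covariates under W = X + e, assuming Cov(e) = sigma_e * I.
calibrate_sketch <- function(w, sigma_e = 0.6) {
  p <- ncol(w)
  mu_w <- colMeans(w)                    # mean of the observed covariates
  sigma_w <- cov(w)                      # covariance of the observed covariates
  sigma_x <- sigma_w - diag(sigma_e, p)  # implied covariance of the true X
  centred <- sweep(w, 2, mu_w)           # W - mu_W, row by row
  # E[X | W] = mu_W + Sigma_X Sigma_W^{-1} (W - mu_W), written for row vectors
  sweep(centred %*% solve(sigma_w) %*% sigma_x, 2, mu_w, "+")
}

# Toy data (assumed for illustration): true covariates plus additive noise
set.seed(2)
x_true <- matrix(rnorm(400, sd = 2), ncol = 2)
w_obs <- x_true + matrix(rnorm(400, sd = sqrt(0.6)), ncol = 2)
x_hat <- calibrate_sketch(w_obs, sigma_e = 0.6)
```

The calibrated values keep the column means of the observed data while shrinking each observation toward those means, which is the usual effect of this correction.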
Details
The linear discriminant function used is as follows:
\code{score}_{i,j} = \log(\pi_i) - 0.5\, \mu_i^\top\, \code{theta}\, \mu_i + \code{data}_j^\top\, \code{theta}\, \mu_i,
for class i = 1, \cdots, I and subject j = 1, \cdots, n, where I is the number of classes in the dataset, \pi_i is the proportion of subjects in class i, \code{data}_j is the vector of covariates for subject j, \code{theta} is the precision matrix of the covariates, and \mu_i is the empirical mean vector of the random variables in class i.
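The formula above can be sketched directly in R. This is a minimal illustration, not the package's implementation: the helper name lda_score_sketch is hypothetical, the toy data are simulated, the precision matrix is a plain inverse sample covariance standing in for a boost.graph estimate, and assigning each subject to the class with the largest score is the usual LDA rule assumed here. No measurement error correction is applied.

```r
# Hypothetical helper (not part of GUEST): evaluates the discriminant score
# for every subject j (rows) and class i (columns).
lda_score_sketch <- function(data, resp, theta) {
  classes <- sort(unique(resp))
  score <- sapply(classes, function(cl) {
    in_class <- resp == cl
    pi_i <- mean(in_class)                            # class proportion
    mu_i <- colMeans(data[in_class, , drop = FALSE])  # empirical class mean
    quad <- drop(t(mu_i) %*% theta %*% mu_i)          # mu_i' theta mu_i
    lin <- drop(data %*% theta %*% mu_i)              # data_j' theta mu_i
    log(pi_i) - 0.5 * quad + lin
  })
  colnames(score) <- classes
  # Assumed decision rule: pick the class with the largest score
  list(score = score, class = classes[max.col(score, ties.method = "first")])
}

# Toy data (assumed for illustration): two well-separated classes
set.seed(1)
toy_x <- rbind(matrix(rnorm(40, mean = -2), ncol = 2),
               matrix(rnorm(40, mean = 2), ncol = 2))
toy_y <- rep(c(1, 2), each = 20)
toy_theta <- solve(cov(toy_x))  # stand-in for an estimated precision matrix
out <- lda_score_sketch(toy_x, toy_y, toy_theta)
table(predicted = out$class, observed = toy_y)
```

The returned list mirrors the score and class components described under Value, so the sketch can be compared side by side with the output of LDA.boost on the same inputs.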
Value
score |
The values of the linear discriminant function (see Details), evaluated with the supplied estimator of the precision matrix. |
class |
The predicted class for each subject. |
Author(s)
Hui-Shan Tsao and Li-Pang Chen
Maintainer: Hui-Shan Tsao n410412@gmail.com
References
Hui-Shan Tsao (2024). Estimation of Ultrahigh-Dimensional Graphical Models and Its Application to Discriminant Analysis. Master's thesis supervised by Li-Pang Chen, National Chengchi University.
Examples
data(MedulloblastomaData)
X <- t(MedulloblastomaData[2:655,]) #covariates
Y <- MedulloblastomaData[1,] #response
X <- matrix(as.numeric(X),nrow=23)
p <- ncol(X)
n <- nrow(X)
#standardization: center each column and scale it to unit standard deviation
X_new <- matrix(as.numeric(scale(X)), nrow = n)
#estimate graphical model
result <- boost.graph(data = X_new, thre = 0.2, ite1 = 3, ite2 = 0, ite3 = 0, rep = 1)
theta.hat <- result$w
theta.hat[which(theta.hat<0.8)]=0 #keep the highly dependent pairs
#predict
pre <- LDA.boost(data = X_new, resp = Y, theta = theta.hat)
estimated_Y <- pre$class