classifier.fit {PEkit}    R Documentation

Fit the supervised classifier under partition exchangeability

Description

Fits the model to training data x, which is assumed to follow the Poisson-Dirichlet distribution, and the corresponding discrete class labels y.

Usage

classifier.fit(x, y)

Arguments

x

data vector, or matrix with rows as data points and columns as features.

y

vector of training data labels, with length equal to the number of rows in x (or to the length of x if x is a vector).

Details

This function learns the model parameters from the training data and gathers them into an object that is used by the classification algorithms tMarLab() and tSimLab(). The parameters it learns are the maximum likelihood estimates of \psi for each feature within each class of the training data. It also records the frequencies of the data values for each feature within each class. Both are used to calculate the predictive probability of each test data point belonging to each class.
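
As a rough illustration of the quantities described above, the following base R sketch tabulates the value frequencies within each class and estimates \psi for a single feature. It is not the PEkit implementation: psi_mle_sketch() and fit_sketch() are hypothetical names, the use of table() for the frequencies is an assumption, and the estimator assumes the standard result for the Ewens sampling formula, namely that the MLE solves sum_{i=1}^{n} \psi / (\psi + i - 1) = k for a sample of size n with k distinct values.

```r
# Hypothetical helper (not part of PEkit): MLE of psi for one sample,
# assuming the Ewens sampling formula. Solves
#   sum_{i = 1}^{n} psi / (psi + i - 1) = k
# where n is the sample size and k the number of distinct values.
psi_mle_sketch <- function(values) {
  n <- length(values)
  k <- length(unique(values))
  # The equation has a finite positive root only when 1 < k < n.
  stopifnot(k > 1, k < n)
  score <- function(psi) sum(psi / (psi + 0:(n - 1))) - k
  uniroot(score, lower = 1e-8, upper = 1e8, tol = 1e-10)$root
}

# Hypothetical stand-in for the fitting step on a single feature:
# one list per class, holding value frequencies and the psi estimate.
fit_sketch <- function(x, y) {
  lapply(split(x, y), function(xc) {
    list(frequencies = table(xc), psi = psi_mle_sketch(xc))
  })
}

# Small hand-made toy data (not drawn from a Poisson-Dirichlet distribution).
x <- c(1, 1, 1, 2, 2, 3,  1, 2, 3, 4, 5, 5)
y <- rep(c("a", "b"), each = 6)
fit <- fit_sketch(x, y)
str(fit)
```

In this toy data, class "b" shows more distinct values than class "a" for the same sample size, so its estimated \psi is larger.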

Value

Returns an object that serves as the training data object for the classification algorithms tMarLab() and tSimLab().

If x is multidimensional, one list of the form described below is returned for each dimension.

The object is a list of classwise lists, each with components:

frequencies: the frequencies of values in the class.

psi: the maximum likelihood estimate of \psi for the class.

Examples

## Create training data x and its class labels y from Poisson-Dirichlet distributions
## with different psis:
set.seed(111)
x1 <- rPD(5000, 10)    # class "1": psi = 10
x2 <- rPD(5000, 100)   # class "2": psi = 100
x <- c(x1, x2)
y1 <- rep("1", 5000)
y2 <- rep("2", 5000)
y <- c(y1, y2)
fit <- classifier.fit(x, y)

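## With multidimensional x (rows are data points, columns are features):
set.seed(111)
x1 <- cbind(rPD(5000, 10), rPD(5000, 50))     # class "1", two features
x2 <- cbind(rPD(5000, 100), rPD(5000, 500))   # class "2", two features
x <- rbind(x1, x2)
y1 <- rep("1", 5000)
y2 <- rep("2", 5000)
y <- c(y1, y2)
fit <- classifier.fit(x, y)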

[Package PEkit version 1.0.0.1000]