classifier.fit {PEkit}    R Documentation
Fit the supervised classifier under partition exchangeability
Description
Fits the model to training data x, which is assumed to follow the Poisson-Dirichlet distribution, and the corresponding discrete class labels y.
Usage
classifier.fit(x, y)
Arguments
x: data vector, or matrix with rows as data points and columns as features.
y: training data label vector of length equal to the number of rows in x.
Details
This function learns the model parameters from the training data and gathers them into an object that is used by the classification algorithms tMarLab() and tSimLab(). The parameters it learns are the Maximum Likelihood Estimates of \psi for each feature within each class in the training data. It also records the frequencies of the data for each feature within each class. These are used in calculating the predictive probability of each test data point belonging to each of the classes.
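As a rough illustration of the two quantities described above, the following base R sketch computes classwise frequencies and a \psi estimate for a one-dimensional x. It is not PEkit's implementation: psi_mle_sketch() and fit_sketch() are hypothetical helper names, and the estimate assumes the Ewens sampling formula, whose likelihood equation is sum_{i=1}^{n} psi / (psi + i - 1) = k for a sample of size n with k distinct values. The package's own numerical routine and the exact structure it returns may differ.

```r
# Hypothetical helper (not part of PEkit): MLE of psi for one sample,
# assuming the Ewens sampling formula. The MLE solves
#   sum_{i=1}^{n} psi / (psi + i - 1) = k
# where n is the sample size and k the number of distinct values.
psi_mle_sketch <- function(x) {
  n <- length(x)
  k <- length(unique(x))
  # No finite positive root exists when k == 1 or k == n.
  if (k <= 1 || k >= n) stop("MLE is not finite for this sample")
  score <- function(psi) sum(psi / (psi + seq_len(n) - 1)) - k
  # score() is increasing in psi, so extend the interval upwards if needed.
  uniroot(score, lower = 1e-8, upper = 1e8,
          extendInt = "upX", tol = 1e-9)$root
}

# Hypothetical helper (not part of PEkit): one list per class, each
# holding the value frequencies and the psi estimate for that class.
fit_sketch <- function(x, y) {
  lapply(split(x, y), function(xc) {
    list(frequencies = table(xc), psi = psi_mle_sketch(xc))
  })
}

# Toy data: class "b" shows more distinct values than class "a",
# so its psi estimate should be larger.
toy_x <- c(1, 1, 1, 2, 2, 3,  1, 2, 3, 4, 5, 5)
toy_y <- rep(c("a", "b"), each = 6)
toy_fit <- fit_sketch(toy_x, toy_y)
toy_fit$a$frequencies
toy_fit$b$psi > toy_fit$a$psi
```

The sketch handles only a vector x. For a matrix x, the same computation would be repeated column by column, in line with the per-feature description above.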
Value
Returns an object used as the training data object for the classification algorithms tMarLab() and tSimLab().
If x is multidimensional, the list described below is returned for each dimension.
Returns a list of classwise lists, each with components:
frequencies: the frequencies of values in the class.
psi: the Maximum Likelihood Estimate of \psi for the class.
Examples
library(PEkit)

## Create training data x and its class labels y from Poisson-Dirichlet
## distributions with different psis:
set.seed(111)
x1 <- rPD(5000, 10)
x2 <- rPD(5000, 100)
x <- c(x1, x2)
y1 <- rep("1", 5000)
y2 <- rep("2", 5000)
y <- c(y1, y2)
fit <- classifier.fit(x, y)

## With multidimensional x (rows are data points, columns are features):
set.seed(111)
x1 <- cbind(rPD(5000, 10), rPD(5000, 50))
x2 <- cbind(rPD(5000, 100), rPD(5000, 500))
x <- rbind(x1, x2)
y1 <- rep("1", 5000)
y2 <- rep("2", 5000)
y <- c(y1, y2)
fit <- classifier.fit(x, y)