R: Simultaneously predicted labels of the test data given the...

tSimLab {PEkit}

R Documentation

Simultaneously predicted labels of the test data given the training data classification.

Description

Classifies the test data x based on the training data object. All of the test data is used simultaneously to make the classification.

Usage

tSimLab(training, x)

Arguments

`training`	A training data object from the function `classifier.fit()`.
`x`	Test data vector or matrix with rows as data points and columns as features.

Details

The test data are first labeled with the marginal classifier. The simultaneous classifier then iterates over all test data, assigning each a label by finding the maximum predictive probability given the current classification structure of the test data as a whole. This is repeated until the classification structure converges after iterating over all data.

Value

A vector of predicted labels for test data x.

References

Amiryousefi A. Asymptotic supervised predictive classifiers under partition exchangeability. . 2021. https://arxiv.org/abs/2101.10950.

Corander, J., Cui, Y., Koski, T., and Siren, J.: Have I seen you before? Principles of Bayesian predictive classification revisited. Springer, Stat. Comput. 23, (2011), 59–73, (<doi: 10.1007/s11222-011-9291-7>).

Examples

## Create random samples x from Poisson-Dirichlet distributions with different
## psis, treating each sample as coming from a class of its own:
set.seed(111)
x1<-rPD(1050,10)
x2<-rPD(1050,1000)
test.ind1<-sample.int(1050,50) # Sample test datasets from the
test.ind2<-sample.int(1050,50) # original samples
x<-c(x1[-test.ind1],x2[-test.ind2])
## create training data labels:
y1<-rep("1", 1000)
y2<-rep("2", 1000)
y<-c(y1,y2)

## Test data t, with first half belonging to class "1", second have in "2":
t1<-x1[test.ind1]
t2<-x2[test.ind2]
t<-c(t1,t2)

fit<-classifier.fit(x,y)

## Run the classifier, which returns
tS<-tSimLab(fit, t)

##With multidimensional x:
set.seed(111)
x1<-cbind(rPD(500,1),rPD(500,5))
x2<-cbind(rPD(500,10),rPD(500,50))
test.ind1<-sample.int(500,50)
test.ind2<-sample.int(500,50)
x<-rbind(x1[-test.ind1,],x2[-test.ind2,])
y1<-rep("1", 450)
y2<-rep("2", 450)
y<-c(y1,y2)
fit<-classifier.fit(x,y)
t1<-x1[test.ind1,]
t2<-x2[test.ind2,]
t<-rbind(t1,t2)

tS<-tSimLab(fit, t)

[Package PEkit version 1.0.0.1000 Index]