multiclass routines {regtools} | R Documentation |
Classification with More Than 2 Classes
Description
Tools for multiclass classification, parametric and nonparametric.
Usage
avalogtrn(trnxy,yname)
ovaknntrn(trnxy,yname,k,xval=FALSE)
avalogpred()
classadjust(econdprobs,wrongprob1,trueprob1)
boundaryplot(y01,x,regests,pairs=combn(ncol(x),2),pchvals=2+y01,cex=0.5,band=0.10)
Arguments
pchvals |
Plotting symbols (base-R pch codes), one per data point. |
trnxy |
Data matrix, Y last. |
xval |
If TRUE, use the leave-one-out method. |
y01 |
Y vector (1s and 0s). |
regests |
Estimated regression function values. |
x |
X data frame or matrix. |
pairs |
Two-row matrix, column i of which is a pair of predictor variables to graph. |
cex |
Symbol size for plotting. |
band |
If |
yname |
Name of the Y column. |
k |
Number of nearest neighbors. |
econdprobs |
Estimated conditional class probabilities, given the predictors. |
wrongprob1 |
Incorrect unconditional P(Y = 1), as implied by the data. |
trueprob1 |
Correct unconditional P(Y = 1). |
Details
These functions aid classification in the multiclass setting.
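As a rough illustration of one common multiclass strategy, one-vs-all (OVA) with logistic models, the following base-R sketch fits one glm per class and predicts the class with the largest estimated probability. It is not the package's implementation; the helper names ova_fit and ova_predict and the simulated data are hypothetical, chosen only for this example.

```r
# One-vs-all (OVA) logistic classification in base R.
# ova_fit, ova_predict and the simulated data are hypothetical names
# for illustration; they are not regtools functions.
set.seed(1)
n <- 300
centers <- rbind(c(0, 0), c(2, 0), c(0, 2))   # one center per class
y <- sample(0:2, n, replace = TRUE)           # classes coded 0, 1, 2
x <- centers[y + 1, ] + matrix(rnorm(2 * n), ncol = 2)
dat <- data.frame(x1 = x[, 1], x2 = x[, 2])

# Fit one logistic model per class: class j versus all the others.
ova_fit <- function(dat, y) {
  lapply(sort(unique(y)), function(j) {
    d <- cbind(dat, yj = as.integer(y == j))
    glm(yj ~ ., data = d, family = binomial)
  })
}

# Predict the class whose model gives the largest estimated probability.
ova_predict <- function(fits, newdat) {
  probs <- sapply(fits, predict, newdata = newdat, type = "response")
  probs <- matrix(probs, nrow = nrow(newdat))  # keep matrix shape for 1 row
  max.col(probs, ties.method = "first") - 1    # back to 0, 1, 2 coding
}

fits <- ova_fit(dat, y)
preds <- ova_predict(fits, dat)
mean(preds == y)   # training-set accuracy
```

An all-vs-all scheme would instead fit one model per pair of classes and combine the pairwise results, for instance by voting.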
The function boundaryplot serves as a visualization technique for the
two-class setting. It draws the boundary between predicted Y = 1 and
predicted Y = 0 data points in 2-dimensional feature space, as
determined by the argument regests. It is used to visually assess
goodness of fit, typically by running the function twice, say once for
glm and then for kNN. If there is much discrepancy and the analyst
still wishes to use glm(), they may wish to add polynomial terms.
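The comparison described above can be sketched as follows. The simulated data, the hand-rolled kNN helper knn_ests, and the numeric disagreement summary are assumptions made for illustration; only the call form boundaryplot(y01,x,regests) is taken from the Usage section, and that call is guarded so it runs only in an interactive session with regtools installed.

```r
# Simulated two-class data with a curved true boundary (illustrative only).
set.seed(2)
n <- 200
x <- matrix(rnorm(2 * n), ncol = 2, dimnames = list(NULL, c("x1", "x2")))
y01 <- as.integer(x[, 1]^2 + x[, 2] + rnorm(n, sd = 0.5) > 1)
dat <- data.frame(x, y01 = y01)

# Estimated P(Y = 1 | X) from a logit model that is linear in x1, x2.
glm_ests <- fitted(glm(y01 ~ x1 + x2, data = dat, family = binomial))

# Hypothetical helper: kNN estimate as the mean of y01 over the k nearest
# points (each point counts as its own nearest neighbor here).
knn_ests <- function(x, y01, k) {
  d <- as.matrix(dist(x))
  apply(d, 1, function(row) mean(y01[order(row)[1:k]]))
}
k_ests <- knn_ests(x, y01, k = 15)

# Fraction of points on which the two fits predict different classes.
mean((glm_ests > 0.5) != (k_ests > 0.5))

# Adding a polynomial term, as suggested when the discrepancy is large.
poly_ests <- fitted(glm(y01 ~ x1 + I(x1^2) + x2, data = dat,
                        family = binomial))
mean((poly_ests > 0.5) != (k_ests > 0.5))

# Visual comparison: one plot per set of regression estimates.
if (interactive() && requireNamespace("regtools", quietly = TRUE)) {
  regtools::boundaryplot(y01, x, glm_ests)
  regtools::boundaryplot(y01, x, k_ests)
}
```

If the plotted boundaries for the parametric and kNN fits look similar, that is some evidence that the parametric model is adequate.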
The functions not listed above are largely deprecated, e.g. in favor of
qeLogit and the other qe-series functions.
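The arguments of classadjust (econdprobs, wrongprob1, trueprob1) suggest the standard correction of conditional class probabilities when the data's unconditional P(Y = 1) does not match the true one. The sketch below shows that correction via Bayes' rule on the odds scale. The function adjust_probs is a hypothetical stand-in written for this example, not the package's code, and it assumes a two-class setting with probabilities strictly between 0 and 1.

```r
# Hypothetical helper (not a regtools function): adjust estimated
# conditional probabilities of class 1 for a change in the
# unconditional class probability.
adjust_probs <- function(econdprobs, wrongprob1, trueprob1) {
  # Odds implied by the fitted probabilities and by each unconditional rate.
  cond_odds  <- econdprobs / (1 - econdprobs)
  wrong_odds <- wrongprob1 / (1 - wrongprob1)
  true_odds  <- trueprob1 / (1 - trueprob1)
  # Dividing out the wrong prior odds leaves the likelihood ratio;
  # multiplying by the true prior odds gives the corrected posterior odds.
  adj_odds <- cond_odds / wrong_odds * true_odds
  adj_odds / (1 + adj_odds)
}

# Data sampled with half the cases in class 1, true rate only 10 percent.
adjust_probs(c(0.2, 0.5, 0.8), wrongprob1 = 0.5, trueprob1 = 0.1)
```

When wrongprob1 equals trueprob1 the probabilities are returned unchanged, which is a quick sanity check on the formula.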
Author(s)
Norm Matloff
Examples
## Not run:
data(oliveoils)
oo <- oliveoils[,-1]
# toy example
set.seed(9999)
x <- runif(25)
y <- sample(0:2,25,replace=TRUE)
xd <- preprocessx(x,2,xval=FALSE)
kout <- ovaknntrn(y,xd,m=3,k=2)
kout$regest # row 2: 0.0,0.5,0.5
predict(kout,predpts=matrix(c(0.81,0.55,0.15),ncol=1)) # 0,2,0or2
yd <- factorToDummies(as.factor(y),'y',FALSE)
kNN(x,yd,c(0.81,0.55,0.15),2) # predicts 0, 1or2, 2
data(peDumms) # prog/engr data
ped <- peDumms[,-33]
ped <- as.matrix(ped)
x <- ped[,-(23:28)]
y <- ped[,23:28]
knnout <- kNN(x,y,x,25,leave1out=TRUE)
truey <- apply(y,1,which.max) - 1
mean(knnout$ypreds == truey) # about 0.37
xd <- preprocessx(x,25,xval=TRUE)
kout <- knnest(y,xd,25)
preds <- predict(kout,predpts=x)
yhats <- apply(preds,1,which.max) - 1
mean(yhats == truey) # about 0.37
data(peFactors)
# discard the lower educ-level cases, which are rare
edu <- peFactors$educ
numedu <- as.numeric(edu)
idxs <- numedu >= 12
pef <- peFactors[idxs,]
numedu <- numedu[idxs]
pef$educ <- as.factor(numedu)
pef1 <- pef[,c(1,3,5,7:9)]
# ovalog
ovaout <- ovalogtrn(pef1,"occ")
preds <- predict(ovaout,predpts=pef1[,-3])
mean(preds == factorTo012etc(pef1$occ)) # about 0.39
# avalog
avaout <- avalogtrn(pef1,"occ")
preds <- predict(avaout,predpts=pef1[,-3])
mean(preds == factorTo012etc(pef1$occ)) # about 0.39
# knn
knnout <- ovaknntrn(pef1,"occ",25)
preds <- predict(knnout,predpts=pef1[,-3])
mean(preds == factorTo012etc(pef1$occ)) # about 0.43
data(oliveoils)
oo <- oliveoils
oo <- oo[,-1]
knnout <- ovaknntrn(oo,'Region',10)
# predict a new case that is like oo[1,] but with palmitic = 950
newx <- oo[1,2:9,drop=FALSE]
newx[,1] <- 950
predict(knnout,predpts=newx) # predicts class 2, South
## End(Not run)