discdd.misclass {dad} | R Documentation |
Misclassification ratio in functional discriminant analysis of discrete probability distributions.
Description
Computes the leave-one-out misclassification ratio of the rule that assigns T groups of individuals, one group after another, to the class of groups (among K classes of groups) which minimizes the distance or divergence between the probability distribution associated with the group to assign and the K probability distributions associated with the K classes.
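The assignment rule described above can be sketched in base R. This is only an illustration on hypothetical toy distributions (`f_t`, `g`), using the L1 distance; it does not call the package and does not reproduce its internals.

```r
# Hypothetical toy data: the distribution f_t of the group to assign and
# K = 3 class distributions g_k, all defined on the same 4 categories.
f_t <- c(0.10, 0.20, 0.30, 0.40)
g <- rbind(
  A = c(0.25, 0.25, 0.25, 0.25),
  B = c(0.10, 0.25, 0.30, 0.35),
  C = c(0.70, 0.10, 0.10, 0.10)
)

# L1 distance between f_t and each class distribution g_k (one per row of g).
l1 <- apply(g, 1, function(g_k) sum(abs(f_t - g_k)))

# The group is assigned to the class that achieves the minimum distance.
assigned <- names(which.min(l1))
```

Repeating this for every group, each time leaving the group out of its own class, and comparing `assigned` with the true class gives the misclassification ratio.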
Usage
discdd.misclass(xf, class.var, distance = c("l1", "l2", "chisqsym", "hellinger",
"jeffreys", "jensen", "lp"), crit = 1, p)
Arguments
xf |
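object of class "folderh" containing the data, or list of arrays (or list of tables) giving the joint frequency distribution of each group. See Details.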
|
class.var |
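string (if xf is an object of class "folderh", as in Example 1) or data frame with columns "group" and "class" (if xf is a list of arrays, as in Example 2).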
|
distance |
The distance or divergence used to compute the distance matrix between the probability distributions. It can be one of "l1" (default), "l2", "chisqsym", "hellinger", "jeffreys", "jensen" or "lp".
|
crit |
1 or 2. Selects the method used to estimate the probability distributions associated with the classes. See Details. |
p |
integer. Optional. Used when distance = "lp". |
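To give an intuition for some of the distance options, here is a minimal base R sketch using textbook definitions on two hypothetical discrete distributions `f` and `g`. The helper `lp_dist` is illustrative only, and the package may use different scaling constants (for the Hellinger distance in particular), so these values should not be expected to match its output exactly.

```r
# Two hypothetical discrete distributions on the same 3 categories.
f <- c(0.5, 0.5, 0.0)
g <- c(0.5, 0.0, 0.5)

# L1 and L2 distances between the two probability vectors.
l1 <- sum(abs(f - g))
l2 <- sqrt(sum((f - g)^2))

# L^p distance with exponent p; p = 1 and p = 2 give back l1 and l2.
lp_dist <- function(f, g, p) sum(abs(f - g)^p)^(1 / p)

# Hellinger distance, here without any 1/sqrt(2) normalisation (assumption).
hellinger <- sqrt(sum((sqrt(f) - sqrt(g))^2))
```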
Details
If xf is an object of class "folderh" containing the data:

The T probability distributions f_t corresponding to the T groups of individuals are estimated by frequency distributions within each group.

To the class k consisting of T_k groups is associated the probability distribution g_k, knowing that when using the leave-one-out method, the group to assign is not included in its class k. The crit argument selects the estimation method of the g_k's.

crit=1
The probability distribution g_k is estimated using the whole data of this class, that is, the rows of the data in xf corresponding to the T_k groups of the class k. The estimation of the g_k's uses the same method as the estimation of the f_t's.

crit=2
The T_k probability distributions f_t are estimated using the corresponding data from xf. Then they are averaged to obtain an estimate of the distribution g_k, that is g_k = \frac{1}{T_k} \, \sum f_t, where the sum runs over the T_k groups of the class k.
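The difference between the two values of crit can be seen on a small example. The sketch below uses base R only and a hypothetical data frame `obs` holding one class made of two groups of unequal size; it illustrates the two estimation ideas, not the package's code.

```r
# Hypothetical class k made of two groups: g1 has 4 individuals, g2 has 2.
obs <- data.frame(
  group  = c("g1", "g1", "g1", "g1", "g2", "g2"),
  colour = factor(c("red", "red", "red", "blue", "blue", "blue"),
                  levels = c("blue", "red"))
)

# f_t: relative frequency distribution within each group (one row per group).
f <- prop.table(table(obs$group, obs$colour), margin = 1)

# crit = 1: pool all rows of the class and estimate a single distribution.
g_pooled <- prop.table(table(obs$colour))

# crit = 2: average the T_k group distributions, giving each group equal weight.
g_averaged <- colMeans(f)
```

Because the groups have different sizes, `g_pooled` and `g_averaged` differ: pooling gives more weight to the larger group, averaging does not.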
If xf is a list of arrays (or list of tables):

The t^{th} array is the joint frequency distribution of the t^{th} group. The frequencies can be absolute or relative.

To the class k consisting of T_k groups is associated the probability distribution g_k, knowing that when using the leave-one-out method, the group to assign is not included in its class k. The crit argument selects the estimation method of the g_k's.

crit=1
g_k = \frac{1}{\sum n_t} \sum n_t f_t, where n_t is the total of xf[[t]].
Notice that when xf[[t]] contains relative frequencies, its total is 1. In that case, crit=1 is equivalent to crit=2.

crit=2
g_k = \frac{1}{T_k} \, \sum f_t.
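The two formulas above, and the remark about relative frequencies, can be checked numerically. The following base R sketch uses a hypothetical class `xf_k` made of two 2 x 2 tables of absolute frequencies; the variable names are illustrative and not part of the package.

```r
# Hypothetical class made of two groups given as absolute frequency tables.
xf_k <- list(
  matrix(c(3, 1, 0, 0), nrow = 2),    # total n_t = 4
  matrix(c(0, 0, 4, 12), nrow = 2)    # total n_t = 16
)

# n_t: total of each table; f_t: the corresponding relative frequencies.
n <- sapply(xf_k, sum)
f <- Map(function(x, n_t) x / n_t, xf_k, n)

# crit = 1: average of the f_t weighted by the totals n_t.
g_crit1 <- Reduce(`+`, Map(`*`, n, f)) / sum(n)

# crit = 2: unweighted average of the f_t.
g_crit2 <- Reduce(`+`, f) / length(f)

# If the input already holds relative frequencies, every total is 1,
# so the weighted average reduces to the unweighted one.
n_rel <- sapply(f, sum)
g_crit1_rel <- Reduce(`+`, Map(`*`, n_rel, f)) / sum(n_rel)
```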
Value
Returns an object of class discdd.misclass, that is, a list including:
classification |
data frame with 4 columns:
|
confusion.mat |
confusion matrix, |
misalloc.per.class |
the misclassification ratio per class, |
misclassed |
the misclassification ratio, |
distances |
matrix with |
proximities |
matrix of the proximity indices (in percents) between the groups and the classes. The proximity between the group |
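As an illustration of how misclassification ratios relate to a confusion matrix, here is a base R sketch on a hypothetical matrix `conf`. It assumes that rows hold the prior classes and columns the allocated classes; this orientation is an assumption and should be checked against the actual confusion.mat returned by the function.

```r
# Hypothetical confusion matrix: rows = prior class, columns = allocated class.
conf <- matrix(c(8, 2, 0,
                 1, 6, 3,
                 0, 1, 9),
               nrow = 3, byrow = TRUE,
               dimnames = list(prior = c("A", "B", "C"),
                               alloc = c("A", "B", "C")))

# Per-class ratio: share of groups of each prior class allocated elsewhere.
per_class <- 1 - diag(conf) / rowSums(conf)

# Overall ratio: share of all groups that are off the diagonal.
overall <- 1 - sum(diag(conf)) / sum(conf)
```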
Author(s)
Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Gilles Hunault, Sabine Demotes-Mainard
References
Rudrauf, J.M., Boumaza, R. (2001). Contribution à l'étude de l'architecture médiévale: les caractéristiques des pierres à bossage des châteaux forts alsaciens, Centre de Recherches Archéologiques médiévales de Saverne, 5, 5-38.
Examples
# Example 1 with a folderh obtained by converting numeric variables
data("castles.dated")
stones <- castles.dated$stones
periods <- castles.dated$periods
stones$height <- cut(stones$height, breaks = c(19, 27, 40, 71), include.lowest = TRUE)
stones$width <- cut(stones$width, breaks = c(24, 45, 62, 144), include.lowest = TRUE)
stones$edging <- cut(stones$edging, breaks = c(0, 3, 4, 8), include.lowest = TRUE)
stones$boss <- cut(stones$boss, breaks = c(0, 6, 9, 20), include.lowest = TRUE )
castlefh <- folderh(periods, "castle", stones)
# Default: distance = "l1", crit = 1
discdd.misclass(castlefh, "period")
# Hellinger distance, crit=2
discdd.misclass(castlefh, "period", distance = "hellinger", crit = 2)
# Example 2 with a list of 96 arrays
data("dspgd2015")
data("departments")
classes <- departments[, c("coded", "namer")]
names(classes) <- c("group", "class")
# Default: distance = "l1", crit = 1
discdd.misclass(dspgd2015, classes)
# Hellinger distance, crit=2
discdd.misclass(dspgd2015, classes, distance = "hellinger", crit = 2)