R: The number of different (unique) examples in a dataset

diffExamples {RatingScaleReduction}

R Documentation

The number of different (unique) examples in a dataset

Description

Datasets often contain replicated examples. In the extreme case, a single example is replicated n times, where n is the total number of examples, so the dataset contains no other examples. Such a situation would distort the computations and should be detected early. Ideally, no example should be replicated, but if the replication rate is small, we can proceed to computing AUC.

Usage

diffExamples(attribute)

Arguments

attribute

a matrix or data.frame containing attributes

Value

`total.examples`	the total number of examples in the dataset
`diff.examples`	the number of different (unique) examples in the dataset
`dup.exapmles`	the number of duplicate examples in the dataset
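
As a rough illustration of how these three counts relate, the sketch below reproduces them with base R only. The helper `count_examples` and the `toy` data are hypothetical and not part of RatingScaleReduction. The sketch assumes that rows are compared exactly and that a "different" example means a distinct row, which may not match the package's implementation in every detail.

```r
# Hypothetical helper (not part of RatingScaleReduction): counts all rows,
# distinct rows, and repeated rows of a matrix or data.frame.
count_examples <- function(attribute) {
  attribute <- as.data.frame(attribute)
  total <- nrow(attribute)
  # duplicated() flags every row that is identical to an earlier row
  dup <- sum(duplicated(attribute))
  list(total = total, different = total - dup, duplicates = dup)
}

# Toy data (assumption): five examples, two of which repeat an earlier row
toy <- data.frame(a1 = c(1, 1, 2, 2, 3), a2 = c(0, 0, 1, 1, 1))
res <- count_examples(toy)
print(res)
```

Under these assumptions, the ratio `res$duplicates / res$total` gives the replication rate mentioned in the Description; a value close to zero corresponds to the case where computing AUC can proceed.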

Author(s)

Waldemar W. Koczkodaj, Feng Li, Alicja Wolny-Dominiak

Examples

# create the matrix of attributes
# every column must be numeric, so factors are converted with as.numeric()
library(pROC)                  # provides the aSAH dataset
library(RatingScaleReduction)
data(aSAH)
is.numeric(aSAH)               # FALSE: aSAH is a data.frame, not a numeric object

attribute <- data.frame(as.numeric(aSAH$gender),
  as.numeric(aSAH$age), as.numeric(aSAH$wfns),
  as.numeric(aSAH$s100b), as.numeric(aSAH$ndka))
colnames(attribute) <- c("a1", "a2", "a3", "a4", "a5")

# show the number of different examples
diffExamples(attribute)

[Package RatingScaleReduction version 1.4 Index]