cccd {cccd}R Documentation

Class Cover Catch Digraph

Description

Constructs a class cover catch digraph from points or interpoint distance matrices.

Usage

cccd(x = NULL, y = NULL, dxx = NULL, dyx = NULL, method = NULL, 
     k = NA, algorithm = 'cover_tree')
cccd.rw(x=NULL,y=NULL,dxx=NULL,dyx=NULL,method=NULL,m=1,d=2)
cccd.classifier(x,y,dom.method='greedy',proportion=1,...)
cccd.classify(data, C,method=NULL)
cccd.classifier.rw(x,y,m=1,d=2)
cccd.multiclass.classifier(data, classes, dom.method='greedy',proportion=1,...)
cccd.multiclass.classify(data,C,method=NULL)
## S3 method for class 'cccd'
plot(x, ..., plot.circles = FALSE, dominate.only = FALSE, 
          D = NULL, vertex.size = 2, vertex.label = NA, 
			 vertex.color = "SkyBlue2", dom.color = "Blue", 
			 ypch = 20, ycex = 1.5, ycol = 2, 
			 use.circle.radii = FALSE, balls = FALSE, 
			 ball.color = gray(0.8), square = FALSE, xlim, ylim)
## S3 method for class 'cccdClassifier'
plot(x, ..., xcol=1,ycol=2,xpch=20,ypch=xpch,
                                balls=FALSE,add=FALSE)

Arguments

x, y

the target class and non-target class points. Either x,y or dxx,dyx must be provided. In the case of plot, x is an object of class cccd.

dxx, dyx

interpoint distances (x against x and y against x). If these are not provided they are computed using x and y.

method

the method used for the distance. See dist.

dom.method, proportion

the method used for the domination set computation, and the proportion of points required to dominate. See dominate.

k

If given, get.knn is used from FNN to approximate the class cover catch graph. Each x covers no more than the k nearest neighbors to it. This will be much faster and use less memory for large data sets, but is an approximation unless k is sufficiently large.

algorithm

See get.knn.

m

slope of the null hypothesis curve

data

data to be classified

d

dimension of the data

classes

class labels of the data

C

cccd object

plot.circles

logical. Plot the circles around the points if TRUE.

dominate.only

logical. Only plot the digraph induced by the dominating set.

D

a dominating set. Only used if dominate.only is TRUE. If dominate.only is TRUE and D is NULL, then dominate is called.

vertex.size, vertex.color, vertex.label, dom.color

parameters controling the plotting of the vertices. dom.color is the color of the vertices in the dominating set.

balls, ball.color

if balls=TRUE, the cover is plotted as filled balls, with ball.color controling their color. In the cass of cccdClassifier, balls can be "x" or "y" indicating that only one of the balls should be plotted.

ypch, ycex, ycol

parameters for plotting the non-target points.

xpch, xcol

parameters for plotting the first class points.

add

logical. Should the classifier plot be added to an existing plot?

use.circle.radii

logical. Ensure that the circles fit within the plot.

square

logical. Make the plot square.

xlim, ylim

if present, these control the plotting region.

...

arguments passed to cccd or plot.

Details

The class cover catch digraph is a graph with vertices defined by the points of x and edges defined according to the balls B(x,d(x,Y)). There is an edge between vertices x_1,x_2 if x_2\in B(x_1,d(x_1,Y)). If dyx is not given and the method is 'euclidean', then get.knnx is used to find the nearest y to each x. If k is given, only the k nearest neighbors to each point are candidates for covering. Thus the cccd will be approximate, but the computation will (generally) be faster. Since get.knn uses Euclidean distance, these choices will only be valid for this distance metric. Since the graph will tend to be larger than otherwise, the dominating set computation will be slower, so one should trade-off speed of calculation, approximation, and the proportion option to the dominating set (which can make that calculation faster at the cost of returning a subset of the dominating set).

Value

an object of class igraph. In addition, it contains the attributes:

R

a vector of radii.

Y

the y vectors.

layout

the x vectors.

In the case of the classifier, the attributes are:

Rx, Ry

vectors of radii.

Cx, Cy

the ball centers.

Note

The plotting assumes the cccd used Euclidean distance, and so the balls/circles will be Euclidean balls/circles. If the method used in the distance was some other metric, you'll have to plot the balls/circles yourself if you want them to be correct on the plot.

Author(s)

David J. Marchette, david.marchette@navy.mil

References

D.J. Marchette, "Class Cover Catch Digraphs", Wiley Interdisciplinary Reviews: Computational Statistics, 2, 171-177, 2010.

D.J. Marchette, Random Graphs for Statistical Pattern Recognition, John Wiley & Sons, 2004.

C.E. Priebe, D.J. Marchette, J. DeVinney and D. Socolinsky, "Classification Using Class Cover Catch Digraphs", Journal of Classification, 20, 3-23, 2003.

See Also

ccd, rng, gg, dist, get.knn dominate

Examples

set.seed(456330)
z <- matrix(runif(1000),ncol=2)
ind <- which(z[,1]<.5 & z[,2]<.5)
x <- z[ind,]
y <- z[-ind,]
g <- cccd(x,y)
C <- cccd.classifier(x,y)
z2 <- matrix(runif(1000),ncol=2)
ind <- which(z2[,1]<.5 & z2[,2]<.5)
cls <- rep(0,nrow(z2))
cls[ind] <- 1
out <- cccd.classify(z2,C)
sum(out != cls)/nrow(z2)
## Not run: 
plot(g,plot.circles=TRUE,dominate.only=TRUE)
points(z2,col=2*(1-cls)+1,pch=20)

## End(Not run)

[Package cccd version 1.6 Index]