cnDiscretize {catnet} R Documentation

## Data Categorization

### Description

Numerical data discretization using empirical quantiles.

### Usage

```
cnDiscretize(data, numCategories, mode="uniform", qlevels=NULL)
```

### Arguments

| Argument | Description |
|---|---|
| `data` | a numerical `matrix` or `data.frame` |
| `numCategories` | an `integer`, the number of categories per node |
| `mode` | a `character`, the discretization method to be used, "quantile" or "uniform" |
| `qlevels` | a list of `integer` vectors, the node discretization parameters |

### Details

The numerical `data` is discretized into a given number of categories, `numCategories`, using the empirical node quantiles. As in all functions of the `catnet` package that accept data, if the `data` parameter is a `matrix`, it is assumed to be in the row-node format (one row per node). If it is a `data.frame`, the column-node format (one column per node) is assumed.
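The two layouts can be illustrated with base R alone. This is a minimal sketch on made-up data; the node names `A`, `B`, `C` and the object names `m` and `df` are hypothetical and chosen only for illustration.

```r
# Hypothetical data: 3 nodes with 5 samples each.
set.seed(1)

# Row-node format: a matrix with one row per node, one column per sample.
m <- matrix(rnorm(15), nrow = 3, ncol = 5,
            dimnames = list(c("A", "B", "C"), NULL))

# Column-node format: a data.frame with one column per node, one row per sample.
df <- as.data.frame(t(m))

dim(m)   # nodes x samples
dim(df)  # samples x nodes
```

Both objects hold the same values; only the orientation differs, so a matrix is transposed before being converted to a `data.frame`.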

The `mode` specifies the discretization method. Currently, two methods are supported: "quantile" and "uniform", the latter being the default.

The quantile-based discretization method is applied as follows. For each node, the sample node distribution is constructed and partitioned into non-intersecting classes separated by the quantile points of the sample distribution. Each node value is assigned the index of the class into which it falls.
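The idea can be sketched for a single node with base R. This is not the `catnet` implementation: the helper name `quantile_discretize` is hypothetical, and the choice of cut points (R's default quantile type) and the handling of values that lie exactly on a cut point are assumptions that may differ from what `cnDiscretize` does.

```r
# Sketch of quantile-based discretization for one node (base R only).
# x: numeric vector of samples for the node; k: number of categories.
quantile_discretize <- function(x, k) {
  # Interior cut points at probabilities 1/k, ..., (k-1)/k.
  cuts <- quantile(x, probs = seq_len(k - 1) / k, names = FALSE)
  # Class index in 1..k; a value equal to a cut point goes to the upper class.
  findInterval(x, cuts) + 1L
}

x <- 1:9
quantile_discretize(x, 3)
# 1 1 1 2 2 2 3 3 3
```

Because the cut points are sample quantiles, each class receives roughly the same number of values regardless of how the values are spread over their range.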

The uniform discretization breaks the range of values of each node into `numCategories` intervals, either of equal length or, when `qlevels` is given, of lengths proportional to the corresponding `qlevels` values.
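A base R sketch of this scheme for one node is shown below. It follows the description above rather than the package source: the helper name `uniform_discretize` and its `weights` argument (standing in for one element of `qlevels`) are hypothetical, and the treatment of boundary values is an assumption.

```r
# Sketch of uniform discretization for one node (base R only).
# x: numeric vector of samples; weights: relative interval lengths,
# e.g. rep(1, k) for k equal intervals.
uniform_discretize <- function(x, weights) {
  r <- range(x)
  # Break points split [min, max] into intervals proportional to the weights.
  breaks <- r[1] + (r[2] - r[1]) * cumsum(c(0, weights)) / sum(weights)
  # all.inside = TRUE keeps every index within 1..length(weights),
  # so the maximum value lands in the last class.
  findInterval(x, breaks, all.inside = TRUE)
}

x <- 0:8
uniform_discretize(x, c(1, 1, 1, 1))  # four equal intervals
# 1 1 2 2 3 3 4 4 4
uniform_discretize(x, c(1, 2, 1))     # middle interval twice as long
# 1 1 2 2 2 2 3 3 3
```

Unlike the quantile-based method, the class sizes here depend on how the values are distributed over the range, so some classes may end up with few or no values.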

Currently, the function assigns the same number of categories to every node of the data.

### Value

A `matrix` or `data.frame` of indices.

### Author(s)

N. Balov, P. Salzman

### See Also

`cnSamples`

### Examples

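```
# 10 nodes (rows) with 20 samples each; node i has mean i and sd 0.1
ps <- t(sapply(1:10, function(i) rnorm(20, i, 0.1)))
# quantile-based discretization into 3 categories
dps1 <- cnDiscretize(ps, 3, mode="quantile")
hist(dps1[1,])
# uniform discretization; node 1 gets a middle interval twice as long
qlevels <- lapply(1:10, function(i) rep(1, 3))
qlevels[[1]] <- c(1,2,1)
dps2 <- cnDiscretize(ps, 3, mode="uniform", qlevels)
hist(dps2[1,])
```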

[Package catnet version 1.15.7 Index]