RenDim {IDmining} R Documentation

Rényi's Generalized Dimensions

Description

Estimates Rényi's generalized dimensions (or Rényi's dimensions of qth order). The result is mainly used as an estimate of the intrinsic dimension of data when q = 2.

Usage

RenDim(X, scaleQ=1:5, qMin=2, qMax=2)

Arguments

X

An N × E matrix, data.frame or data.table where N is the number of data points and E is the number of variables (or features). Each variable is rescaled to the [0,1] interval by the function.

scaleQ

A vector of at least two values. It contains the values of ℓ⁻¹ chosen by the user (by default: scaleQ = 1:5).

qMin

The minimum value of q (by default: qMin = 2).

qMax

The maximum value of q (by default: qMax = 2).

Details

  1. ℓ is the edge length of the grid cells (or quadrats). Since the variables (and consequently the grid) are rescaled to the [0,1] interval, ℓ is equal to 1 for a grid consisting of only one cell.

  2. ℓ⁻¹ is the number of grid cells (or quadrats) along each axis of the Euclidean space in which the data points are embedded.

  3. ℓ⁻¹ is equal to Q^(1/E) where Q is the number of grid cells and E is the number of variables (or features).

  4. ℓ⁻¹ is directly related to δ (see References).

  5. δ is the diagonal length of the grid cells.
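The relations listed above can be checked numerically. The following is a minimal sketch in base R, not code from the package: the variable names (inv_ell, ell, Q, delta) and the choice E = 2 are illustrative assumptions, and δ is computed as ℓ√E, the diagonal of an E-dimensional hypercube of edge length ℓ.

```r
# Illustrative values (assumptions): E = 2 variables and scaleQ = 1:5.
E <- 2
inv_ell <- 1:5              # number of grid cells along each axis
ell <- 1 / inv_ell          # edge length of a cell on the [0,1] scale
Q <- inv_ell^E              # total number of grid cells
delta <- ell * sqrt(E)      # diagonal length of a cell

# One row per scale; ln_delta is the natural logarithm of the diagonal.
scales <- data.frame(inv_ell, ell, Q, delta, ln_delta = log(delta))
print(scales)
```

For a grid consisting of a single cell, ell is 1 and delta is sqrt(E), so ln(δ) is not zero at the coarsest scale unless E = 1.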

Value

A list of two elements:

  1. a data.frame containing the value of Rényi's information of qth order (computed using the natural logarithm) for each value of ln(δ) and q. The values of ln(δ) are provided with regard to the [0,1] interval.

  2. a data.frame containing the value of D_q for each value of q.
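To make the link between the two returned elements concrete, here is a minimal box-counting sketch in base R. It is an assumption about the general technique, not the implementation used by RenDim: the helper name renyi_sketch and its return names (info, Dq) are hypothetical. It computes Rényi's information of order q on each grid and takes D_q as minus the slope of a least-squares line fitted to the information against ln(δ).

```r
# Hypothetical helper, not part of IDmining: box-counting sketch of D_q.
renyi_sketch <- function(X, scaleQ = 1:5, q = 2) {
  X <- as.matrix(X)
  E <- ncol(X)
  # Rescale each variable to the [0,1] interval.
  X01 <- apply(X, 2, function(x) (x - min(x)) / diff(range(x)))
  Hq <- sapply(scaleQ, function(k) {
    # Cell index along each axis (0 .. k - 1); points at 1 go to the last cell.
    idx <- pmin(floor(X01 * k), k - 1)
    # One key per occupied cell, then the relative frequency of each cell.
    key <- apply(idx, 1, paste, collapse = "_")
    p <- as.numeric(table(key)) / nrow(X01)
    # Renyi's information of order q (Shannon entropy when q = 1).
    if (q == 1) -sum(p * log(p)) else log(sum(p^q)) / (1 - q)
  })
  # Diagonal length of the cells on the [0,1] scale, in natural log.
  ln_delta <- log(sqrt(E) / scaleQ)
  slope <- unname(coef(lm(Hq ~ ln_delta))[2])
  list(info = data.frame(ln_delta = ln_delta, Hq = Hq), Dq = -slope)
}

# Points on a straight line embedded in two dimensions.
line <- seq(0, 1, length.out = 2000)
res <- renyi_sketch(cbind(line, line), scaleQ = 1:5, q = 2)
print(res$Dq)
```

For points lying on a straight line, the estimate should be close to 1, which is the intrinsic dimension of a line regardless of the number of variables.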

Author(s)

Jean Golay jeangolay@gmail.com

References

C. Traina Jr., A. J. M. Traina, L. Wu and C. Faloutsos (2000). Fast feature selection using fractal dimension. Proceedings of the 15th Brazilian Symposium on Databases (SBBD 2000), João Pessoa (Brazil).

E. P. M. De Sousa, C. Traina Jr., A. J. M. Traina, L. Wu and C. Faloutsos (2007). A fast and effective method to find correlations among attributes in databases, Data Mining and Knowledge Discovery 14(3):367-407.

J. Golay and M. Kanevski (2015). A new estimator of intrinsic dimension based on the multipoint Morisita index, Pattern Recognition 48(12):4070-4081.

H. Hentschel and I. Procaccia (1983). The infinite number of generalized dimensions of fractals and strange attractors, Physica D 8(3):435-444.

Examples

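library(IDmining)

# Simulate 1000 data points with the SwissRoll function of the package.
sim_dat <- SwissRoll(1000)

scaleQ <- 1:15 # It starts with a grid of 1^E cell (or quadrat).
               # It ends with a grid of 15^E cells (or quadrats).

# Use the first two variables only and the scales 5 to 15.
qRI_ID <- RenDim(sim_dat[,c(1,2)], scaleQ[5:15])

print(paste("The ID estimate is equal to",round(qRI_ID[[1]][1,2],2)))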

[Package IDmining version 1.0.7 Index]