twonn_decimation {intRinsic}    R Documentation

Estimate the decimated TWO-NN evolution with halving steps or a vector of proportions

Description

The estimate of the intrinsic dimension (id) depends on the scale at which the dataset is examined. To escape the local reach of the TWO-NN estimator, Facco et al. (2017) proposed subsampling the original dataset to induce greater distances between the data points. Studying how the estimates evolve as a function of the neighborhood size provides information about the validity of the modeling assumptions and the robustness of the model in the presence of noise.
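
The decimation idea can be sketched in base R without the package. The block below is only an illustration under stated assumptions, not the implementation used by intRinsic: the helpers twonn_mle_sketch() and decimate_sketch() are hypothetical names, the estimator is the plain maximum likelihood form n / sum(log(mu)) with mu the ratio of second to first nearest-neighbor distances, and each subsample is drawn independently, which may differ from how twonn_decimation() subsamples.

```r
# Hypothetical helper (not part of intRinsic): TWO-NN maximum likelihood
# estimate computed from the first two nearest-neighbor distances.
twonn_mle_sketch <- function(X) {
  D <- as.matrix(dist(X))   # pairwise Euclidean distances
  diag(D) <- Inf            # exclude each point's distance to itself
  # n x 2 matrix: first and second nearest-neighbor distance of each point
  nn <- t(apply(D, 1, function(d) sort(d)[1:2]))
  mu <- nn[, 2] / nn[, 1]   # ratio r2 / r1
  list(id = nrow(X) / sum(log(mu)), avg_r2 = mean(nn[, 2]))
}

# Hypothetical helper: re-estimate the id on random subsamples whose sizes
# are given as proportions of the original dataset.
decimate_sketch <- function(X, proportions, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(X)
  t(sapply(proportions, function(p) {
    idx <- sample.int(n, size = max(3, floor(p * n)))
    est <- twonn_mle_sketch(X[idx, , drop = FALSE])
    c(proportion = p, n = length(idx), id = est$id, avg_r2 = est$avg_r2)
  }))
}

set.seed(1)
X <- replicate(4, rnorm(500))
# Proportions 1, 0.5, 0.25 mimic two halving steps
res <- decimate_sketch(X, proportions = c(1, 0.5, 0.25), seed = 1)
print(res)
```

Each row reports one decimation level. The avg_r2 column (average distance to the second nearest neighbor) grows as the dataset is thinned, which is the scale against which the id estimates are examined.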

Usage

twonn_decimation(
  X,
  method = c("steps", "proportions"),
  steps = 0,
  proportions = 1,
  seed = NULL
)

## S3 method for class 'twonn_dec_prop'
print(x, ...)

## S3 method for class 'twonn_dec_prop'
plot(x, CI = FALSE, proportions = FALSE, ...)

## S3 method for class 'twonn_dec_by'
print(x, ...)

## S3 method for class 'twonn_dec_by'
plot(x, CI = FALSE, steps = FALSE, ...)

Arguments

X

data matrix with n observations and D variables.

method

method to use for decimation:

"steps"

the dataset is halved a given number of times;

"proportions"

the dataset is subsampled according to a vector of proportions.

steps

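in twonn_decimation(), the number of times the dataset is halved (used when method = "steps"). In the plot method, logical: if TRUE, the x-axis reports the number of halving steps; if FALSE, the x-axis reports the log10 average distance.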

proportions

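in twonn_decimation(), the vector of proportions of the dataset to subsample (used when method = "proportions"). In the plot method, logical: if TRUE, the x-axis reports the decimating proportions; if FALSE, the x-axis reports the log10 average distance.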

seed

random seed controlling the sequence of sub-sampled observations.

x

object of class twonn_dec_prop or twonn_dec_by, obtained from the function twonn_decimation().

...

ignored.

CI

logical, if TRUE, the confidence intervals are plotted.

Value

list containing the TWO-NN evolution (maximum likelihood estimates and confidence intervals) and the average distance from the second NN. Depending on the chosen method, it also contains the vector of proportions or the halving steps considered.

References

Facco E, D'Errico M, Rodriguez A, Laio A (2017). "Estimating the intrinsic dimension of datasets by a minimal neighborhood information." Scientific Reports, 7(1). ISSN 20452322, doi:10.1038/s41598-017-11873-y.

Denti F, Doimo D, Laio A, Mira A (2022). "The generalized ratios intrinsic dimension estimator." Scientific Reports, 12(20005). ISSN 20452322, doi:10.1038/s41598-022-20991-1.

See Also

twonn

Examples

X <- replicate(4, rnorm(1000))
twonn_decimation(X, method = "proportions",
                 proportions = c(1, .5, .2, .1, .01))
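
The halving-step variant can be run in the same way. The sketch below uses only the arguments listed under Usage; it assumes the intRinsic package is installed and skips the calls otherwise. The values steps = 3 and seed = 1 are arbitrary choices made for illustration.

```r
set.seed(1)
X <- replicate(4, rnorm(1000))

# Assumes intRinsic is installed; the block is skipped otherwise.
if (requireNamespace("intRinsic", quietly = TRUE)) {
  # Halve the dataset three times, fixing the subsampling seed
  dec <- intRinsic::twonn_decimation(X, method = "steps", steps = 3, seed = 1)
  print(dec)
  # x-axis reports the number of halving steps; confidence intervals are drawn
  plot(dec, CI = TRUE, steps = TRUE)
}
```

Setting steps = FALSE in the plot call reports the log10 average distance on the x-axis instead.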


[Package intRinsic version 1.0.2 Index]