gride {intRinsic}    R Documentation

Gride: the Generalized Ratios ID Estimator

Description

The function fits the Generalized Ratios ID estimator (Gride) under either the frequentist or the Bayesian framework, depending on the value of the argument method. The model is a direct extension of the TWO-NN method presented in Facco et al., 2017. See also Denti et al., 2022 for more details.

Usage

gride(
  X = NULL,
  dist_mat = NULL,
  mus_n1_n2 = NULL,
  method = c("mle", "bayes"),
  n1 = 1,
  n2 = 2,
  alpha = 0.95,
  nsim = 5000,
  upper_D = 50,
  burn_in = 2000,
  sigma = 0.5,
  start_d = NULL,
  a_d = 1,
  b_d = 1,
  ...
)

## S3 method for class 'gride_bayes'
print(x, ...)

## S3 method for class 'gride_bayes'
summary(object, ...)

## S3 method for class 'summary.gride_bayes'
print(x, ...)

## S3 method for class 'gride_bayes'
plot(x, ...)

## S3 method for class 'gride_mle'
print(x, ...)

## S3 method for class 'gride_mle'
summary(object, ...)

## S3 method for class 'summary.gride_mle'
print(x, ...)

## S3 method for class 'gride_mle'
plot(x, ...)

Arguments

X

data matrix with n observations and D variables.

dist_mat

distance matrix computed between the n observations.

mus_n1_n2

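vector of generalized-order NN distance ratios, i.e., for each observation, the ratio between the distances to its n2-th and n1-th nearest neighbors.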

method

the chosen estimation method. It can be:

"mle"

maximum likelihood estimation;

"bayes"

estimation with the Bayesian approach.

n1

order of the first NN considered. Default is 1.

n2

order of the second NN considered. Default is 2.

alpha

confidence level of the confidence interval (if method = "mle") or posterior probability of the credible interval (if method = "bayes").

nsim

number of bootstrap samples (if method = "mle") or posterior simulations (if method = "bayes") to consider.

upper_D

nominal dimension of the dataset (upper bound for the maximization routine).

burn_in

number of iterations to discard from the MCMC sample. Applicable if method = "bayes".

sigma

standard deviation of the Gaussian proposal used in the MH step. Applicable if method = "bayes".

start_d

initial value for the MCMC chain. If NULL, the MLE is used. Applicable if method = "bayes".

a_d

shape parameter of the Gamma prior distribution for d. Applicable if method = "bayes".

b_d

rate parameter of the Gamma prior distribution for d. Applicable if method = "bayes".

...

other arguments passed to specific methods.

x

object of class gride_mle. It is obtained using the output of the gride function when method = "mle".

object

object of class gride_mle, obtained from the function gride_mle().

Value

a list containing the ID estimate obtained with the Gride method, along with the corresponding confidence or credible interval (element est). The class of the output object depends on the chosen method. The remaining elements of the list report a summary of the key quantities involved in the estimation process, e.g., the NN orders n1 and n2.

References

Facco E, D'Errico M, Rodriguez A, Laio A (2017). "Estimating the intrinsic dimension of datasets by a minimal neighborhood information." Scientific Reports, 7(1). ISSN 20452322, doi:10.1038/s41598-017-11873-y.

Denti F, Doimo D, Laio A, Mira A (2022). "The generalized ratios intrinsic dimension estimator." Scientific Reports, 12(20005). ISSN 20452322, doi:10.1038/s41598-022-20991-1.

Examples


 X   <- replicate(2, rnorm(500))
 dm  <- as.matrix(dist(X, method = "manhattan"))
 res <- gride(X, nsim = 500)
 res
 plot(res)
 gride(dist_mat = dm, method = "bayes", upper_D = 10,
       nsim = 500, burn_in = 100)
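
To make the role of the NN distance ratios more concrete, the following base-R sketch computes, for each observation, the ratio between the distances to its n2-th and n1-th nearest neighbors from a distance matrix. The helper nn_ratios is hypothetical and not part of intRinsic; the package may compute these quantities differently. For n1 = 1 and n2 = 2 the ratios follow a Pareto distribution under the TWO-NN model of Facco et al., 2017, so the MLE of the ID has the closed form used at the end. The sketch assumes no tied distances.

```r
# Hypothetical helper (not part of intRinsic): ratio of the distances to the
# n2-th and n1-th nearest neighbors, one value per observation.
nn_ratios <- function(dist_mat, n1 = 1, n2 = 2) {
  stopifnot(n1 >= 1, n2 > n1, n2 < nrow(dist_mat))
  ratios <- apply(dist_mat, 1, function(d) {
    s <- sort(d)            # s[1] is the zero distance of a point to itself
    s[n2 + 1] / s[n1 + 1]   # hence the +1 shift in the indices
  })
  unname(ratios)
}

set.seed(1)
X   <- replicate(2, rnorm(500))
dm  <- as.matrix(dist(X))
mus <- nn_ratios(dm, n1 = 1, n2 = 2)

# TWO-NN special case (n1 = 1, n2 = 2): the ratios are modeled as Pareto
# with scale 1 and shape d, whose MLE is n / sum(log(mu)).
d_hat <- length(mus) / sum(log(mus))
d_hat
```

A vector such as mus is the kind of input described by the mus_n1_n2 argument. For general n1 and n2 no closed form of this type is assumed here, and the estimate would be obtained through gride itself.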


[Package intRinsic version 1.0.2 Index]