| twonn {intRinsic} | R Documentation |
TWO-NN estimator
Description
Fits the two-nearest-neighbor (TWO-NN) estimator of the intrinsic dimension
by maximum likelihood, by least squares, or within the Bayesian framework,
depending on the specification of the argument method. The model was
originally presented in Facco et al. (2017). See also Denti et al. (2022)
for more details.
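As an illustration of the idea behind the estimator, the following base-R sketch computes the ratio of second to first nearest-neighbor distances for each point and applies the closed-form maximum likelihood estimate from Facco et al. (2017), under which the ratios follow a Pareto(1, d) distribution. It is a sketch under that assumption, not the package's implementation; the names mus_sketch, d_mle, d_unbiased and ci are hypothetical, and the chi-square interval is one standard choice that may differ from what twonn() reports.

```r
# Base-R sketch of the TWO-NN maximum likelihood estimate (not the package's code).
set.seed(1)
X <- replicate(2, rnorm(1000))   # 1000 observations, intrinsic dimension 2

D <- as.matrix(dist(X))          # Euclidean distance matrix
diag(D) <- Inf                   # exclude each point's distance to itself

# For every point keep its two smallest distances (r1 <= r2) and form mu = r2 / r1.
two_smallest <- t(apply(D, 1, function(r) sort(r)[1:2]))
mus_sketch <- two_smallest[, 2] / two_smallest[, 1]

n <- length(mus_sketch)
s <- sum(log(mus_sketch))        # log(mu) is Exponential(d) under the model

d_mle      <- n / s              # plain maximum likelihood estimate
d_unbiased <- (n - 1) / s        # bias-corrected version

# Assumed interval: since s ~ Gamma(n, rate = d), 2 * d * s is chi-square(2n).
alpha <- 0.95
ci <- qchisq(c((1 - alpha) / 2, (1 + alpha) / 2), df = 2 * n) / (2 * s)

print(c(mle = d_mle, unbiased = d_unbiased, lower = ci[1], upper = ci[2]))
```

With Gaussian data in two dimensions, both estimates should land close to 2.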
Usage
twonn(
X = NULL,
dist_mat = NULL,
mus = NULL,
method = c("mle", "linfit", "bayes"),
alpha = 0.95,
c_trimmed = 0.01,
unbiased = TRUE,
a_d = 0.001,
b_d = 0.001,
...
)
## S3 method for class 'twonn_bayes'
print(x, ...)
## S3 method for class 'twonn_bayes'
summary(object, ...)
## S3 method for class 'summary.twonn_bayes'
print(x, ...)
## S3 method for class 'twonn_bayes'
plot(x, plot_low = 0.001, plot_upp = NULL, by = 0.05, ...)
## S3 method for class 'twonn_linfit'
print(x, ...)
## S3 method for class 'twonn_linfit'
summary(object, ...)
## S3 method for class 'summary.twonn_linfit'
print(x, ...)
## S3 method for class 'twonn_linfit'
plot(x, ...)
## S3 method for class 'twonn_mle'
print(x, ...)
## S3 method for class 'twonn_mle'
summary(object, ...)
## S3 method for class 'summary.twonn_mle'
print(x, ...)
## S3 method for class 'twonn_mle'
plot(x, ...)
Arguments
X |
data matrix with |
dist_mat |
distance matrix computed between the |
mus |
vector of second to first NN distance ratios. |
method |
chosen estimation method. It can be
|
alpha |
the confidence level (for |
c_trimmed |
the proportion of trimmed observations. |
unbiased |
logical, applicable when |
a_d |
shape parameter of the Gamma prior on the parameter |
b_d |
rate parameter of the Gamma prior on the parameter |
... |
ignored. |
x |
object of class |
object |
object of class |
plot_low |
lower bound of the interval on which the posterior density is plotted. |
plot_upp |
upper bound of the interval on which the posterior density is plotted. |
by |
step-size at which the sequence spanning the interval is incremented. |
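To illustrate how trimming interacts with a least squares fit, the sketch below follows the linear-fit recipe described in Facco et al. (2017): sort the ratios, drop the largest ones, and regress -log(1 - F(mu)) on log(mu) through the origin, where F is the empirical distribution function. The ratios are simulated directly from a Pareto(1, 3) distribution so the block runs without the package. The names mus_sim, c_trim, n_keep and d_linfit are hypothetical, and the exact trimming rule used by twonn() is an assumption here.

```r
# Base-R sketch of a trimmed least squares TWO-NN fit (not the package's code).
set.seed(1)
n <- 1000
mus_sim <- exp(rexp(n, rate = 3))   # simulated ratios: log(mu) ~ Exponential(d = 3)

c_trim <- 0.01                      # proportion of largest ratios to discard
n_keep <- n - floor(n * c_trim)     # number of ratios kept after trimming

mu_sorted <- sort(mus_sim)
idx <- seq_len(n_keep)
x <- log(mu_sorted[idx])            # log of the sorted ratios
y <- -log(1 - idx / n)              # transformed empirical CDF, finite after trimming

fit <- lm(y ~ x - 1)                # regression through the origin; slope estimates d
d_linfit <- unname(coef(fit))

print(d_linfit)
print(confint(fit, level = 0.95))
```

Trimming matters because the largest ratio maps to -log(0), which is infinite, and the upper tail of the ratios is the part most affected by noise.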
Value
list characterized by a class type that depends on the method
chosen. Regardless of the method, the output list always contains the
object est, which provides the estimated intrinsic dimension along
with uncertainty quantification. The remaining objects vary with the
estimation method. In particular, if
method = "mle", the output reports the MLE and the corresponding confidence interval;
method = "linfit", the output includes the lm() object used for the computation;
method = "bayes", the output contains the (1 + alpha) / 2 and (1 - alpha) / 2 quantiles, mean, mode, and median of the posterior distribution of d.
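The posterior summaries listed for method = "bayes" can be reproduced by hand under one assumption: a Gamma(a_d, b_d) prior on d combined with the Pareto(1, d) likelihood of the ratios is conjugate, giving a Gamma(a_d + n, b_d + sum(log(mu))) posterior. The sketch below computes the summaries from that posterior on simulated ratios. Whether twonn() uses exactly this parameterization is not confirmed by this page, and the names mus_sim, shape_post, rate_post and cred_int are hypothetical.

```r
# Base-R sketch of conjugate posterior summaries for d (not the package's code).
set.seed(1)
n <- 1000
mus_sim <- exp(rexp(n, rate = 3))   # simulated ratios: log(mu) ~ Exponential(d = 3)

a_d <- 0.001                        # prior shape
b_d <- 0.001                        # prior rate
alpha <- 0.95                       # credibility level

# Assumed conjugate update: Gamma prior times Pareto likelihood gives a Gamma posterior.
shape_post <- a_d + n
rate_post  <- b_d + sum(log(mus_sim))

post_mean   <- shape_post / rate_post
post_mode   <- (shape_post - 1) / rate_post
post_median <- qgamma(0.5, shape = shape_post, rate = rate_post)
cred_int    <- qgamma(c((1 - alpha) / 2, (1 + alpha) / 2),
                      shape = shape_post, rate = rate_post)

print(c(lower = cred_int[1], mode = post_mode, median = post_median,
        mean = post_mean, upper = cred_int[2]))
```

With the default a_d = b_d = 0.001 the prior carries almost no information, so the posterior mean is nearly identical to the maximum likelihood estimate n / sum(log(mu)).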
References
Facco E, D'Errico M, Rodriguez A, Laio A (2017). "Estimating the intrinsic dimension of datasets by a minimal neighborhood information." Scientific Reports, 7(1). ISSN 20452322, doi:10.1038/s41598-017-11873-y.
Denti F, Doimo D, Laio A, Mira A (2022). "The generalized ratios intrinsic dimension estimator." Scientific Reports, 12(20005). ISSN 20452322, doi:10.1038/s41598-022-20991-1.
Examples
# dataset with 1000 observations and id = 2
X <- replicate(2, rnorm(1000))
twonn(X)
# dataset with 1000 observations and id = 3
Y <- replicate(3, runif(1000))
# Bayesian and least squares estimates from a distance matrix
dm <- as.matrix(dist(Y, method = "manhattan"))
twonn(dist_mat = dm, method = "bayes")
twonn(dist_mat = dm, method = "linfit")