twonn {intRinsic} | R Documentation
TWO-NN estimator
Description
The function fits the two-nearest neighbor (TWO-NN) estimator within the
maximum likelihood and the Bayesian frameworks. Estimates can also be
obtained via least squares, depending on the specification of the
argument method. The model was originally presented in Facco et al., 2017.
See also Denti et al., 2022 for more details.
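To make the estimation routes more concrete, the following base R sketch works on simulated ratios instead of calling twonn(). It assumes the model of Facco et al. (2017), under which the ratios of second to first NN distances follow a Pareto(1, d) distribution. The names d_true, d_mle, d_unb and d_ls are illustrative only, and the internals of twonn() may differ in details such as trimming.

```r
set.seed(1)
n <- 2000
d_true <- 3

# Pareto(1, d) draws by inversion: if U ~ Unif(0, 1), U^(-1/d) has
# survival function mu^(-d) for mu >= 1
mu <- runif(n)^(-1 / d_true)

# Maximum likelihood: the log-likelihood n*log(d) - (d + 1)*sum(log(mu))
# is maximised at n / sum(log(mu)); using n - 1 removes the bias
s <- sum(log(mu))
d_mle <- n / s
d_unb <- (n - 1) / s

# Least squares: under the model, -log(1 - F(mu)) = d * log(mu), so the
# slope of a regression through the origin estimates d.
# The largest ratio is dropped because its empirical CDF value is 1.
keep <- seq_len(n - 1)
x <- log(sort(mu))[keep]
y <- -log(1 - keep / n)
fit <- lm(y ~ x - 1)
d_ls <- unname(coef(fit))

c(mle = d_mle, unbiased = d_unb, least_squares = d_ls)
```

All three values should land close to d_true; the least squares route is usually the more variable of the two because the largest ratios carry a lot of leverage.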
Usage
twonn(
X = NULL,
dist_mat = NULL,
mus = NULL,
method = c("mle", "linfit", "bayes"),
alpha = 0.95,
c_trimmed = 0.01,
unbiased = TRUE,
a_d = 0.001,
b_d = 0.001,
...
)
## S3 method for class 'twonn_bayes'
print(x, ...)
## S3 method for class 'twonn_bayes'
summary(object, ...)
## S3 method for class 'summary.twonn_bayes'
print(x, ...)
## S3 method for class 'twonn_bayes'
plot(x, plot_low = 0.001, plot_upp = NULL, by = 0.05, ...)
## S3 method for class 'twonn_linfit'
print(x, ...)
## S3 method for class 'twonn_linfit'
summary(object, ...)
## S3 method for class 'summary.twonn_linfit'
print(x, ...)
## S3 method for class 'twonn_linfit'
plot(x, ...)
## S3 method for class 'twonn_mle'
print(x, ...)
## S3 method for class 'twonn_mle'
summary(object, ...)
## S3 method for class 'summary.twonn_mle'
print(x, ...)
## S3 method for class 'twonn_mle'
plot(x, ...)
Arguments
X | data matrix with
dist_mat | distance matrix computed between the
mus | vector of second to first NN distance ratios.
method | chosen estimation method. It can be "mle", "linfit", or "bayes".
alpha | the confidence level (for
c_trimmed | the proportion of trimmed observations.
unbiased | logical, applicable when
a_d | shape parameter of the Gamma prior on the parameter d.
b_d | rate parameter of the Gamma prior on the parameter d.
... | ignored.
x | object of class
object | object of class
plot_low | lower bound of the interval on which the posterior density is plotted.
plot_upp | upper bound of the interval on which the posterior density is plotted.
by | step-size at which the sequence spanning the interval is incremented.
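The relation between the three input arguments can be illustrated in base R. The sketch below builds a distance matrix from a data matrix and derives the vector of second to first NN distance ratios that the mus argument describes. It assumes Euclidean distances and no duplicated rows; how twonn() itself treats ties or duplicates is not covered here, and the object names D, nn and mus_manual are hypothetical.

```r
set.seed(2)
X <- replicate(2, rnorm(500))   # 500 observations, 2 columns

# Pairwise Euclidean distances as a full square matrix
D <- as.matrix(dist(X))
diag(D) <- Inf                  # a point is not its own neighbor

# For every observation keep the two smallest distances (first and second NN)
nn <- t(apply(D, 1, function(r) sort(r)[1:2]))

# Ratio of second to first NN distance, one value per observation
mus_manual <- nn[, 2] / nn[, 1]

summary(mus_manual)
```

Every ratio is at least 1 by construction, since the second NN is never closer than the first.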
Value
list characterized by a class type that depends on the method chosen.
Regardless of the method, the output list always contains the object est,
which provides the estimated intrinsic dimension along with uncertainty
quantification. The remaining objects vary with the estimation method.
In particular, if
method = "mle"
the output reports the MLE and the corresponding confidence interval;
method = "linfit"
the output includes the lm() object used for the computation;
method = "bayes"
the output contains the (1 + alpha) / 2 and (1 - alpha) / 2 quantiles, mean, mode, and median of the posterior distribution of d.
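The posterior summaries listed for method = "bayes" can be reproduced by hand under one assumption: that the Pareto(1, d) likelihood of the ratios is combined with the Gamma(a_d, b_d) prior through the standard conjugate update, which gives a Gamma(a_d + n, b_d + sum(log(mu))) posterior. The sketch below uses simulated ratios and base R only; it is not taken from the package source, and the names shape, rate and post are illustrative.

```r
set.seed(3)
n <- 1000
mu <- runif(n)^(-1 / 2)   # simulated Pareto(1, 2) ratios, so the true d is 2

# Defaults taken from the Usage section
a_d <- 0.001
b_d <- 0.001
alpha <- 0.95

# Conjugate update (assumed): Gamma prior + Pareto likelihood
shape <- a_d + n
rate <- b_d + sum(log(mu))

post <- c(
  lower  = qgamma((1 - alpha) / 2, shape = shape, rate = rate),
  mode   = (shape - 1) / rate,   # valid because shape > 1
  median = qgamma(0.5, shape = shape, rate = rate),
  mean   = shape / rate,
  upper  = qgamma((1 + alpha) / 2, shape = shape, rate = rate)
)
post
```

With the nearly flat default prior the posterior mean is almost identical to the maximum likelihood estimate n / sum(log(mu)).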
References
Facco E, D'Errico M, Rodriguez A, Laio A (2017). "Estimating the intrinsic dimension of datasets by a minimal neighborhood information." Scientific Reports, 7(1). ISSN 20452322, doi:10.1038/s41598-017-11873-y.
Denti F, Doimo D, Laio A, Mira A (2022). "The generalized ratios intrinsic dimension estimator." Scientific Reports, 12(20005). ISSN 20452322, doi:10.1038/s41598-022-20991-1.
Examples
# dataset with 1000 observations and id = 2
X <- replicate(2, rnorm(1000))
twonn(X)
# dataset with 1000 observations and id = 3
Y <- replicate(3, runif(1000))
# Bayesian and least squares estimate from distance matrix
dm <- as.matrix(dist(Y, method = "manhattan"))
twonn(dist_mat = dm, method = "bayes")
twonn(dist_mat = dm, method = "linfit")