gendata {ider} | R Documentation |
Data generator for intrinsic dimension estimation.
Description
gendata generates various artificial datasets for intrinsic dimension estimation experiments.
Usage
gendata(
DataName = "SwissRoll",
n = 300,
p = NULL,
noise = NULL,
ol = NULL,
curv = 1,
seed = 123,
sorted = FALSE
)
Arguments
DataName | Name of dataset, one of the following: |
n | number of data points to be generated. |
p | ambient dimension of the dataset. |
noise | parameter to control noise level in the dataset. In many cases, it is used for |
ol | fraction of outliers, i.e., n * ol outliers are added to the generated dataset. |
curv | a parameter to control the complexity of the embedded manifold. |
seed | random number seed. |
sorted | logical. If |
Details
This function generates various artificial datasets often used in
manifold learning and dimension estimation research.
For some datasets, the complexity of the shape is controlled by the parameter curv.
The parameters noise and ol are used to add noise and/or outliers to the dataset.
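As a rough illustration of how these parameters fit together, the sketch below passes noise, ol and curv to gendata. The specific values (noise = 0.05, ol = 0.05) are assumptions chosen for illustration, not documented defaults, and ol is read as a fraction of n because n * ol outliers are added. The call is guarded so the snippet still runs when the ider package is not installed.

```r
# Illustrative settings; the noise and ol values are assumptions, not defaults.
n  <- 300
ol <- 0.05                      # read as a fraction of n
expected_outliers <- n * ol     # number of outliers under that reading

# Call gendata only when the ider package is available.
if (requireNamespace("ider", quietly = TRUE)) {
  x <- ider::gendata(DataName = "SwissRoll", n = n,
                     noise = 0.05, ol = ol, curv = 1, seed = 123)
  print(dim(x))  # inspect the size of the returned data matrix
}
```

Fixing seed keeps the generated data reproducible, so runs that differ only in noise, ol or curv can be compared directly.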
Value
Data matrix. For the ldbl dataset, it outputs a list composed of
x: data matrix and tDim: true intrinsic dimension for each point.
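Because the return type depends on the dataset, code that loops over several datasets may want to normalise it first. The helper below is a minimal sketch and is hypothetical (get_data_matrix is not part of ider). The two test objects are made-up stand-ins for the return shapes described above, not actual gendata output.

```r
# Hypothetical helper, not part of ider: return the data matrix
# whether the result is a plain matrix or a list with an x component.
get_data_matrix <- function(res) {
  if (is.list(res) && !is.null(res$x)) res$x else res
}

# Stand-ins for the two return shapes (made-up values).
m   <- matrix(rnorm(12), nrow = 4)
lst <- list(x = m, tDim = c(1, 1, 2, 2))

dim(get_data_matrix(m))    # matrix input is returned unchanged
dim(get_data_matrix(lst))  # list input yields its x component
```

With a helper like this, downstream estimators can be given a matrix in every case, while tDim is still read from the original list when it is present.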
Author(s)
Hideitsu Hino hideitsu.hino@gmail.com
Examples
## global intrinsic dimension estimate
x <- gendata(DataName='SwissRoll')
estmle <- lbmle(x=x,k1=3,k2=5)
print(estmle)
## local intrinsic dimension estimate
tmp <- gendata(DataName='ldbl',n=1000)
x <- tmp$x
estmada <- mada(x=x,local=TRUE)
head(estmada) ## estimated local intrinsic dimensions
head(tmp$tDim) ## true local intrinsic dimensions