| gendata {ider} | R Documentation |
Data generator for intrinsic dimension estimation.
Description
gendata generates various artificial datasets for intrinsic dimension estimation experiments.
Usage
gendata(
DataName = "SwissRoll",
n = 300,
p = NULL,
noise = NULL,
ol = NULL,
curv = 1,
seed = 123,
sorted = FALSE
)
Arguments
DataName |
Name of dataset, one of the following:
|
n |
number of data points to be generated. |
p |
ambient dimension of the dataset. |
noise |
parameter to control noise level in the dataset. In many cases,
it is used for |
ol |
proportion of outliers, i.e., n * ol outliers are added to the generated dataset. |
curv |
a parameter to control the complexity of the embedded manifold. |
seed |
random number seed. |
sorted |
logical. If |
Details
This function generates various artificial datasets often used in
manifold learning and dimension estimation research.
For some datasets, the complexity of the shape is controlled by the parameter curv.
The parameters noise and ol are used to add noise and/or
outliers to the dataset.
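To make the roles of noise and ol concrete, the following base R sketch builds a Swiss roll and then perturbs it. It is an illustration only and is not the code used by gendata: the helper name sketch_swiss_roll is hypothetical, and the sketch assumes that noise acts as the standard deviation of additive Gaussian noise and that ol is a proportion yielding round(n * ol) extra points drawn uniformly from the bounding box of the data.

```r
# Hypothetical helper, not part of ider: a Swiss roll in 3 ambient
# dimensions with optional Gaussian noise and uniform outliers.
sketch_swiss_roll <- function(n = 300, noise = NULL, ol = NULL, seed = 123) {
  set.seed(seed)
  # Two latent coordinates, so the intrinsic dimension is 2.
  t <- 1.5 * pi * (1 + 2 * runif(n))
  h <- 21 * runif(n)
  x <- cbind(t * cos(t), h, t * sin(t))

  # Assumption: noise is the standard deviation of additive Gaussian noise.
  if (!is.null(noise)) {
    x <- x + matrix(rnorm(n * 3, sd = noise), nrow = n, ncol = 3)
  }

  # Assumption: ol is a proportion, so round(n * ol) outliers are appended,
  # drawn uniformly from the bounding box of the current data.
  if (!is.null(ol)) {
    m <- round(n * ol)
    if (m > 0) {
      rng <- apply(x, 2, range)  # 2 x 3 matrix: row 1 = min, row 2 = max
      out <- sapply(1:3, function(j) runif(m, rng[1, j], rng[2, j]))
      x <- rbind(x, matrix(out, nrow = m, ncol = 3))
    }
  }
  x
}

x_clean <- sketch_swiss_roll(n = 300)
x_dirty <- sketch_swiss_roll(n = 300, noise = 0.1, ol = 0.05)
dim(x_clean)
dim(x_dirty)
```

In this sketch the outliers are appended as extra rows, so the perturbed matrix has more than n rows. Whether gendata itself appends or replaces points is not stated here and should be checked against the package.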
Value
Data matrix. For the ldbl dataset, the output is a list composed of
x, the data matrix, and tDim, the true intrinsic dimension of each point.
Author(s)
Hideitsu Hino hideitsu.hino@gmail.com
Examples
## global intrinsic dimension estimate
x <- gendata(DataName = 'SwissRoll')
estmle <- lbmle(x = x, k1 = 3, k2 = 5)
print(estmle)
## local intrinsic dimension estimate
tmp <- gendata(DataName = 'ldbl', n = 1000)
x <- tmp$x
estmada <- mada(x = x, local = TRUE)
head(estmada)   ## estimated local intrinsic dimensions
head(tmp$tDim)  ## true local intrinsic dimensions