complexity_analysis {shattering} | R Documentation |
Produce a PDF report analyzing the lower and upper shattering coefficient functions
Description
Performs a full analysis of the lower and upper shattering coefficient functions for a given supervised dataset and writes the results to a PDF report.
Usage
complexity_analysis(
X = NULL,
Y = NULL,
my.delta = 0.05,
my.epsilon = 0.05,
directory = tempdir(),
file = "myreport",
length = 10,
quantile.percentage = 0.5,
epsilon = 1e-07
)
Arguments
X |
matrix defining the input space of your dataset |
Y |
numeric vector defining the output space (labels/classes) of your dataset |
my.delta |
upper bound on the probability that the empirical risk diverges from the expected risk by more than my.epsilon, as in the empirical risk minimization principle (in range (0,1)) |
my.epsilon |
acceptable divergence between the empirical risk and the expected risk (in range (0,1)) |
directory |
directory in which the report for your dataset is generated |
file |
name of the PDF file to be generated (without extension) |
length |
number of points to divide the sample while computing the shattering coefficient |
quantile.percentage |
real number defining the quantile of distances to be considered (e.g., 0.1 means the 10% quantile) |
epsilon |
a small real threshold subtracted from distances in order to measure the open balls in the underlying topology |
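The arguments quantile.percentage and epsilon both act on distances between points of X. The following is a minimal sketch in base R of what a distance quantile and an epsilon-reduced radius look like for the Iris data. It is an illustration only: the use of Euclidean distance and the names d, radius, and open_radius are assumptions made here, not the package's internal implementation.

```r
# Illustration only: base R, not the internal code of the shattering package.
X <- as.matrix(iris[, 1:4])

# All pairwise Euclidean distances between rows of X (assumed metric).
d <- as.vector(dist(X))

quantile.percentage <- 0.5   # same default as complexity_analysis()
epsilon <- 1e-07             # same default as complexity_analysis()

# Distance value at the requested quantile.
radius <- quantile(d, probs = quantile.percentage, names = FALSE)

# Subtracting a small threshold keeps points at exactly `radius` outside
# the ball, which is what makes the ball open rather than closed.
open_radius <- radius - epsilon

# Fraction of pairwise distances that fall strictly inside the open ball.
mean(d < open_radius)
```

Lower values of quantile.percentage give smaller radii, so fewer pairs of points fall inside the same ball.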
Value
A list containing the number of hyperplanes and the shattering coefficient function. As a side effect, a PDF report is written to the user-defined directory.
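To see how a shattering coefficient function relates to my.delta and my.epsilon, the sketch below evaluates the generalization bound in its commonly stated form, 2 * N(2n) * exp(-n * epsilon^2 / 4), and searches for a sample size at which the bound drops below my.delta. The function shattering_coefficient is hypothetical and chosen only for illustration (a quadratic in n); it is not the function returned by complexity_analysis, and the exact form of the bound used in the report is an assumption here.

```r
# Hypothetical shattering coefficient function, for illustration only.
shattering_coefficient <- function(n) n^2

# Generalization bound in its commonly stated form (assumed here):
#   P(sup |R_emp(f) - R(f)| > epsilon) <= 2 * N(2n) * exp(-n * epsilon^2 / 4)
generalization_bound <- function(n, epsilon) {
  2 * shattering_coefficient(2 * n) * exp(-n * epsilon^2 / 4)
}

my.delta <- 0.05     # tolerated probability of a large divergence
my.epsilon <- 0.05   # tolerated divergence between the two risks

# Coarse search: double n until the bound is no larger than my.delta.
n <- 1
while (generalization_bound(n, my.epsilon) > my.delta) {
  n <- n * 2
}
n
```

The doubling search only brackets the required sample size; a finer search between n / 2 and n would tighten the estimate.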
References
de Mello, R.F. (2019) "On the Shattering Coefficient of Supervised Learning Algorithms", arXiv:1911.05461, https://arxiv.org/abs/1911.05461
de Mello, R.F., Ponti, M.A. (2018) "Machine Learning: A Practical Approach on the Statistical Learning Theory", ISBN: 978-3319949888
Examples
# Analyzing the complexity of the shattering coefficient functions
# (lower and upper bounds) for the Iris dataset
# require(datasets)
# complexity_analysis(X=as.matrix(iris[,1:4]), Y=as.numeric(iris[,5]))
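The same call can be written with every argument spelled out, which makes the defaults from the Usage section explicit. This is a minimal sketch: the run_report flag is hypothetical and added here only so that the script does nothing expensive unless it is switched on, the file name iris_report is arbitrary, and the location of the resulting PDF is an assumption based on the directory and file arguments.

```r
# Inputs: the four numeric Iris measurements and the species as numeric codes.
X <- as.matrix(iris[, 1:4])
Y <- as.numeric(iris[, 5])

# Hypothetical switch (not part of the package): set to TRUE to build the report.
run_report <- FALSE

if (run_report && requireNamespace("shattering", quietly = TRUE)) {
  result <- shattering::complexity_analysis(
    X = X,
    Y = Y,
    my.delta = 0.05,
    my.epsilon = 0.05,
    directory = tempdir(),
    file = "iris_report",        # arbitrary name, given without extension
    length = 10,
    quantile.percentage = 0.5,
    epsilon = 1e-07
  )

  # Inspect the returned list without assuming its element names.
  str(result, max.level = 1)

  # Assumed location of the generated report.
  print(file.path(tempdir(), "iris_report.pdf"))
}
```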