complexity_analysis {shattering}    R Documentation

Produce a PDF report analyzing the lower and upper shattering coefficient functions

Description

Performs a full analysis of the lower and upper shattering coefficient functions for a given supervised dataset and writes the results to a PDF report.

Usage

complexity_analysis(
  X = NULL,
  Y = NULL,
  my.delta = 0.05,
  my.epsilon = 0.05,
  directory = tempdir(),
  file = "myreport",
  length = 10,
  quantile.percentage = 0.5,
  epsilon = 1e-07
)

Arguments

X

matrix defining the input space of your dataset

Y

numerical vector defining the output space (labels/classes) of your dataset

my.delta

upper bound on the probability that the empirical risk diverges from the expected risk by more than my.epsilon, following the empirical risk minimization principle (in range (0,1))

my.epsilon

acceptable divergence between the empirical and expected risks (in range (0,1))

directory

directory used to generate the report for your dataset

file

name of the PDF file to be generated (without extension)

length

number of points into which the sample is divided when computing the shattering coefficient

quantile.percentage

real number to define the quantile of distances to be considered (e.g. 0.1 means 10%)

epsilon

a small real threshold subtracted from the distances so that open balls can be measured in the underlying topology
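
Details

The roles of quantile.percentage and epsilon can be pictured with a small base-R sketch. This is an illustration of the idea described in the argument list, not the package's internal code: the choice of Euclidean distance, the use of distinct pairs only, and the variable names cutoff, radius and neighbors are all assumptions made for the example.

```r
# Illustrative only: NOT the internal implementation of the shattering package.
X <- as.matrix(iris[, 1:4])      # same input space as in the Examples section

quantile.percentage <- 0.5       # documented default
epsilon <- 1e-07                 # documented default

# Pairwise Euclidean distances between all rows of X (assumed metric)
d <- as.matrix(dist(X))

# Distance below which quantile.percentage of the pairwise distances fall
# (assumption: the quantile is taken over distinct pairs of points)
cutoff <- quantile(d[upper.tri(d)], probs = quantile.percentage, names = FALSE)

# Subtract epsilon so that "d <= radius" approximates the open ball "d < cutoff"
radius <- cutoff - epsilon

# Number of other points inside the ball centered at each point
# (subtract 1 to exclude the center itself, whose distance is 0)
neighbors <- rowSums(d <= radius) - 1
summary(neighbors)
```

A smaller quantile.percentage gives a smaller radius and therefore fewer neighbors per ball in this sketch.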

Value

A list including the number of hyperplanes and the shattering coefficient function. A report is generated in the user-defined directory.
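
To show how a shattering coefficient function relates to my.delta and my.epsilon, the sketch below evaluates the standard generalization bound from statistical learning theory, 2 * N(2n) * exp(-n * eps^2 / 4), and searches for the smallest sample size n at which it drops to delta or below. The function shatter is a made-up polynomial placeholder, not output of complexity_analysis, and the helper names generalization_bound and min_sample_size are hypothetical. Whether the package applies exactly this form of the bound is an assumption.

```r
# Illustrative only: shatter() is a hypothetical placeholder, not package output.
shatter <- function(n) n^2

# Right-hand side of the bound: 2 * N(2n) * exp(-n * eps^2 / 4)
generalization_bound <- function(n, eps, shatter) {
  2 * shatter(2 * n) * exp(-n * eps^2 / 4)
}

# Smallest n whose bound is <= delta. Assumes the bound exceeds delta for
# small n and eventually decreases, which holds for polynomial shatter().
min_sample_size <- function(delta, eps, shatter) {
  hi <- 1
  # Double hi until the bound is satisfied
  while (generalization_bound(hi, eps, shatter) > delta) hi <- hi * 2
  lo <- hi %/% 2   # the bound still exceeds delta here (or lo is 0)
  # Bisect between lo (bound violated) and hi (bound satisfied)
  while (hi - lo > 1) {
    mid <- (lo + hi) %/% 2
    if (generalization_bound(mid, eps, shatter) > delta) lo <- mid else hi <- mid
  }
  hi
}

n_min <- min_sample_size(delta = 0.05, eps = 0.05, shatter = shatter)
n_min
```

A faster-growing shattering coefficient function pushes n_min up, which is why the lower and upper functions reported by the analysis bracket the sample size needed for a given delta and epsilon.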

References

de Mello, R.F. (2019) "On the Shattering Coefficient of Supervised Learning Algorithms". arXiv:1911.05461, https://arxiv.org/abs/1911.05461

de Mello, R.F., Ponti, M.A. (2018) "Machine Learning: A Practical Approach on the Statistical Learning Theory". ISBN: 978-3319949888

Examples


# Analyzing the complexity of the shattering coefficient functions
# (lower and upper bounds) for the Iris dataset
# require(datasets)
# complexity_analysis(X=as.matrix(iris[,1:4]), Y=as.numeric(iris[,5]))

[Package shattering version 1.0.7 Index]