R: Produce a PDF report analyzing the lower and upper shattering...

complexity_analysis {shattering}

R Documentation

Produce a PDF report analyzing the lower and upper shattering coefficient functions

Description

Runs a full analysis of the lower and upper shattering coefficient functions for a given supervised dataset and generates a PDF report with the results.

Usage

complexity_analysis(
  X = NULL,
  Y = NULL,
  my.delta = 0.05,
  my.epsilon = 0.05,
  directory = tempdir(),
  file = "myreport",
  length = 10,
  quantile.percentage = 0.5,
  epsilon = 1e-07
)

Arguments

`X`	matrix defining the input space of your dataset
`Y`	numerical vector defining the output space (labels/classes) of your dataset
`my.delta`	upper bound on the probability that the empirical risk diverges from the expected risk by more than `my.epsilon`, following the empirical risk minimization principle (in the range (0,1))
`my.epsilon`	acceptable divergence between the empirical and expected risks (in the range (0,1))
`directory`	directory used to generate the report for your dataset
`file`	name of the PDF file to be generated (without extension)
`length`	number of points into which the sample is divided when computing the shattering coefficient
`quantile.percentage`	real number defining the quantile of distances to be considered (e.g. 0.1 means 10%)
`epsilon`	small real threshold subtracted from distances in order to measure the open balls in the underlying topology
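
To make the roles of `my.delta` and `my.epsilon` concrete, the following minimal sketch evaluates the standard generalization bound from statistical learning theory, P(sup |R_emp(f) - R(f)| > epsilon) <= 2 N(F, 2n) exp(-n epsilon^2 / 4), and searches for a sample size at which the bound falls to `my.delta` or below. The function names and the polynomial shattering coefficient `n^2` are hypothetical and chosen only for illustration; whether the package evaluates exactly this expression is an assumption.

```r
# Hypothetical shattering coefficient function N(F, n); a quadratic polynomial
# is assumed here purely for illustration, it is not produced by the package.
shattering_coefficient <- function(n) n^2

# Right-hand side of the bound: 2 * N(F, 2n) * exp(-n * epsilon^2 / 4)
generalization_bound <- function(n, epsilon, shatter = shattering_coefficient) {
  2 * shatter(2 * n) * exp(-n * epsilon^2 / 4)
}

my.delta <- 0.05    # tolerated probability of a large divergence
my.epsilon <- 0.05  # tolerated divergence between empirical and expected risk

# Scan a grid of sample sizes and keep the first one that satisfies the bound
candidate_n <- seq(1000, 200000, by = 1000)
bounds <- generalization_bound(candidate_n, my.epsilon)
required_n <- candidate_n[which(bounds <= my.delta)[1]]

print(required_n)
```

Tightening either `my.delta` or `my.epsilon` raises the sample size at which the bound is met, and a faster-growing shattering coefficient function raises it further.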

Value

A list including the number of hyperplanes and the shattering coefficient function. A PDF report is also generated in the user-defined directory.

References

de Mello, R.F. (2019) "On the Shattering Coefficient of Supervised Learning Algorithms" arXiv:1911.05461, https://arxiv.org/abs/1911.05461

de Mello, R.F., Ponti, M.A. (2018) "Machine Learning: A Practical Approach on the Statistical Learning Theory" ISBN: 978-3319949888

Examples


# Analyzing the complexity of the shattering coefficient functions
# (lower and upper bounds) for the Iris dataset
# require(datasets)
# complexity_analysis(X=as.matrix(iris[,1:4]), Y=as.numeric(iris[,5]))
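
The Iris call above can also be written with every argument from the Usage section spelled out. The sketch below is a hedged variant: the `interactive()` and `requireNamespace()` guard and the report name `"iris_report"` are choices made here for illustration, and the structure of the returned list is inspected with `str()` rather than assumed.

```r
# Build the inputs as in the Iris example: a numeric matrix and numeric labels
X <- as.matrix(iris[, 1:4])   # 150 x 4 matrix defining the input space
Y <- as.numeric(iris[, 5])    # species factor converted to the codes 1, 2, 3

# Guarded call: runs only in an interactive session with the package installed,
# since the analysis writes a PDF report and may take a while.
if (interactive() && requireNamespace("shattering", quietly = TRUE)) {
  result <- shattering::complexity_analysis(
    X = X, Y = Y,
    my.delta = 0.05, my.epsilon = 0.05,
    directory = tempdir(), file = "iris_report",
    length = 10, quantile.percentage = 0.5, epsilon = 1e-07
  )
  str(result)  # inspect the returned list without assuming its field names
}
```

All values other than `file` are the documented defaults, so the call behaves like the shorter form while making each setting visible.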

[Package shattering version 1.0.7 Index]