Kira-package {Kira} | R Documentation |
Machine learning and data mining.
Description
A machine learning package containing several algorithms, functions that plot the Receiver Operating Characteristic (ROC) and Precision-Recall (PRC) curves, and a function that returns several metrics used to evaluate models. The metrics function can also be applied to classification results produced by other packages.
Details
Package: | Kira |
Type: | Package |
Version: | 1.0.5 |
Date: | 2024-07-04 |
License: | GPL(>= 3) |
LazyLoad: | yes |
This package contains:
Algorithms for supervised classification: knn, linear (lda) and quadratic (qda) discriminant analysis, linear regression, etc.
Algorithms for unsupervised classification: hierarchical, kmeans, etc.
A function that plots the ROC and PRC curves.
A function that returns a series of metrics from models.
Functions that determine the ideal number of clusters: elbow and silhouette.
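The elbow and silhouette criteria listed above can be sketched with base R and the recommended cluster package. This is a minimal illustration under stated assumptions, not Kira's own interface: the variable names (X, ks, wss, avg_sil, best_k) and the choice of the iris data are hypothetical, and only stats::kmeans, stats::dist and cluster::silhouette are used.

```r
# Illustrative sketch only; these are not functions from the Kira package.
library(cluster)  # recommended package distributed with R; provides silhouette()

set.seed(1)
X  <- scale(iris[, 1:4])  # numeric columns only, standardized
ks <- 2:6                 # candidate numbers of clusters (assumed range)

# Elbow criterion: total within-cluster sum of squares for each k.
# The "elbow" is the k after which this curve stops dropping sharply.
wss <- sapply(ks, function(k) {
  kmeans(X, centers = k, nstart = 20)$tot.withinss
})

# Silhouette criterion: average silhouette width for each k.
# Larger values indicate better separated, more compact clusters.
d <- dist(X)
avg_sil <- sapply(ks, function(k) {
  cl <- kmeans(X, centers = k, nstart = 20)$cluster
  mean(silhouette(cl, d)[, "sil_width"])
})

# Number of clusters with the largest average silhouette width
best_k <- ks[which.max(avg_sil)]
```

Plotting wss against ks gives the usual elbow plot, while best_k is the value suggested by the silhouette criterion. The two criteria need not agree, which is one reason the package offers both.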
Author(s)
Paulo Cesar Ossani <ossanipc@hotmail.com>
References
Aha, D. W.; Kibler, D. and Albert, M. K. Instance-based learning algorithms. Machine learning. v.6, n.1, p.37-66. 1991.
Anitha, S.; Metilda, M. A. R. Y. An extensive investigation of outlier detection by cluster validation indices. Ciencia e Tecnica Vitivinicola - A Science and Technology Journal, v. 34, n. 2, p. 22-32, 2019. doi: 10.13140/RG.2.2.26801.63848
Charnet, R. et al. Analise de modelos de regressao linear, 2a ed. Campinas: Editora da Unicamp, 2008. 357 p.
Chicco, D.; Warrens, M. J. and Jurman, G. The Matthews correlation coefficient (MCC) is more informative than Cohen's Kappa and Brier score in binary classification assessment. IEEE Access, IEEE, v. 9, p. 78368-78381, 2021.
Schubert, E. Stop using the Elbow criterion for k-means and how to choose the number of clusters instead. ACM SIGKDD Explorations Newsletter. 25 (1): 36-42. arXiv:2212.12189. 2023. doi: 10.1145/3606274.3606278
Ferreira, D. F. Estatistica Multivariada. 2a ed. revisada e ampliada. Lavras: Editora UFLA, 2011. 676 p.
Kaufman, L. and Rousseeuw, P. J. Finding Groups in Data: An Introduction to Cluster Analysis, New York: John Wiley & Sons. 1990.
Kittler, J.; Hatef, M.; Duin, R. P. W. and Matas, J. On combining classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence. 20(3):226-239. 1998. doi: 10.1109/34.667881
Martinez, W. L.; Martinez, A. R.; Solka, J. Exploratory data analysis with MATLAB. 2nd ed. New York: Chapman & Hall/CRC, 2010. 499 p.
Mingoti, S. A. Analise de dados atraves de metodos de estatistica multivariada: uma abordagem aplicada. Belo Horizonte: UFMG, 2005. 297 p.
Nicoletti, M. do C. O modelo de aprendizado de maquina baseado em exemplares: principais caracteristicas e algoritmos. Sao Carlos: EdUFSCar, 2005. 61 p.
Onumanyi, A. J.; Molokomme, D. N.; Isaac, S. J. and Abu-Mahfouz, A. M. Autoelbow: An automatic elbow detection method for estimating the number of clusters in a dataset. Applied Sciences 12, 15. 2022. doi: 10.3390/app12157515
Rencher, A. C. Methods of multivariate analysis. 2nd ed. New York: J. Wiley, 2002. 708 p.
Rencher, A. C. and Schaalje, G. B. Linear models in statistics. 2nd ed. New Jersey: John Wiley & Sons, 2008. 672 p.
Rousseeuw P. J. Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis. Journal of Computational and Applied Mathematics, 20:53-65. 1987. doi: 10.1016/0377-0427(87)90125-7
Sugar, C. A. and James, G. M. Finding the number of clusters in a dataset: An information-theoretic approach. Journal of the American Statistical Association, 98, 463, 750-763. 2003. doi: 10.1198/016214503000000666
Venables, W. N. and Ripley, B. D. Modern Applied Statistics with S. Fourth edition. Springer, 2002.
Zhang, Y.; Mandziuk, J.; Quek, H. C. and Goh, W. Curvature-based method for determining the number of clusters. Inf. Sci. 415, 414-428, 2017. doi: 10.1016/j.ins.2017.05.024