Comparative Performance Evaluation Results of Classification Algorithms in Data Mining to Identify Types of Glass Based on Refractive Index and Its Elements

Keywords

Performance evaluation
Classification algorithm
Data mining
Glass

Abstract

Data science has become increasingly familiar to the public and to companies in the era of the Industrial Revolution 4.0, and data mining is one of its core parts. Data mining is the process of discovering patterns in very large datasets and processing the extracted information into knowledge through interpretation. This paper compares the performance of several classification algorithms in data mining (Decision Tree C4.5, Neural Network, K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naïve Bayes, Support Vector Machine (SVM), and Rule Induction) for identifying types of glass based on their chemical elements and refractive index. The dataset used is the Glass Identification dataset from the UCI Machine Learning Repository. The algorithms are evaluated on Accuracy and Kappa using 10-fold cross-validation. The K-Nearest Neighbors (KNN) algorithm achieves the best results, with an Accuracy of 72.90% and a Kappa of 0.632. A T-test is then used to assess the statistical significance of the accuracy results.
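
For illustration, the following is a minimal sketch (not the authors' original code) of the comparison described above: it loads the UCI Glass Identification dataset and evaluates several of the listed classifiers with 10-fold cross-validation, reporting mean Accuracy and Cohen's Kappa. The dataset URL, column names, and scikit-learn estimators are assumptions based on the UCI repository documentation; the entropy-based DecisionTreeClassifier only approximates C4.5, and Rule Induction is omitted because scikit-learn has no built-in implementation of it.

```python
# Minimal sketch (assumed setup, not the authors' original code): compare
# several classifiers on the UCI Glass Identification dataset using
# 10-fold cross-validation, reporting mean Accuracy and Cohen's Kappa.
import pandas as pd
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import cohen_kappa_score, make_scorer
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Column names follow the UCI documentation: Id, refractive index (RI),
# eight element/oxide-content features, and the glass type label.
URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/glass/glass.data"
cols = ["Id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "Type"]
df = pd.read_csv(URL, names=cols).drop(columns="Id")
X, y = df.drop(columns="Type"), df["Type"]

models = {
    "Decision Tree (C4.5-like)": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "Neural Network": MLPClassifier(max_iter=2000, random_state=0),
    "KNN": KNeighborsClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(random_state=0),
}

cv = KFold(n_splits=10, shuffle=True, random_state=0)
kappa = make_scorer(cohen_kappa_score)

for name, model in models.items():
    # Standardize features so distance- and margin-based models behave sensibly.
    pipe = make_pipeline(StandardScaler(), model)
    acc = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
    kap = cross_val_score(pipe, X, y, cv=cv, scoring=kappa)
    print(f"{name:26s} accuracy={acc.mean():.3f}  kappa={kap.mean():.3f}")
```

The per-fold accuracy scores returned by cross_val_score can then be compared between two algorithms with a paired T-test (for example, scipy.stats.ttest_rel) to judge whether their difference in accuracy is statistically significant, in the spirit of the T-test step mentioned in the abstract.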

Authors who publish in PENA TEKNIK: Jurnal Ilmiah Ilmu-ilmu Teknik agree to the following terms:

  • Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under an Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
  • Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
  • Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.