Analysis of Dimensional Reduction Effect on K-Nearest Neighbor Classification Method
Date
2021-10
Author
Taufiqurrahman
Nababan, Erna Budhiarti
Efendi, Syahril
Abstract
Classification algorithms often struggle with high-dimensional data, which lowers classification accuracy. Dimension reduction is one way to make a classification algorithm faster and more effective and to improve its accuracy. When data are classified with the K-Nearest Neighbor (K-NN) algorithm, some features may contribute nothing useful to the classification, so dimension reduction is required. In this study, the dimension reduction methods are Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA), classification is performed with K-NN, and performance is analyzed using a confusion matrix. The datasets are Arrhythmia, ISOLET, and CNAE-9, obtained from the UCI Machine Learning Repository. The results show that classifiers with LDA perform better than those with PCA on datasets with more than 100 attributes. On the Arrhythmia dataset, dimension reduction improves K-NN performance for K=3 and K=5; the best performance is obtained by LDA+K-NN with K=3, with an accuracy of 98.53%, and the lowest by K-NN without reduction with K=3. On the ISOLET dataset, the best performance is also obtained on data reduced with LDA, but with K=5, and the lowest by PCA+K-NN with K=3. On the CNAE-9 dataset, the best performance is likewise achieved by LDA+K-NN, and the lowest by PCA+K-NN with K=3.
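The evaluation loop described in the abstract (classify with K-NN, then summarize the predictions in a confusion matrix) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the function names (knn_predict, confusion_matrix, accuracy, evaluate, keep_first), and the choice of Euclidean distance are all hypothetical. Dropping a noisy column by hand stands in for dimension reduction here; the study itself uses LDA and PCA, which this sketch does not implement.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def evaluate(train_X, train_y, test_X, test_y, k):
    """Classify every test point with K-NN and return the confusion matrix."""
    preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
    return confusion_matrix(test_y, preds, labels=[0, 1])


def keep_first(X):
    """Hand-made stand-in for dimension reduction: keep only the first column."""
    return [(row[0],) for row in X]


# Hypothetical toy data: the first feature separates the classes,
# the second is large-scale noise unrelated to the label.
train_X = [(1.0, 10.0), (1.2, -8.0), (0.8, 3.0),
           (3.0, 9.0), (3.2, -7.0), (2.8, 2.0)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(1.1, -7.5), (2.9, 9.5), (1.0, 2.2), (3.0, 3.1)]
test_y = [0, 1, 0, 1]

cm_full = evaluate(train_X, train_y, test_X, test_y, k=3)
cm_reduced = evaluate(keep_first(train_X), train_y,
                      keep_first(test_X), test_y, k=3)

print("all features:  ", cm_full, accuracy(cm_full))
print("first feature: ", cm_reduced, accuracy(cm_reduced))
```

In this toy data the noisy second feature dominates the distances, so K-NN with K=3 misclassifies some test points until that feature is removed. The datasets in the study have hundreds of attributes, so a real experiment would replace keep_first with a fitted LDA or PCA projection.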