Optimizing the Classification of SNBP Student Candidate Files Using the K-Nearest Neighbor (KNN) Algorithm

Diah Ayu Larasati, Amil Ahmad Ilham, Ady Wahyudi Paundu

2025

Abstract

Alongside academic achievement, non-academic achievement is an important factor in a candidate's chances of admission through the SNBP (Seleksi Nasional Berdasarkan Prestasi, National Selection Based on Achievement) pathway. The portfolio defined by the Ministry of Education and Culture specifies several categories of non-academic achievement certificates that serve as assessment criteria. This research aims to optimize the classification of each certificate into its portfolio category so that reviewers can check files more easily. Evaluating the KNN algorithm with cross-validation and with precision, recall, and F1-score shows that text preprocessing has a significant influence on model performance. The best configuration, experiment 3, applies complete preprocessing including stopword removal and achieves the highest accuracy, 0.722. The lowest performance was consistently found in the “Undefined” class, whose F1-score was only 0.39 in experiment 3. These results show that the KNN method with TF-IDF vectorization and cosine similarity can classify proof-of-achievement documents well.
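The pipeline named in the abstract (stopword removal, TF-IDF vectorization, cosine similarity, KNN voting, and per-class precision, recall, and F1-score) can be illustrated with a minimal sketch. This is not the authors' implementation: the stopword list, the example certificate texts, the category names "competition" and "organization", the smoothed IDF formula, and the choice k = 3 are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical, tiny stopword list; the paper's actual list is not given.
STOPWORDS = {"the", "of", "in", "a", "for", "and"}

def tokenize(text, remove_stopwords=True):
    """Lowercase, split on whitespace, keep alphabetic tokens."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens

def fit_idf(docs):
    """Smoothed inverse document frequency (one common variant, assumed here)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + c)) + 1.0 for term, c in df.items()}

def tfidf(tokens, idf):
    """L2-normalised sparse TF-IDF vector; unseen terms are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (dot product)."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def knn_predict(query, train_vecs, train_labels, idf, k=3):
    """Majority vote among the k training documents most similar to the query."""
    q = tfidf(tokenize(query), idf)
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(q, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def precision_recall_f1(y_true, y_pred, label):
    """Per-class precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up certificate texts and category labels, for illustration only.
TRAIN = [
    ("first place winner national science olympiad", "competition"),
    ("second place winner provincial math olympiad", "competition"),
    ("gold medal winner regional debate championship", "competition"),
    ("chairman of the student council organization", "organization"),
    ("secretary of the scout organization", "organization"),
    ("treasurer of the student council", "organization"),
]

train_tokens = [tokenize(text) for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_tokens)
train_vecs = [tfidf(tokens, idf) for tokens in train_tokens]

print(knn_predict("third place winner national physics olympiad",
                  train_vecs, train_labels, idf))
print(knn_predict("vice chairman of the student council",
                  train_vecs, train_labels, idf))
```

Because the vectors are normalised to unit length, cosine similarity reduces to a sparse dot product. Setting `remove_stopwords=False` in `tokenize` gives a simple way to compare preprocessing variants in the spirit of the experiments described above.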



Paper Citation


in Harvard Style

Larasati D., Ilham A. and Paundu A. (2025). Optimizing the Classification of SNBP Student Candidate Files Using the K-Nearest Neighbor (KNN) Algorithm. In Proceedings of the 1st International Conference on Research and Innovations in Information and Engineering Technology - Volume 1: RITECH; ISBN 978-989-758-784-9, SciTePress, pages 231-237. DOI: 10.5220/0014276800004928


in Bibtex Style

@conference{ritech25,
author={Diah Ayu Larasati and Amil Ahmad Ilham and Ady Wahyudi Paundu},
title={Optimizing the Classification of SNBP Student Candidate Files Using the K-Nearest Neighbor (KNN) Algorithm},
booktitle={Proceedings of the 1st International Conference on Research and Innovations in Information and Engineering Technology - Volume 1: RITECH},
year={2025},
pages={231-237},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0014276800004928},
isbn={978-989-758-784-9},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 1st International Conference on Research and Innovations in Information and Engineering Technology - Volume 1: RITECH
TI - Optimizing the Classification of SNBP Student Candidate Files Using the K-Nearest Neighbor (KNN) Algorithm
SN - 978-989-758-784-9
AU - Larasati D.
AU - Ilham A.
AU - Paundu A.
PY - 2025
SP - 231
EP - 237
DO - 10.5220/0014276800004928
PB - SciTePress