Authors:
Kartika Handayani 1; Erni Erni 1; Rangga Pebrianto 1; Ari Abdilah 1; Rifky Permana 1 and Eni Pudjiarti 2
Affiliations:
1 Universitas Bina Sarana Informatika, Jakarta, Indonesia; 2 Universitas Nusa Mandiri, Jakarta, Indonesia
Keyword(s):
Breast Cancer, Light Gradient Boosting, Resampling, Hyperparameter Tuning.
Abstract:
Breast cancer is the most frequently diagnosed cancer and a leading cause of cancer-related death. It is mainly associated with inherited genetic mutations. Alongside advances in preventive procedures, early diagnosis is very important for preventing rapid progression of the disease, and a machine learning (ML) approach can support such early diagnosis. In this study, testing was performed on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, which consists of 569 instances with no missing values and one target class attribute, either benign (B) or malignant (M). Four resampling techniques, random oversampling (ROS), random undersampling (RUS), SMOTE, and SMOTE-Tomek, were applied to examine the effect of correcting the class imbalance. The data were then classified with Light Gradient Boosting, which was optimized through hyperparameter tuning. The best results, obtained after hyperparameter tuning, were accuracy 99.12%, recall 99.12%, precision 99.13%, F1-score 99.13%, and AUC 0.988.
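As a rough illustration of the two simplest resampling techniques named in the abstract, the sketch below implements random oversampling (ROS) and random undersampling (RUS) using only the Python standard library. It is not the authors' implementation, which the abstract does not describe. The function names, the seed, and the toy feature values are assumptions made for illustration; the class sizes (357 benign, 212 malignant) are those of the public WDBC dataset. SMOTE, SMOTE-Tomek, and the Light Gradient Boosting classifier are not shown.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS: duplicate randomly chosen rows of each smaller class
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Sample with replacement; k is 0 for the majority class.
        extra = rng.choices(idx, k=target - n)
        X_out.extend(X[i] for i in extra)
        y_out.extend(y[i] for i in extra)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """RUS: keep a random subset of each larger class
    so every class is as small as the smallest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Sample without replacement so no row is kept twice.
        keep.extend(rng.sample(idx, target))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy stand-in for WDBC: same class sizes, placeholder one-column features.
y = ["B"] * 357 + ["M"] * 212
X = [[float(i)] for i in range(len(y))]

X_ros, y_ros = random_oversample(X, y)
X_rus, y_rus = random_undersample(X, y)
print(Counter(y_ros), Counter(y_rus))
```

Oversampling keeps all original rows and grows the minority class by duplication, while undersampling discards majority rows, so on a dataset this small RUS trades away training data for balance. In practice, resampling is usually applied to the training split only so that the evaluation data keep their original class distribution.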