A Machine-Learning, Predictive-Analytical Model for Thyroid-Cancer Risk Assessment
Sanjay Manda, Manohar Adapa, Harsha Sai Jasty, Rishma Sree Pathakamuri, Siddhartha Vinnakota, Bonaventure Chidube Molokwu
2025
Abstract
Thyroid cancer is a significant global health problem because of its rising incidence, yet existing diagnostic methods rely heavily on invasive biopsies and imaging and fail to account for many patient risk factors. This research aims to develop a comprehensive and precise model for forecasting thyroid cancer risk using state-of-the-art machine learning techniques. Preprocessing comprised imputation of missing values, outlier detection, categorical feature encoding, and the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance. Feature engineering, including polynomial transformation, logarithmic scaling, and clinical risk scoring, was used to extract important predictive patterns. We evaluated the CatBoost (Categorical Boosting) algorithm against Logistic Regression, Random Forest, XGBoost, and LightGBM. The CatBoost model achieved 88% accuracy, 93% precision, 78% recall, an 85% F1-score, and a ROC-AUC of 90%. These findings suggest that CatBoost can differentiate well between high-risk and low-risk thyroid cancer cases. Such a model could help identify at-risk individuals early and accurately, support informed clinical decisions, reduce healthcare expenditure, and prevent unnecessary treatment, thereby improving patient quality of life.
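As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The sketch below is illustrative only and is not taken from the paper: the function name `f1_score` and the variable names are assumptions, and the inputs are simply the rounded precision and recall quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract (assumed exact for this check).
reported_precision = 0.93
reported_recall = 0.78

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.848"
```

Rounded to whole percentage points, this gives 85%, which agrees with the F1-score reported in the abstract.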
Paper Citation
in Harvard Style
Manda S., Adapa M., Jasty H., Pathakamuri R., Vinnakota S. and Molokwu B. (2025). A Machine-Learning, Predictive-Analytical Model for Thyroid-Cancer Risk Assessment. In Proceedings of the 21st International Conference on Web Information Systems and Technologies - Volume 1: WEBIST; ISBN 978-989-758-772-6, SciTePress, pages 350-357. DOI: 10.5220/0013692200003985
in Bibtex Style
@conference{webist25,
author={Sanjay Manda and Manohar Adapa and Harsha Jasty and Rishma Pathakamuri and Siddhartha Vinnakota and Bonaventure Molokwu},
title={A Machine-Learning, Predictive-Analytical Model for Thyroid-Cancer Risk Assessment},
booktitle={Proceedings of the 21st International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2025},
pages={350-357},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013692200003985},
isbn={978-989-758-772-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 21st International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - A Machine-Learning, Predictive-Analytical Model for Thyroid-Cancer Risk Assessment
SN - 978-989-758-772-6
AU - Manda S.
AU - Adapa M.
AU - Jasty H.
AU - Pathakamuri R.
AU - Vinnakota S.
AU - Molokwu B.
PY - 2025
SP - 350
EP - 357
DO - 10.5220/0013692200003985
PB - SciTePress