Prediction of Factors Influencing Diabetes Prevalence: Analysis Using Machine Learning in Python

Qing Lei

2024

Abstract

Diabetes is a chronic disease caused either by the pancreas's inability to produce insulin or by the body's inability to use it effectively. Machine learning can help predict diabetes from patient data. This paper uses the "Diabetes Data" dataset from Kaggle, which comprises eight attributes: number of pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, diabetes spectrum coefficient, and age. The project aims to apply machine learning to predict the factors that influence diabetes prevalence. After data preprocessing and feature analysis, prediction models based on six classification algorithms (KNN, naive Bayes, SVM, decision tree, random forest, and logistic regression) were constructed to predict diabetes risk. Research on the factors influencing diabetes contributes to a better understanding of how the disease develops, provides a scientific basis for more effective treatment and prevention strategies, and can assist doctors in early intervention and diagnosis to reduce diabetes risk.
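The comparison of six classifiers described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's actual code: it assumes scikit-learn is installed, substitutes a synthetic eight-feature dataset (generated with `make_classification`) for the Kaggle data, and uses default hyperparameters, a standard-scaling step, and 5-fold cross-validation, none of which are stated in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the Kaggle data (8 numeric features,
# binary outcome). Replace X and y with the real attributes and labels.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, n_redundant=1, random_state=0
)

# The six algorithms named in the abstract, with default settings (assumed).
models = {
    "KNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(random_state=0),
    "Logistic regression": LogisticRegression(max_iter=1000),
}


def compare_models(X, y, cv=5):
    """Return the mean cross-validated accuracy of each classifier."""
    scores = {}
    for name, model in models.items():
        # Scale features inside the pipeline so each fold is scaled
        # using only its own training portion.
        pipeline = make_pipeline(StandardScaler(), model)
        fold_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


scores = compare_models(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {acc:.3f}")
```

Because the data here are synthetic, the printed accuracies say nothing about the paper's results; the sketch only shows how the six models could be evaluated on a common footing.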


Paper Citation


in Harvard Style

Lei Q. (2024). Prediction of Factors Influencing Diabetes Prevalence: Analysis Using Machine Learning in Python. In Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML; ISBN 978-989-758-754-2, SciTePress, pages 574-578. DOI: 10.5220/0013528600004619


in Bibtex Style

@conference{daml24,
author={Qing Lei},
title={Prediction of Factors Influencing Diabetes Prevalence: Analysis Using Machine Learning in Python},
booktitle={Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML},
year={2024},
pages={574--578},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013528600004619},
isbn={978-989-758-754-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML
TI - Prediction of Factors Influencing Diabetes Prevalence: Analysis Using Machine Learning in Python
SN - 978-989-758-754-2
AU - Lei Q.
PY - 2024
SP - 574
EP - 578
DO - 10.5220/0013528600004619
PB - SciTePress