features is made easy with the Jellyfish algorithm's
fast convergence speed and flexibility.
According to the World Health Organization,
cardiac-related diseases are on the rise, and 17.9
million people die from them each year (Vijeta Sharma,
et al., 2020). Detecting and treating these patients
early is becoming more difficult as the population
grows. At the same time, many studies have shown that
recent advances in technology have allowed machine
learning techniques to accelerate progress in the
health-care field. Hence, the purpose of this work is
to build a machine learning model that predicts heart
disease from its most significant characteristics. The
UCI heart disease prediction dataset served as the
benchmark for this research; it contains fourteen
attributes related to cardiovascular disease.
While building the model, many machine learning
approaches were employed, including Decision Tree,
Naive Bayes, Support Vector Machine (SVM), and
Random Forest. As part of our study, we utilized
traditional Machine Learning methods to identify
correlations between the dataset's many properties,
with the goal of applying these findings to the
prediction of heart disease risk. According to the
results, Random Forest provides more accurate
predictions in less time than the other ML approaches.
As a decision-support system, this model can be useful
for doctors in the clinic.
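The comparison described above can be illustrated with a minimal sketch. It is not the cited authors' code: it assumes scikit-learn, uses a synthetic stand-in for the UCI data (303 rows, 13 predictors and a binary target), and the helper name `compare_models` is hypothetical. It reports cross-validated accuracy and wall-clock time for the four classifiers named in the study.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data stands in for the UCI heart disease dataset.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=0
)

# The four classifiers named in the study, with default-like settings.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def compare_models(models, X, y, folds=5):
    """Return {name: (mean CV accuracy, elapsed seconds)} for each model."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        results[name] = (scores.mean(), time.perf_counter() - start)
    return results


results = compare_models(models, X, y)
for name, (acc, secs) in results.items():
    print(f"{name:14s} accuracy={acc:.3f} time={secs:.2f}s")
```

On real data the ranking depends on preprocessing and hyperparameters, so the numbers printed here say nothing about the results reported in the study.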
In the last several decades, cardiovascular illness
(heart disease) has become the leading cause of
mortality worldwide (Devansh Shah, et al., 2020). It
includes a broad variety of cardiac conditions. Many
risk factors are associated with heart disease, and it
is critical to find ways to diagnose the condition
quickly so that treatment can start promptly and
effectively. Healthcare organizations often use data
mining as a method for coping with large data sets. In
order to aid medical professionals in the prediction of
heart illness, researchers examine large medical data
sets using various data mining and machine learning
methods. The model in this study considers several
attributes linked to heart disease; it is constructed
using supervised learning techniques such as Naïve
Bayes, decision trees, K-nearest neighbors, and
random forest. It draws on the Cleveland database at
UCI, which holds records of patients with cardiac
disease. With 303 instances and 76 attributes, the
data is rather extensive. Only fourteen of those
seventy-six attributes, the ones that matter most, are
used to compare algorithm performance.
This study aims to assess the potential occurrence of
cardiovascular disease in individuals. The results
demonstrate that K-nearest neighbor offers the highest
level of accuracy.
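As a hedged illustration of the K-nearest neighbor setup, the sketch below scales the features and picks the number of neighbors by cross-validation. It assumes scikit-learn and synthetic data of the same shape as the 14-attribute Cleveland subset. The attribute names are the ones commonly used for that subset in the UCI documentation; the candidate values of k and the helper `knn_accuracy` are illustrative choices, not taken from the cited study.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# The 14 commonly used Cleveland attributes; "num" is the target column.
ATTRIBUTES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num",
]

# Assumption: synthetic data with 303 rows and 13 predictors replaces
# the real records, which are not bundled with this sketch.
X, y = make_classification(
    n_samples=303, n_features=len(ATTRIBUTES) - 1, n_informative=6,
    random_state=1,
)


def knn_accuracy(k, X, y, folds=5):
    """Mean cross-validated accuracy of a scaled k-NN classifier."""
    # Scaling matters for k-NN because it relies on distances.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    return cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()


# Illustrative grid of odd k values (odd values avoid ties in binary voting).
scores = {k: knn_accuracy(k, X, y) for k in (1, 3, 5, 7, 9, 11)}
best_k = max(scores, key=scores.get)
print(f"best k = {best_k}, accuracy = {scores[best_k]:.3f}")
```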
Diagnosing and predicting cardiovascular disease are
important medical tasks: accurate classification helps
cardiologists treat patients appropriately
(Chintan M. Bhatt, et al., 2023).
The ability of machine learning to identify patterns in
data has led to an upsurge in its use in the medical
field. To help diagnosticians decrease misdiagnosis,
machine learning may be used to categorize the
occurrence of cardiovascular illness. In an effort to
lower the death toll from cardiovascular disorders, this
study builds a model that can accurately forecast these
conditions. To enhance classification accuracy, this
research proposes k-modes clustering with Huang
initialization. We employ models such as XGBoost,
multilayer perceptron, decision tree classifier, and
random forest. To get the best possible outcome, the
hyperparameters of each model were tuned using
GridSearchCV. We test the
proposed model on a Kaggle dataset with 70,000
real-world examples. The models were trained using an
80:20 split of the data and attained the following
accuracies: decision tree, 86.37% with
cross-validation and 86.53% without; XGBoost, 87.12%
with cross-validation; random forest, 87.05%;
multilayer perceptron, 87.28% with cross-validation
and 86.44% without. The proposed models have the
following AUC (area under the curve) values: XGBoost,
0.95; decision tree, 0.94; random forest, 0.95;
multilayer perceptron, 0.95. Among these models, the
multilayer perceptron with cross-validation was the
most accurate, at 87.28%.
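The tuning and evaluation procedure described above (an 80:20 split, GridSearchCV with cross-validation, then accuracy and AUC on the held-out data) can be sketched as follows. This is a minimal example under stated assumptions, not the cited authors' pipeline: the data is synthetic, the parameter grid is illustrative, only the decision tree is tuned, and the k-modes clustering step is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data stands in for the Kaggle dataset; the
# sample and feature counts here are arbitrary.
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=6, random_state=42
)

# 80:20 train/test split, stratified to keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Illustrative grid; the study does not list the values it searched.
param_grid = {"max_depth": [3, 5, 7], "min_samples_leaf": [1, 5, 20]}

# 5-fold cross-validation on the training portion only.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out 20%.
best = search.best_estimator_
test_acc = accuracy_score(y_test, best.predict(X_test))
test_auc = roc_auc_score(y_test, best.predict_proba(X_test)[:, 1])

print("best params:", search.best_params_)
print(f"CV accuracy:   {search.best_score_:.4f}")
print(f"test accuracy: {test_acc:.4f}")
print(f"test AUC:      {test_auc:.4f}")
```

Comparing `search.best_score_` with `test_acc` mirrors the with and without cross-validation figures quoted above: the former averages over validation folds, while the latter comes from a single held-out split.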
3 METHODOLOGY
AI-based heart disease prediction is significant
because it enables early prediction of heart disease,
even before people develop any serious symptoms. When
the disease is diagnosed early, patients may have a
better treatment experience and better outcomes than
when it is diagnosed late. AI can assist doctors in
assessing a person’s risk for heart disease more
accurately by analysing the patient’s history and
lifestyle factors. This technology can also improve
the effectiveness of health care, as it accelerates
the initial assessment of risk and enables providers
to focus on the patients with the greatest need.
Integrated seamlessly within healthcare systems, it
would provide efficiencies for hospitals and doctors.
With AI-powered
heart disease prediction services, more loved ones can
be reached at varying environmental conditions