
analyze clinical and diagnostic data to assess a
patient's risk. These models include Decision Trees,
Support Vector Machines, and Logistic Regression,
among others. They are widely used due to their
interpretability and effectiveness in handling
structured medical data.
A Decision Tree is a nonparametric supervised
learning algorithm with a flowchart-like structure,
commonly used for classification and regression tasks.
Its hierarchical structure consists of a root node,
branches, internal nodes, and leaf nodes. To generate
predictions, it recursively divides the dataset
according to feature values. The tree starts at the root
node, which has no incoming branches. The branches
leaving the root node lead to the internal nodes, which
are also known as decision nodes. Both node types
evaluate the available features to form homogeneous
subsets, which are represented by leaf (terminal)
nodes. The leaf nodes thus represent all potential
outcomes in the dataset (IBM, 2025). Decision trees
are also frequently utilized for
the prediction of heart disease because of their
interpretability. However, since decision trees have a
tendency to overfit, it is necessary to carefully tune
them. Logistic regression, in turn, is a
straightforward and efficient statistical technique for
binary classification that estimates the probability of
an outcome, event, or observation. Using a logistic
function that limits the output to values between 0
and 1, the model estimates the probability that a given
input falls into a particular category. Logistic
regression examines the relationship between one or
more independent variables and the outcome, and
uses it to categorize data into distinct groups (Singh &
Kumar, 2020). Because it yields the statistical
likelihood that a case falls into a particular category,
this technique is frequently used in heart disease
research. However, its performance may be limited
when the clinical data contain nonlinear relationships.
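The two mechanisms described above can be illustrated with a minimal Python sketch. It is not taken from the cited studies: the function names (gini, best_split, logistic, predict_probability), the use of Gini impurity as the measure of homogeneity, and the toy blood-pressure values, weight, and bias are all assumptions made for illustration. The first part finds the single threshold on one feature that produces the most homogeneous subsets, which is the step a decision tree repeats recursively. The second part shows how the logistic function maps a weighted sum of features to a value between 0 and 1.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means fully homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best binary split on one feature.

    Returns (None, impurity_of_unsplit_data) if no split improves homogeneity.
    """
    best = (None, gini(labels))
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


def logistic(z):
    """Logistic (sigmoid) function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, bias):
    """Probability of the positive class for one input under a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


# Hypothetical toy data: resting blood pressure and a 0/1 disease label.
blood_pressure = [110, 120, 125, 140, 150, 160]
has_disease = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(blood_pressure, has_disease)
print("decision-tree split threshold:", threshold, "impurity:", impurity)

# Hypothetical, hand-picked coefficients (not fitted to any real dataset).
probability = predict_probability([150], weights=[0.04], bias=-5.3)
print("logistic-regression probability:", round(probability, 3))
```

A full decision tree would apply best_split again to each resulting subset until a stopping rule is met, and limiting that depth is one common way to tune against overfitting. In practice the logistic regression weights would be estimated from training data rather than chosen by hand.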
A supervised machine learning technique
commonly used for classification and regression tasks
is the Support Vector Machine (SVM). It classifies
data points by determining the optimal hyperplane in
an N-dimensional space and maximizing the
separation between the closest points of different
classes. By defining the maximum margin, these
closest points, also referred to as support vectors,
improve classification accuracy and the model's
potential to generalize to new data (IBM, 2024). Its
robustness in high-dimensional spaces and
effectiveness in handling both linear and nonlinear
classification tasks make SVM particularly suitable
for medical datasets with numerous features, and it
has been extensively applied in cardiology
prediction. Similarly, the K-Nearest Neighbor (KNN)
algorithm is a simple, instance-based, supervised,
and nonparametric machine learning method that
classifies or predicts outcomes based on the
proximity of data points. It is commonly used in
classification and regression tasks due to its ease of
implementation and efficacy. By
calculating the distance between the input data point
and other points, the method finds the K nearest
neighbors. For regression, the predicted value is the
average or weighted average of these neighbors'
target values, whereas for classification the input data
point is assigned to the most prevalent category
among its K nearest neighbors (Srivastava, 2025).
However, KNN performance depends on the distance
measure and the choice of K, so parameter tuning is
necessary for the best results.
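The following minimal Python sketch illustrates the two ideas in this paragraph. It is not the implementation used in the cited sources: the function names, the two-dimensional toy points, and the hand-picked hyperplane are assumptions made for illustration. The SVM part only evaluates a given hyperplane (it does not train one) and assumes the usual scaling in which the support vectors have a score of +1 or -1, so that the margin width equals 2 divided by the norm of the weight vector. The KNN part uses Euclidean distance with an unweighted vote for classification and an unweighted average for regression.

```python
import math
from collections import Counter


def svm_decision(point, weights, bias):
    """Signed score w.x + b; its sign gives the side of the hyperplane."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def margin_width(weights):
    """Margin width 2 / ||w||, valid when support vectors score exactly +1 or -1."""
    return 2.0 / math.sqrt(sum(w * w for w in weights))


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(train_points, query, k):
    """Indices of the k training points closest to the query point."""
    order = sorted(range(len(train_points)),
                   key=lambda i: euclidean(train_points[i], query))
    return order[:k]


def knn_classify(train_points, train_labels, query, k):
    """Most prevalent label among the k nearest neighbors."""
    votes = Counter(train_labels[i] for i in k_nearest(train_points, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_points, train_targets, query, k):
    """Unweighted average of the k nearest neighbors' target values."""
    neighbors = k_nearest(train_points, query, k)
    return sum(train_targets[i] for i in neighbors) / len(neighbors)


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1, 1), (1, 2), (2, 1), (5, 5), (6, 5), (5, 6)]
labels = [0, 0, 0, 1, 1, 1]
targets = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

# Hand-picked hyperplane (an assumption, not a trained model), scaled so the
# closest points of each class score -1 and +1 respectively.
w = (2 / 7, 2 / 7)
b = -13 / 7

print("SVM score for (5, 5):", svm_decision((5, 5), w, b))
print("SVM margin width:", margin_width(w))
print("KNN class for (2, 2):", knn_classify(points, labels, (2, 2), k=3))
print("KNN regression for (2, 2):", knn_regress(points, targets, (2, 2), k=3))
```

Changing k or replacing euclidean with another distance function changes which neighbors are selected, which is why both are treated as tuning parameters.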
In a study on machine learning algorithms for heart
disease prediction, a dataset from the UCI repository
was used for training and testing. The study evaluated
how accurately several machine learning models
predict heart disease. These algorithms included the
K-nearest neighbor algorithm, the decision tree
algorithm, the linear regression algorithm, and the
support vector machine algorithm (Singh & Kumar,
2020). Using a variety of machine learning
algorithms, such as logistic regression and KNN, the
authors of another article developed a system for
predicting and classifying patients with heart disease.
The system also uses a patient's medical history to
assess the likelihood that the patient will be diagnosed
with heart disease (Das & Biswas, 2018). Mythili et
al. introduced a rule-based model that evaluates the
effectiveness of applying rules to the individual
predictions generated by logistic regression, decision
trees, and support vector machines. By combining
these machine learning approaches, their method
seeks to improve the accuracy of heart disease
prediction on the Cleveland Heart Disease Database
(Mythili et al., 2013). These studies show that these
four models are used very frequently in heart disease
prediction research.
3.2 Novel Models
In addition to the traditional models mentioned above,
which have been adopted by many studies, several
studies have explored new types of models. Most of
these new forecasting methods are