
real-world applications. Furthermore, this research
has implications beyond asteroid classification, as the
methodologies and techniques developed can be ap-
plied to other domains facing similar classification
challenges. Through machine learning, we enhance
our ability to identify and mitigate potential threats
from asteroids and other celestial objects. Ultimately,
our goal is to contribute to the collective efforts aimed
at safeguarding our planet and ensuring the continued
safety and well-being of future generations.
2 RELATED WORK
Previous research has extensively explored various
machine learning techniques for asteroid classifica-
tion, reflecting the growing interest in leveraging
computational methods to address planetary defense
challenges. (J. Smith, 2020) conducted a comprehensive
study using decision trees and neural networks to classify
asteroids based on orbital parameters, providing valuable
insights into the applicability of these methods to
asteroid classification tasks. The study examined the
efficacy of different machine learning algorithms in
handling complex astronomical data and demonstrated
promising results in accurately identifying hazardous
asteroids. Building upon this foundation, (M. Brown, 2018)
applied support vector machines (SVM) and random forests
to asteroid classification using spectral data. Their
work contributed significantly to the understanding
of machine learning-based asteroid classification by
examining the role of feature extraction and selection
in improving classification accuracy. By leveraging
spectral information, their approach demonstrated the
potential of machine learning models to discern subtle
differences in asteroid compositions and classify them
accordingly. In a similar vein, (R. Jones, 2019)
explored ensemble learning methods such as bagging
and boosting for asteroid classification, aiming
to enhance predictive performance and robustness by
combining multiple classifiers. Their research show-
cased the effectiveness of ensemble techniques in
enhancing classification accuracy and demonstrated
their utility in handling uncertainties and noise in
asteroid data. Expanding the scope of feature selection
techniques, (L. Zhang, 2021) proposed a novel approach
based on genetic algorithms for asteroid classification.
Genetic algorithms mimic the process of
natural selection to iteratively evolve a set of features
that maximize classification performance. Their work
demonstrated the efficacy of evolutionary approaches
in identifying relevant features and reducing dimen-
sionality, thereby improving the efficiency and inter-
pretability of classification models. Recent studies
by (S. Kim, 2022) explored deep learning approaches
for asteroid classification, leveraging convolutional
neural networks (CNNs) to extract features from
asteroid images. By harnessing the power of CNNs,
their research achieved remarkable classification
accuracy and illustrated the capability of deep learning
in handling complex, high-dimensional data. Similarly,
(Y. Wu, 2021) investigated the use of transfer
learning techniques for asteroid classification, lever-
aging knowledge from pre-trained models to enhance
classification performance. Their work underscored
the importance of leveraging existing knowledge and
resources to address challenges in asteroid classifi-
cation effectively. Further pushing the boundaries
of classification accuracy, (X. Li, 2023) introduced
a novel hybrid model combining deep learning and
evolutionary algorithms for asteroid classification. By
integrating the strengths of both approaches, their re-
search achieved state-of-the-art performance in terms
of accuracy and computational efficiency. Additionally,
(A. Johnson, 2022) proposed a supervised classification
approach for analyzing and detecting potentially
hazardous asteroids, adding to the repertoire of
classification techniques tailored specifically to
identifying hazardous asteroids.
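As a rough illustration of the genetic-algorithm feature selection attributed above to (L. Zhang, 2021), the sketch below evolves a boolean feature mask through selection, crossover and mutation. It is not the cited authors' implementation: the function name evolve_feature_mask, its parameters, and the toy fitness function are hypothetical and chosen only for illustration. In practice the fitness would typically be a cross-validated classifier score computed on the selected features.

```python
import random

def evolve_feature_mask(n_features, fitness, pop_size=20, generations=30,
                        mutation_rate=0.1, seed=0):
    """Evolve a boolean mask over features that maximizes fitness(mask).

    Hypothetical helper for illustration; requires n_features >= 2
    and pop_size >= 4.
    """
    rng = random.Random(seed)
    # Start from random masks: True means the feature is kept.
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy fitness (an assumption, not a real classifier score): reward the
# features we pretend are informative and penalize every extra feature.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(1 for i, keep in enumerate(mask) if keep and i in INFORMATIVE)
    extras = sum(1 for i, keep in enumerate(mask)
                 if keep and i not in INFORMATIVE)
    return hits - 0.5 * extras

best = evolve_feature_mask(8, toy_fitness)
selected = [i for i, keep in enumerate(best) if keep]
print(selected)
```

Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next; the penalty on extra features is what pushes the search toward a smaller, more interpretable subset.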
3 DATASET DESCRIPTION
The dataset used in this study originates from NASA and is
available on Kaggle
(https://www.kaggle.com/datasets/lovishbansal123/nasa-asteroids-classification).
It comprises 34 features, including absolute magnitude,
estimated diameter (in kilometers and miles), relative
velocity, and orbital period, among others. The target
variable categorizes asteroids as either hazardous or
non-hazardous.
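A minimal sketch of how the binary target might be inspected before modelling is given below. The inline CSV is a made-up stand-in for the real file, and the column names (absolute_magnitude, hazardous, and so on) are assumptions rather than the dataset's actual headers. The sketch encodes the hazardous/non-hazardous label as 1/0 and reports the class proportions.

```python
import csv
import io
from collections import Counter

# Tiny inline stand-in for the real CSV. Column names and values are
# illustrative assumptions, not rows from the actual dataset.
SAMPLE_CSV = """absolute_magnitude,est_diameter_km_max,relative_velocity_km_s,hazardous
21.6,0.28,6.1,True
21.3,0.33,18.1,False
20.3,0.52,7.6,True
27.4,0.02,11.2,False
21.6,0.28,9.8,False
"""

def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def encode_target(rows, target="hazardous"):
    """Map the textual target ('True'/'False') to the integers 1/0."""
    return [1 if row[target].strip().lower() == "true" else 0 for row in rows]

def class_balance(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in sorted(counts)}

rows = load_rows(SAMPLE_CSV)
labels = encode_target(rows)
balance = class_balance(labels)
print(labels)
print(balance)
```

Checking the class proportions early indicates how skewed the target is, which is what determines whether a rebalancing step is worth applying.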
4 PROPOSED METHODOLOGY
Fig. 1 shows the pipeline for data preprocessing, data
handling, modelling and evaluation in the machine learning
workflow. The process begins with data visualization, in
which pie charts, histograms and box plots are used to
understand the dataset. This is followed by data
preprocessing, which involves handling missing values,
standardizing the binary data to 0s and 1s, and feature
selection, which picks the most relevant features using
the K-select method. The preprocessing step also
involves balancing the dataset using SMOTE
Comparative Analysis of Machine Learning Models for Hazardous Asteroid Classification