learning algorithms. Roberts et al., 2020 produced a
dataset for time-scale modification (TSM) with
subjective quality descriptors, spanning a wide range
of music genres. This dataset serves as a valuable
resource for obtaining objective measurements of
quality in TSM. Zhao et al., 2020 proposed
MusiCoder, a universal music-acoustic encoder based on
transformers, which outperformed existing models in
music genre categorization and auto-tagging tasks.
These breakthroughs in dataset development and
acoustic representation learning have led to
improvements in music genre classification systems.
Cuesta et al., 2020 focused on multiple F0 estimation
in voice ensembles using convolutional neural
networks, demonstrating the usefulness of CNNs in
varied circumstances and data configurations. Lerch,
2020 examined audio content analysis in the context
of music information retrieval systems, stressing
music genre classification as one of the primary
applications of audio content analysis. The study by
Ozan, 2021 used convolutional recurrent neural
networks (CRNN) for audio segment categorization
in contact centre records, drawing parallels to music
genre classification problems. A further 2021 study extended
the usage of deep learning to electronic dance music
(EDM) subgenre categorization, including tempo-
related feature representations for better classification
accuracy. Muñoz-Romero et al., 2021 studied
nonnegative orthogonal partial least squares (OPLS)
for supervised design of filter banks, with
applications in texture and music genre
categorization. These studies demonstrate the diverse
methodologies and techniques utilized in music genre
classification research. Zhao et al., 2022 created a
self-supervised pre-training technique with the Swin
Transformer for music categorization, underlining the
importance of learning meaningful music
representations from unlabelled data. Chak et al.,
2022 presented the use of Generalized Morse
Wavelets (GMWs) in the Scattering Transform
Network (STN) for music genre categorization,
showing the advantages of this approach over
conventional methods. These works illustrate the
ongoing exploration of new strategies to improve
music genre classification accuracy. Ian, 2022 focused on
optimizing musician impact models and assessing
musical qualities across different genres, highlighting
the necessity of recognizing the influence of
performers in music categorization tasks. Liu et al.,
2022 discussed open set recognition for music genre
classification and presented an algorithmic approach
to distinguishing known from unknown genre
classes. Heo et al., 2022 proposed a
framework for hierarchical feature extraction and
aggregation in the classification of music genres so
that both short- and long-term musical features are captured
appropriately.
3 METHODOLOGY
The music genre categorization experiment utilized a
dataset of 114,000 tracks collected from the Spotify
API, which comprised audio attributes and metadata
spanning 125 genre categories. The dataset was
partitioned into a training set of about 91,200 tracks
and a test set consisting of about 22,800 tracks to
evaluate model performance. The independent
variables comprised numerous audio parameters such
as danceability, energy, loudness, acousticness, and
tempo, as well as categorical features like key and time
signature,
whereas the dependent variable was the track genre.
The dataset is realistic and interesting due to its
significant size, wide genre representation, and rich
audio features that mirror real-world music
categorization issues. This intricacy is further
highlighted by the overlap between genres, which
often share similar auditory traits. This poses a major
challenge to effective classification in the context of
machine learning. These characteristics not only
increase the model's applicability to real-world
scenarios in music streaming services, but they also
provide an interesting exploration of the intricacy of
music, illuminating the challenges associated with
genre classification and the effectiveness of
algorithms in this regard.
Algorithm: Music Genre Classification Using Neural
Networks
Data Preparation
1. Load Dataset: Import dataset D comprising
audio features X and genre labels y.
2. Clean Dataset:
o Remove duplicate tracks to maintain
uniqueness, eliminate non-sound-based
genres (e.g., language categories) and
extraneous attributes (e.g., track IDs).
o Apply one-hot encoding for
categorical features (e.g., key, time
signature).
o Convert boolean features (e.g., explicit
content) to binary values (0/1).
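As a minimal sketch in pandas, steps 1–2 might look as follows; the column names (track_id, key, time_signature, explicit, track_genre) and the toy records are assumptions for illustration, not the actual dataset schema:

```python
import pandas as pd

# Hypothetical track records standing in for the Spotify-API dataset.
D = pd.DataFrame({
    "track_id": ["a1", "a1", "b2", "c3"],
    "danceability": [0.7, 0.7, 0.4, 0.9],
    "key": [0, 0, 5, 11],
    "time_signature": [4, 4, 3, 4],
    "explicit": [True, True, False, False],
    "track_genre": ["rock", "rock", "jazz", "edm"],
})

# Remove duplicate tracks, then drop extraneous attributes (track IDs).
D = D.drop_duplicates(subset="track_id").drop(columns=["track_id"])

# One-hot encode categorical features; convert booleans to 0/1.
D = pd.get_dummies(D, columns=["key", "time_signature"])
D["explicit"] = D["explicit"].astype(int)
```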
3. Split Data: Divide D into training (Xtrain, ytrain)
and test sets (Xtest, ytest) using test split ratio τ.
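Step 3 is a standard holdout split; a sketch with scikit-learn's train_test_split, using toy arrays in place of the real feature matrix (with τ = 0.2, the full 114,000-track dataset yields the ~91,200 / ~22,800 split described above):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for the feature matrix X and genre labels y.
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = rng.integers(0, 4, size=100)

tau = 0.2  # test split ratio; 0.2 reproduces the ~80/20 split above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=tau, random_state=42)
```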
4. Normalize Features: Apply StandardScaler to
numerical features in Xtrain and Xtest.
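A sketch of step 4 with scikit-learn's StandardScaler; the scaler is fit on the training features only and then applied to both sets, so no test-set statistics leak into training (the toy matrices are illustrative):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical numerical feature columns (e.g. danceability, loudness).
X_train = np.array([[0.5, -120.0], [0.7, -5.0], [0.9, -60.0]])
X_test = np.array([[0.6, -30.0]])

# Fit on the training set only, then transform both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```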
5. Consolidate Genres:
o Perform hierarchical clustering on ytrain
and ytest using Ward’s method and
Euclidean distance.