new approach to style transfer that employs CNNs 
(Miah, 2023); Gauri et al. summarized the research on 
using artificial intelligence technologies to filter, 
diagnose, monitor, and disseminate information 
about COVID-19 through human audio signals. 
This overview will help develop automated 
systems to support COVID-19 related efforts to 
utilize non-invasive and user-friendly biosignals in 
human non-verbal and verbal audio (Deshpande, 
2022). One of the most important research directions
is the prediction of music popularity. For example,
HuaFeng et al. developed a model for
predicting song popularity that combines multimodal 
feature fusion with LightGBM. The model consists of 
a LightGBM component, a multimodal feature 
extraction framework and a logistic regression 
component (Zeng, 2022). Notably, Seon et al.
empirically examined how acoustic
features enhance the likelihood of songs reaching the 
top 10 on the Billboard Hot 100, analyzing data from 
6,209 unique songs that appeared on the chart 
between 1998 and 2016, with a particular emphasis 
on acoustic features supplied by Spotify (Kim, 2021).
Bang Dang et al. focus on predicting the rankings of
popular songs for the next six months. The dataset,
used for the Hit Song
Prediction problem in the Zalo AI Challenge 2019, 
includes not only songs but also details like 
composers, artist names, release dates, and more. The 
paper advocates for treating hit song prediction as a 
ranking problem using Gradient Boosting techniques, 
rather than the typical regression or classification 
methods employed in previous studies. The optimal 
model demonstrated strong performance in predicting 
whether a song would become a top-ten dance hit
versus lower-ranked positions (Pham, 2020). 
Building on these advances, this paper employs
AI algorithms for popularity prediction. To this end,
the study uses extensive streaming data, including
official metrics such as Spotify's track play counts
and relevant datasets from Kaggle.
Experimental results validate the effectiveness of the 
proposed methods. 
2 METHODS 
2.1  Dataset Preparation 
The dataset used in this paper is a Spotify songs
dataset containing 114,000 songs together with their
popularity, artists, genre, duration, and other attributes.
These features can be used to predict a song's 
popularity and also to explore how these features 
influence that popularity. Additionally, this study 
conducted an online search for streaming play counts 
and popularity data for singles from 2004 to 2024. To 
account for regional differences, data was collected 
primarily from Spotify, YouTube Music, and QQ 
Music. These datasets were used as another critical 
source of information. Utilizing these datasets, the 
study conducts a regression task to examine the 
relationship between play counts and a song's 
popularity. 
After cleaning the data, this paper selected all
features other than popularity as the independent
variables and popularity as the dependent variable,
since it is the target that this study aims to predict.
In terms of data preprocessing, this paper normalized
the features before training. To properly evaluate the
model's performance, the dataset must be split into
training and testing sets. This paper uses the
"train_test_split" function from the
"sklearn.model_selection" module, allocating 80% of
the data to the training set and 20% to the testing set.
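The preprocessing and split described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the data are synthetic stand-ins, the feature count is arbitrary, and the use of "MinMaxScaler" (fitted on the training set only) is an assumption, since the text does not state which normalization was applied or in what order.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the cleaned dataset (hypothetical values, not the paper's data).
rng = np.random.default_rng(0)
n_songs = 1000
X = rng.normal(size=(n_songs, 5))        # independent variables (all non-popularity features)
y = rng.integers(0, 101, size=n_songs)   # dependent variable: popularity score

# 80% training / 20% testing, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: min-max normalization. The scaler is fitted on the training set only
# and then applied to both sets, so no test-set information leaks into preprocessing.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

Fixing "random_state" makes the split reproducible; the value 42 is arbitrary.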
2.2  Machine Learning Models-Based 
Prediction 
This study selected three models: Random Forest
Regressor (RF), Gradient Boosting Machines (GBM),
and Simple Linear Regression.
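A hedged sketch of how three such models might be trained and compared on the same split is given below. It assumes scikit-learn's "RandomForestRegressor", "GradientBoostingRegressor", and "LinearRegression" as stand-ins; the paper does not state which GBM implementation or which hyperparameters were used, and the data, the MSE and R2 metrics, and all settings here are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data with a mildly non-linear target (hypothetical, not the paper's data).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(600, 4))
y = 50 + 20 * X[:, 0] + 15 * X[:, 1] ** 2 + rng.normal(scale=2.0, size=600)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumed stand-ins for the three model families named in the text.
models = {
    "Linear Regression": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=100, random_state=42),
    "GBM": GradientBoostingRegressor(random_state=42),
}

# Fit each model on the training set and score it on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }
    print(f"{name}: MSE={results[name]['mse']:.2f}, R2={results[name]['r2']:.3f}")
```

Evaluating all models on the same held-out set keeps the comparison fair; the scores printed here reflect only the synthetic data.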
2.2.1 Random Forest 
Firstly, RF, shown in Figure 1, is an ensemble method
that constructs multiple decision trees and merges
their results. It leverages bootstrapping and feature
randomness to enhance model performance and
reduce overfitting. Its methodology centers on
ensemble construction, which generates multiple
decision trees using bootstrap samples from the
training data. Each tree is thus trained on a different
subset of the data, which helps reduce variance
and prevent overfitting. This study chose RF for the
following reasons: 1. it is a powerful ensemble
learning method; 2. it is capable of effectively
handling both linear and non-linear relationships; 3. it
offers robustness against overfitting, especially in
datasets with many features.
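To make the bootstrapping, feature randomness, and merging steps concrete, the following minimal sketch uses scikit-learn's "RandomForestRegressor" on synthetic data. The hyperparameter values ("n_estimators=50", "max_features="sqrt"") are assumptions chosen for illustration, not settings reported in this paper. The final check shows that, for regression, the forest's prediction is the mean of the individual trees' predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic non-linear regression data (hypothetical, not the paper's data).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 6))
y = 40 + 25 * np.sin(3 * X[:, 0]) + 10 * X[:, 1] + rng.normal(scale=1.0, size=500)
X_train, X_test = X[:400], X[400:]
y_train = y[:400]

rf = RandomForestRegressor(
    n_estimators=50,      # number of decision trees in the ensemble (assumed value)
    bootstrap=True,       # each tree is trained on a bootstrap sample of the rows
    max_features="sqrt",  # each split considers a random subset of the features
    random_state=42,
)
rf.fit(X_train, y_train)

# Merging step: the forest prediction is the average of the per-tree predictions.
per_tree = np.stack([tree.predict(X_test) for tree in rf.estimators_])
forest_pred = rf.predict(X_test)
print(np.allclose(forest_pred, per_tree.mean(axis=0)))
```

Averaging many trees, each fitted to a different bootstrap sample, is what reduces the variance of the individual trees.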