Table 1: Metrics of the three models.
Metric   LSTM     XGBoost  RF
R²       0.9548   0.6541   0.9362
MAE      0.0400   0.1009   0.0459
RMSE     0.0509   0.1407   0.0604
Because the data were normalized, the values in the
table above lie between 0 and 1, which allows direct
comparison across metrics and models. However, the
excellent performance of the LSTM model may be
partly attributable to overfitting, given the large
amount of noise in the Bitcoin close prices. This is
nearly unavoidable when fitting cryptocurrency prices
because the market is volatile. Cross-validation could
mitigate this problem by splitting the dataset into
multiple folds and evaluating the model on each
held-out fold.
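The fold-based evaluation mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used in the study. It assumes an expanding-window (walk-forward) split, which is a common choice for time series because shuffled folds would let future prices leak into training. The function names, the naive last-value forecaster, and the sample prices are all hypothetical stand-ins; any of the three models could take the place of `naive_forecast`.

```python
import math

def expanding_window_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on every observation before a cut point and tests on
    the block that follows it, so no future price is seen during training.
    """
    fold_size = n_samples // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The last fold absorbs any leftover observations.
        test_end = n_samples if k == n_folds else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def naive_forecast(train, horizon):
    """Placeholder model (assumption): repeat the last observed training value."""
    return [train[-1]] * horizon

def cross_validate(prices, n_folds=3):
    """Return one (MAE, RMSE) pair per fold."""
    scores = []
    for train_idx, test_idx in expanding_window_splits(len(prices), n_folds):
        train = [prices[i] for i in train_idx]
        test = [prices[i] for i in test_idx]
        pred = naive_forecast(train, len(test))
        scores.append((mae(test, pred), rmse(test, pred)))
    return scores

# Hypothetical normalized close prices, for illustration only.
prices = [0.10, 0.12, 0.15, 0.11, 0.18, 0.22, 0.20, 0.25]
fold_scores = cross_validate(prices, n_folds=3)
```

A large gap between the error on the training portion and the error on the held-out folds would be one sign of the overfitting discussed above.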
Data quality also determines the reliability of the
procedure, which is why data preprocessing is used to
eliminate null values. Hence, it is crucial to work
with a relatively clean dataset, which makes the whole
experimental process more stable and reduces noise to
a certain extent.
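As a rough illustration of the preprocessing described here, the sketch below removes rows with missing values and then rescales the close price to the range 0 to 1. The text states only that null values are eliminated and that normalization is applied; min-max scaling, the row layout, and the helper names `drop_nulls` and `min_max_scale` are assumptions made for this example.

```python
def drop_nulls(rows):
    """Keep only rows in which every field is present (not None, not NaN)."""
    def is_missing(value):
        # NaN is the only float that is not equal to itself.
        return value is None or (isinstance(value, float) and value != value)
    return [row for row in rows if not any(is_missing(v) for v in row.values())]

def min_max_scale(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw records, for illustration only.
raw_rows = [
    {"date": "d1", "close": 100.0},
    {"date": "d2", "close": None},
    {"date": "d3", "close": 150.0},
    {"date": "d4", "close": float("nan")},
    {"date": "d5", "close": 200.0},
]
clean_rows = drop_nulls(raw_rows)
scaled_close = min_max_scale([row["close"] for row in clean_rows])
```

In practice the minimum and maximum would usually be taken from the training portion only, so that the test data do not influence the scaling.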
These results provide valuable insights for
cryptocurrency investors and market analysts, with
LSTM shown experimentally to be the preferred model.
However, there are significant differences between the
simulation and the real trading market: real
transactions are unpredictable and may involve
complexities that the models do not simulate. As a
result, investors should remain rational when they see
a near-perfect alignment between the model output and
the observed prices, especially before placing firm
orders. Risk management and legal issues in the
financial market should also be taken into account.
These caveats aside, the results still have significant
implications for analyzing the cryptocurrency trading
market.
5 DISCUSSIONS
According to the results presented in this study,
researchers should continue to develop new
techniques to improve prediction accuracy. One of the
most prevalent methods is feature engineering, which
incorporates more features relevant to the Bitcoin
price and weights them differently. This includes
adding technical indicators or economic factors to
improve the fit and reduce the error. However, it is
hard to rank the features in the dataset by importance,
and the dynamic trading market may change the
importance of each feature, which requires
practitioners to update their feature sets regularly.
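To make the idea of adding technical indicators concrete, the following sketch derives two simple features from a close-price series and pairs them with the next close price as the prediction target. The choice of a trailing moving average and a one-step return, the window length, and every name in the code are assumptions for illustration; the study does not specify which indicators would be used.

```python
def simple_moving_average(prices, window):
    """Trailing mean of the last `window` prices; None until enough history exists."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out

def simple_returns(prices):
    """One-step relative price change; None for the first observation."""
    out = [None]
    for prev, cur in zip(prices, prices[1:]):
        out.append((cur - prev) / prev)
    return out

def build_features(prices, window=3):
    """Build one row per time step: features known at t, target is the close at t + 1."""
    sma = simple_moving_average(prices, window)
    ret = simple_returns(prices)
    rows = []
    for t in range(len(prices) - 1):
        if sma[t] is None or ret[t] is None:
            continue  # not enough history to compute every feature
        rows.append({
            "close": prices[t],
            "sma": sma[t],
            "return": ret[t],
            "target": prices[t + 1],
        })
    return rows

# Hypothetical close prices, for illustration only.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
features = build_features(closes, window=3)
```

Every feature in a row uses only prices up to time t, which keeps the target at t + 1 out of the inputs.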
Another way to improve precision is to incorporate
deep learning algorithms such as Convolutional
Neural Networks (CNNs) or Generative Adversarial
Networks (GANs) to model the data more precisely
(Kattenborn et al., 2021). Because they introduce
non-linearity, the activation functions inherent in
CNNs enhance the versatility of the networks and their
capability to model a broad range of complicated
real-world tasks (Krichen, 2023).
Cross-sectional prediction is an alternative approach
to predicting cryptocurrency prices: instead of
predicting the close price of the target currency
directly, it analyzes the market variables of the
currency at a specific moment (Hanauer & Kalsbach,
2023). This method could limit the effect of outliers
and account for differences in the characteristics of
the targets (Hanauer & Kalsbach, 2023). Prediction
accuracy and profitability can then be enhanced by
combining the various factors non-linearly through
deep learning techniques, rather than relying merely
on linear regression (Abe & Nakagawa, 2020).
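One way a cross-sectional view can dampen outliers is to replace each raw market variable with its rank among all assets at the same moment. The sketch below illustrates that idea only; it is an assumption made for this example and is not taken from the cited works. The asset tickers, the values, and the function name `cross_sectional_ranks` are hypothetical, and ties are broken by sort order rather than averaged.

```python
def cross_sectional_ranks(snapshot):
    """Map each asset's raw value to a rank scaled to [0, 1] within one time step.

    `snapshot` maps an asset name to the value of one market variable
    observed at a single moment.
    """
    ordered = sorted(snapshot, key=snapshot.get)
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 0.5}
    return {asset: i / (n - 1) for i, asset in enumerate(ordered)}

# Hypothetical one-day returns for four assets; "CCC" is an extreme outlier.
snapshot = {"AAA": 0.02, "BBB": -0.01, "CCC": 5.00, "DDD": 0.03}
ranks = cross_sectional_ranks(snapshot)
```

After ranking, the outlier is simply the highest-ranked asset, so its extreme magnitude no longer dominates whatever model consumes the feature.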
6 CONCLUSIONS
In this study, the Bitcoin price is predicted with LSTM,
XGBoost, and RF models. The close price was selected
as the target variable, and a specific period was
chosen from the available data. The three models then
carried out the task with the differing performance
reported above. Finally, they were assessed with
multiple metrics and graphs, which show that the LSTM
model has the best overall quality. More generally,
researchers have devoted considerable effort in recent
years to improving methods for predicting
cryptocurrency prices, which reflects the importance
of accurate forecasting in both commercial and
scientific settings. Other methods and refinements
should be explored to improve the practical capability
of the models and to achieve more reliable predictions
in the dynamic trading market.