
the experimental setup. The last section contains the
overview and recommendations for future work.
2 RESEARCH AND LITERATURE REVIEW
2.1 Research Premise
At the Central Economic Work Conference held in
Beijing from December 14 to 16, 2016, the "house is
for living in, not for speculation" policy was proposed,
clarifying the residential attributes of China's real
estate and aiming to curb speculative property
purchases. At that time, China's real estate market
was running out of control, with many speculators
buying property indiscriminately.
The proposal of this policy laid the foundation for the
formulation of subsequent real estate control policies.
Measures such as purchase restrictions, lending
restrictions, and price caps are among the regulations
that the central and local governments have
implemented since 2016 to control and regulate
housing prices nationwide. Many cities have steadily
adopted purchase and loan restrictions over time.
The deleveraging and monetized resettlement aspects
of shantytown redevelopment (shed reform) have also
been carried out. In 2020, the People's Bank of China and
the Ministry of Housing and Urban-Rural
Development jointly launched the "three red lines"
policy. These regulations, which were introduced
gradually, have curbed overheating in the real estate
markets of major cities. Beyond these policy measures,
China's economy has shifted from rapid to moderate
growth since 2016. As a result, market demand for
real estate has weakened.
However, a mismatch between the structure of supply
and demand has allowed prices in several first- and
second-tier cities to keep rising. In contrast, prices in
third- and fourth-tier cities have declined due to
oversupply. Existing research on Chinese housing
price forecasting often relies on a single model,
ignoring the possibility of multi-model comparison
and integration, which results in insufficient
forecasting accuracy. In addition, most existing
studies focus on a single city or province and do not
fully consider how housing price changes differ
across regions of the country. Therefore, this study
focuses on housing price changes in different regions
of China and comprehensively considers the
differences between regions in its analysis and
forecasts. To improve
the accuracy and reliability of home price estimates,
the research combines GDP and housing price data
from multiple Chinese regions with a variety of
machine learning models. Four models - XGBoost,
SVR, MLP, and LSTM - are compared in order to
identify which performs best in predicting the trend
of property prices in various regions over the
following few years. The findings of this study may
be useful in regional planning, real estate investment,
and policy formulation. They may also help improve
national housing price forecasts.
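The multi-model comparison described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the price series is made up, and the two stand-in forecasters (`naive_last_value` and `linear_trend`) are hypothetical placeholders for XGBoost, SVR, MLP, and LSTM, which would be wrapped behind the same `forecast(train, horizon)` interface. The helper names `rmse`, `mape`, and `compare_models`, and the use of RMSE as the selection criterion, are likewise assumptions made for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def naive_last_value(train, horizon):
    """Stand-in model: repeat the last observed value."""
    return [train[-1]] * horizon


def linear_trend(train, horizon):
    """Stand-in model: least-squares line over the time index, extrapolated."""
    n = len(train)
    x_mean = (n - 1) / 2
    y_mean = sum(train) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(train))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


def compare_models(series, models, horizon):
    """Hold out the last `horizon` points, score each model, pick the lowest RMSE."""
    train, test = series[:-horizon], series[-horizon:]
    scores = {}
    for name, forecast in models.items():
        predicted = forecast(train, horizon)
        scores[name] = {"rmse": rmse(test, predicted), "mape": mape(test, predicted)}
    best = min(scores, key=lambda name: scores[name]["rmse"])
    return scores, best


# Made-up price index for a single hypothetical region (not data from the study).
prices = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0, 114.0]
models = {"naive_last_value": naive_last_value, "linear_trend": linear_trend}
scores, best = compare_models(prices, models, horizon=2)
print(best, scores)
```

Under these assumptions, the same `compare_models` call would be repeated per region, so that the best-performing model can differ from one region to another.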
2.2 Literature Review
Scholars have produced many notable results in
machine learning-based housing price prediction.
The suitability of the random forest model
for predicting housing prices was confirmed by
Adetunji et al. (2020). According to Chen et al.
(2021), Bayes, support vector machines, and
backpropagation neural networks are better options
for predicting home prices. According to Goel et al.
(2023), the LSSVM model outperforms SVM, CNN,
and other models in predicting home prices.
Henriksson and Werlinder (2021) found that
random forests, which take a long time to train and
infer, performed worse than XGBoost on both small
and large data sets. After analysing and contrasting the SVM,
random forest, and GBM algorithms, Ho et al. (2020)
came to the conclusion that while SVM can yield
remarkably accurate predictions, random forest and
GBM perform better. The study by Hong et al. (2020)
shows that the random forest model can fully capture
the complexity and nonlinearity of the actual housing
market: the average percentage deviation between the
predicted and actual market prices is only 5.5%, and
the probability that the predicted price is within 5% of
the actual market price is 72%. In order to create a home
price evaluation and prediction model based on
factors influencing prices, Manasa et al. (2020) drew
on multiple linear regression, Lasso regression, Ridge
regression, support vector machine, and XGBoost
models. By comparing the models' errors, they were
able to choose the best model. Ming et al.
(2020) investigated rent prices in Chengdu, China,
using three machine learning models, and discovered
that XGBoost was the most accurate in forecasting
them. According to Sheng and Yu (2020), the
comprehensive learning algorithm predicts more
accurately than a linear regression model, and the
ECAI 2024 - International Conference on E-commerce and Artificial Intelligence