Figure 2 shows that three factors (bathrooms, sqft-lot,
and city) together account for 72.87% of the total
feature weight, so they have the greatest impact on
housing prices. The other four variables (bedrooms,
floors, condition, and sqft-basement) share the
remaining 27.13% and therefore have comparatively
little impact on house prices.
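The combined share quoted above can be reproduced from any set of feature weights with a few lines of Python. The following is a minimal sketch, not the code used in this study: the function name `top_k_share` is hypothetical, and the weights in `example_weights` are placeholders chosen for illustration only, since the individual values behind Figure 2 are not listed in the text.

```python
def top_k_share(importances, k):
    """Return the k highest-weighted features and their combined share of the total."""
    total = sum(importances.values())
    # Rank features from largest to smallest weight.
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:k]
    share = sum(weight for _, weight in top) / total
    return [name for name, _ in top], share


# Placeholder weights for illustration only; these are NOT the values from Figure 2.
example_weights = {
    "bathrooms": 0.30,
    "sqft-lot": 0.25,
    "city": 0.20,
    "bedrooms": 0.10,
    "floors": 0.06,
    "condition": 0.05,
    "sqft-basement": 0.04,
}

top_features, share = top_k_share(example_weights, k=3)
print(top_features, f"{share:.2%}")
```

Dividing by the total keeps the result a proportion even when the weights do not sum exactly to one.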
Table 4 shows the accuracy of the model on the
training and test sets. The R-squared of the fitted
random forest model is 0.921 on the training set and
0.497 on the test set.
Table 4: Results of Random Forest.
Training Accuracy    Test Accuracy
0.921                0.497
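For reference, the R-squared reported for the training and test sets is the usual coefficient of determination, 1 - SS_res / SS_tot. The sketch below shows one way to compute it and to quantify the gap between the two reported values. It is an illustration under stated assumptions, not the implementation used in this study: the helper name `r_squared` is hypothetical, and only the two numbers 0.921 and 0.497 are taken from the text.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the observed values around their mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# R-squared values reported in the text for the random forest model.
train_r2 = 0.921
test_r2 = 0.497
gap = train_r2 - test_r2  # difference between training and test fit
print(f"train-test gap: {gap:.3f}")
```

A gap of this size indicates that the model fits the training samples more closely than it fits unseen data, which is worth keeping in mind when reading the test-set result.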
4 CONCLUSION
Overall, this study selected 1500 samples from the
4120 records in the data set, which has 7 variables, and
compared the performance of two models, Multiple
Linear Regression and Random Forest, in explaining
house prices. Based on the multiple linear regression,
it was concluded that sqft-lot was an insignificant
factor. Therefore, a stepwise regression model was
further used to remove insignificant variables, which
gave effective and accurate results. The second model,
random forest, produced the Feature Weight Graph,
which gives an intuitive view of the proportion each
variable contributes to housing prices. Comparing the
R-squared values of the two methods, the random
forest has the larger value, so it explains a greater
share of the variation in housing prices with the seven
factors. These results show that a variety of factors
influence how well different models perform. In future
investigations, researchers should first identify the
key characteristics of the data and then choose the
most suitable model.
With this research, people can have more
references when choosing their ideal house and an
approximate idea of how much houses will cost.
However, most of the data in this study are from 2014,
so the results are limited in time, and the sample size
is small. Adding control variables and using more
recent data might improve this analysis.