An Empirical Study on Chinese Commercial Banks Based on Text Mining and a Two-Tier DEA Framework
Jiaye Fu
2025
Abstract
In the context of the digital economy, how data factors affect bank efficiency has become a core issue of financial reform. This study takes five major commercial banks in China over 2019-2023 as its sample and quantifies their digitalization by constructing a Data Element Ecosystem Index (DEI). The index is built by extracting the frequencies of 50 keywords from annual reports using text mining, then combining policy-oriented weight allocation with an improved TF-IDF algorithm to form a standardized evaluation system. A two-layer analysis framework is then applied. First, a three-stage DEA-Malmquist model is used to decompose total factor productivity. DEI is found to be significantly positively correlated with asset scale and net profit and negatively correlated with the non-performing loan ratio, a pattern that supports the promoting effect of data factors on efficiency. Second, a dynamic panel threshold model is used to test for nonlinear effects, and the results indicate a "scale threshold" for data element accumulation. The study also finds that the application of data elements differs markedly among large state-owned banks: for example, the Bank of China's average DEI (51.48) far exceeds that of its peers, and its word frequency for blockchain technology (56 occurrences in 2021) highlights strategic differences. The study provides an empirical basis for optimizing the allocation of data elements. It suggests that regulators implement differentiated policies, strengthen data governance compliance requirements for banks with low DEI, and stimulate core technology research and development through market-oriented mechanisms, so as to promote a step change in banking efficiency.
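The index construction described in the abstract (keyword frequencies, TF-IDF, policy-oriented weights, standardization) can be illustrated with a minimal sketch. The abstract does not specify the paper's "improved TF-IDF" or its weights, so everything here is an assumption: the function names, the smoothed IDF formula, the per-keyword `weights` dictionary, and the min-max scaling to a 0-100 range are all hypothetical stand-ins, not the author's method.

```python
import math


def keyword_counts(text, keywords):
    """Count case-insensitive substring occurrences of each keyword."""
    lowered = text.lower()
    return {kw: lowered.count(kw.lower()) for kw in keywords}


def weighted_tfidf_scores(docs, keywords, weights):
    """Return one raw score per document: sum of weight * tf * idf.

    `weights` is a hypothetical stand-in for policy-oriented weights;
    keywords missing from it default to a weight of 1.0.
    """
    n_docs = len(docs)
    counts = [keyword_counts(d, keywords) for d in docs]

    # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1
    idf = {}
    for kw in keywords:
        df = sum(1 for c in counts if c[kw] > 0)
        idf[kw] = math.log((1 + n_docs) / (1 + df)) + 1

    scores = []
    for doc, c in zip(docs, counts):
        n_tokens = max(len(doc.split()), 1)  # avoid division by zero
        score = sum(
            weights.get(kw, 1.0) * (c[kw] / n_tokens) * idf[kw]
            for kw in keywords
        )
        scores.append(score)
    return scores


def min_max_scale(values, lo=0.0, hi=100.0):
    """Rescale values linearly to [lo, hi]; constant input maps to lo."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]
```

Under these assumptions, each annual report would be passed as one string in `docs`, and `min_max_scale(weighted_tfidf_scores(docs, keywords, weights))` would give a comparable index value per report. The resulting numbers depend entirely on the assumed weights and scaling, so they are not expected to reproduce the DEI values reported in the paper.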
Paper Citation
in Harvard Style
Fu J. (2025). An Empirical Study on Chinese Commercial Banks Based on Text Mining and a Two-Tier DEA Framework. In Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE; ISBN 978-989-758-765-8, SciTePress, pages 532-537. DOI: 10.5220/0013701500004670
in Bibtex Style
@conference{icdse25,
author={Jiaye Fu},
title={An Empirical Study on Chinese Commercial Banks Based on Text Mining and a Two-Tier DEA Framework},
booktitle={Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE},
year={2025},
pages={532-537},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013701500004670},
isbn={978-989-758-765-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE
TI - An Empirical Study on Chinese Commercial Banks Based on Text Mining and a Two-Tier DEA Framework
SN - 978-989-758-765-8
AU - Fu J.
PY - 2025
SP - 532
EP - 537
DO - 10.5220/0013701500004670
PB - SciTePress