Research on the Path and Effectiveness of Large Model Empowering Tax Collection and Administration

Siyu Chen

School of Public Finance and Taxation, Zhongnan University of Economics and Law, Wuhan, 430073, P.R. China

Keywords: Large Model, Tax Collection and Administration, Empowering Path, Collection and Administration Effectiveness.

Abstract: The ongoing wave of digitalization has produced an unprecedented surge in both the volume and complexity of tax data, presenting significant challenges to conventional tax administration models. This research explores the potential of large models as a transformative tool for tax governance. We examine the technical capabilities of large models and their application in critical areas such as data processing, risk stratification, and the development of early warning systems, and we investigate their utility in enhancing taxpayer services. Integrating large models into tax operations promises to substantially improve the efficiency of tax collection and administration, strengthen the data-driven foundation of decision-making, and raise taxpayer service satisfaction. The adoption of large models therefore represents a pivotal step towards the modernization and optimization of tax collection and administration. Tax administration departments should continuously upgrade their technological infrastructure, strengthen data management, and improve tax personnel's understanding of large models and their ability to apply them, so as to broaden the application scenarios of large models and achieve a qualitative leap in tax administration.

1 INTRODUCTION

1.1 Research Background and Significance

Digital transformation presents significant challenges for tax administration. Increasingly frequent economic activity and deepening digitalization are driving rapid growth in tax data, which now encompasses a vast array of taxpayer information. Consequently, tax authorities must address the effective collection, organization, analysis, and utilization of this data. Concurrently, the complexity of tax operations is escalating as novel business models and emerging industries appear, rendering traditional tax administration approaches inadequate.

The advent of large model technology represents a significant advancement in artificial intelligence and offers new opportunities for the modernization of tax administration. Large models possess robust capabilities in language comprehension, generation, and knowledge inference, enabling in-depth analysis of extensive tax datasets and the extraction of insights from them. This facilitates the precise profiling of taxpayers and the proactive identification of potential risks. Furthermore, through natural language processing, large models can power intelligent advisory services and automate tax-related processes, thereby enhancing the efficiency and quality of taxpayer services. Consequently, investigating the application of large models in tax administration holds substantial theoretical and practical significance.

1.2 Literature Review

As large model technology has spread across various domains, its exploration in taxation has intensified. Numerous scholars are now directing their attention towards integrating artificial intelligence technologies, such as those based on natural language processing, into tax administration operations.

In the realm of tax administration digitalization, research indicates a significant role for artificial intelligence (AI). Ni et al. (2021) posit that AI aids the digital transformation of tax management. Yang and Yu (2023) suggest that applying large language model (LLM) technology can enhance intelligent auditing, services, and decision-making, thereby improving the efficiency of tax operations. The Shenzhen Municipal Tax Bureau Research Group (2023) has explored and proposed typical application scenarios for AIGC products in the tax domain. Chen and Sun (2024) investigated the application of LLMs in tax supervision, achieving intelligent reading of non-resident tax contracts, review of foreign payment filings, and risk feedback functionalities. Zhou et al. (2025) explore the use of multimodal large language models (MM-LLMs) in tax collection and administration, arguing that they can significantly improve the accuracy of tax data analysis and prediction, the level of tax risk identification and management, and the quality of tax consultation and services. Chen (2024) found that EU member states have been early adopters of AI in tax administration, focusing on tax risk management and accelerating the development of taxpayer services. South Korean researchers Gu et al. (2022) proposed the KTL-BERT pre-trained language model, capable of automatically classifying five types of tax issues. Zhong et al. (2024) introduced an LLM for intelligent tax decision-making that integrates LLMs with domain-specific databases and knowledge bases to enhance the professionalism and timeliness of model responses. Rahman et al. (2024) explore how AI technologies optimize tax compliance, fraud detection, and tax administration in modern tax management. Di Marzo Serugendo et al. (2024) propose AI-driven intelligent document management systems to streamline tax document management, with LLMs further enhancing optimization.

Chen, S.

Research on the Path and Effectiveness of Large Model Empowering Tax Collection and Administration.

DOI: 10.5220/0013842300004719

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 2nd International Conference on E-commerce and Modern Logistics (ICEML 2025), pages 249-254

ISBN: 978-989-758-775-7

2 OVERVIEW OF LARGE MODEL TECHNOLOGY

2.1 Fundamental Concepts and Principles of Large Models

Large models, also referred to as foundation models, are characterized by very large parameter counts and intricate architectures, enabling them to process vast datasets and execute complex tasks across domains such as natural language processing, computer vision, and speech recognition. The Transformer architecture is a core building block of large models. It departs from conventional recurrent neural network structures and relies on self-attention mechanisms, which allow a model to capture dependencies between different positions in sequential data directly. Because all positions can be processed in parallel rather than one step at a time, this design significantly improves computational efficiency and parallel processing capability.
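The self-attention idea can be illustrated with a minimal sketch. It assumes identity query, key, and value projections and a toy two-dimensional input; a real Transformer learns these projections and stacks many such layers, so this is an illustration of the mechanism rather than an implementation of any production model.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Scaled dot-product self-attention over a list of token vectors.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections), which keeps the sketch short.
    """
    d = len(X[0])
    outputs, all_weights = [], []
    for query in X:
        # Compare this position with every position in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in X]
        weights = softmax(scores)
        all_weights.append(weights)
        # The output is a weighted mix of all positions' vectors.
        outputs.append([sum(w * X[j][c] for j, w in enumerate(weights)) for c in range(d)])
    return outputs, all_weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy embeddings
outputs, weights = self_attention(tokens)
```

Each row of `weights` shows how strongly one position attends to every other position, and no row depends on the result of another, which is why the computation parallelizes well.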

2.2 Common Types and Characteristics of Large Models

2.2.1 Large Language Model

Large language models (LLMs) are a category of large models within the field of natural language processing (NLP), typically employed for processing textual data and understanding natural language. Their primary characteristic is training on extensive corpora to learn the grammatical, semantic, and contextual rules of natural language; Wenxin Yiyan is one example. In tax administration, LLMs can be used to comprehend and process tax policy documents, taxpayer inquiry texts, and similar material. By learning from a vast amount of tax-related text, these models can accurately extract key information, respond to taxpayer inquiries, and assist tax personnel in policy interpretation and administrative decision-making.

2.2.2 Vision Large Model

Vision large models (VLMs) are large models in the computer vision (CV) domain, used primarily for image processing and analysis. Trained on extensive image datasets, they support visual tasks such as image classification, object detection, and facial recognition; INTERN (SenseTime) is one example. In tax administration, the direct application of VLMs is limited, yet they have potential in scenarios such as invoice image recognition, where they can extract key information to aid tax auditing.

2.2.3 Multimodal Large Language Model

Multimodal large language models (MM-LLMs) are sophisticated models capable of processing diverse data types, including text, images, and audio. They integrate natural language processing (NLP) and CV capabilities to understand and analyze multimodal information comprehensively, which enables a more holistic treatment of complex data. For instance, intelligent customer service systems can handle text and voice data simultaneously, enabling more natural interactions. Within tax administration, MM-LLMs can integrate taxpayers' textual declaration information with associated image and video evidence, facilitating more comprehensive administrative analysis. However, the application of MM-LLMs in this domain is still in the exploratory phase.

3 EMPOWERING PATH OF LARGE MODELS IN TAX ADMINISTRATION

3.1 Data Processing and Analysis Path

In tax administration, data sources are extensive and structurally complex. Large models can efficiently clean, integrate, and analyze multi-source heterogeneous tax data, as illustrated in Figure 1.

First, when tax data contains format errors, duplicate entries, missing values, or outliers, large models can apply natural language processing and machine learning techniques for intelligent data cleansing. They can analyze data patterns to populate missing values automatically or eliminate redundant records based on logical relationships in the data, thereby enhancing data quality and consistency. Second, large models facilitate data integration by overcoming barriers between diverse tax data sources. Tax data originates from various systems, including tax filing systems, invoice management systems, and third-party data platforms (e.g., banks and market regulatory authorities), each with its own formats and structures. Large models use their semantic understanding capabilities to standardize data from different sources and link records that refer to the same entities (e.g., taxpayers or tax types). Finally, by analyzing the integrated data in depth, large models can uncover hidden correlations, providing a comprehensive data foundation for subsequent analyses.
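The cleansing and integration steps described above can be sketched in a few lines. The sketch is a rule-based stand-in, not a large model: the field names (`taxpayer_id`, `period`, `amount`), the sample records, the median fill, and the "more than five times the median" outlier rule are all assumptions chosen for illustration.

```python
from statistics import median


def normalize_id(raw):
    """Standardize a taxpayer ID so the same entity matches across sources."""
    return raw.replace("-", "").replace(" ", "").upper()


def clean(records):
    # Step 1: standardize IDs and drop duplicates (same ID and filing period).
    seen, unique = set(), []
    for record in records:
        key = (normalize_id(record["taxpayer_id"]), record["period"])
        if key in seen:
            continue
        seen.add(key)
        unique.append({**record, "taxpayer_id": key[0]})
    # Step 2: fill missing amounts with the median of the known amounts.
    fill = median(r["amount"] for r in unique if r["amount"] is not None)
    for record in unique:
        record["imputed"] = record["amount"] is None
        if record["imputed"]:
            record["amount"] = fill
    # Step 3: flag outliers (assumed rule: more than 5x the median).
    for record in unique:
        record["outlier"] = record["amount"] > 5 * fill
    return unique


# Hypothetical records from two source systems with inconsistent ID formats.
filing_system = [
    {"taxpayer_id": "91-4201 abc", "period": "2024Q1", "amount": 100.0},
    {"taxpayer_id": "914201ABC", "period": "2024Q1", "amount": 100.0},
    {"taxpayer_id": "914201XYZ", "period": "2024Q1", "amount": None},
]
invoice_system = [
    {"taxpayer_id": "914201def", "period": "2024Q1", "amount": 120.0},
    {"taxpayer_id": "914201GHI", "period": "2024Q1", "amount": 9000.0},
]
cleaned = clean(filing_system + invoice_system)
```

In a large-model pipeline, the fixed rules in `normalize_id` and the outlier test would be replaced by learned semantic matching and pattern analysis; the order of the steps is what the sketch is meant to convey.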

In its digital transformation, HM Revenue and Customs (HMRC) in the United Kingdom has proactively integrated large model technology into tax administration to enhance efficiency and accuracy. Since 2015, HMRC has established digital tax accounts, consolidating years of tax filing data, taxpayer historical records, and relevant tax regulations to construct a comprehensive dataset for large model training. In personal income tax return audits, the previous manual review of large numbers of filings was time-consuming and prone to errors. Large models enable rapid processing of this filing data. They analyze taxpayer income, deductions, and tax reliefs to identify potential errors or anomalies in filings automatically and alert tax officials to cases that need focused attention.

Figure 1: Pathway of Large Models Empowering Data Processing and Analysis.

3.2 Risk Identification and Early Warning Path

Following data processing and analysis, large models can be used to construct tax risk assessment models that monitor business operations and tax data in real time and issue precise risk alerts. This capability provides robust support for tax administration, as illustrated in Figure 2.

Large models automatically select and combine feature variables related to tax risk from the processed and analyzed data, constructing an effective system of risk assessment indicators. Tax authorities transmit the latest business operation and tax data of enterprises to the large model risk assessment system in real time. The large model then calculates each enterprise's risk score using the trained risk assessment model. When the risk score exceeds a predefined threshold, an automatic alert mechanism is triggered. The system generates a detailed risk alert report specifying the risk type, severity, and the specific business activities and transactions potentially involved, providing tax personnel with precise risk leads.
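The score-then-alert flow can be sketched as follows. The indicator names, weights, bias, and thresholds are hypothetical values chosen for illustration, and a simple logistic score stands in for whatever trained risk assessment model a tax authority would actually deploy.

```python
import math

# Hypothetical indicator weights; a trained model would supply these.
WEIGHTS = {"fee_to_market_ratio": 1.2, "loss_years": 0.8, "related_party_share": 2.0}
BIAS = -4.0
ALERT_THRESHOLD = 0.7  # assumed predefined threshold
HIGH_SEVERITY = 0.9    # assumed cut-off between medium and high severity


def risk_score(features):
    """Map indicator values to a score between 0 and 1 (logistic function)."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def assess(enterprise_id, features):
    """Return an alert report if the score exceeds the threshold, else None."""
    score = risk_score(features)
    if score <= ALERT_THRESHOLD:
        return None
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    return {
        "enterprise_id": enterprise_id,
        "score": round(score, 3),
        "severity": "high" if score > HIGH_SEVERITY else "medium",
        # The indicator contributing most to the score gives auditors a lead.
        "main_driver": max(contributions, key=contributions.get),
    }


quiet = assess("E001", {"fee_to_market_ratio": 1.0, "loss_years": 0, "related_party_share": 0.1})
flagged = assess("E002", {"fee_to_market_ratio": 3.0, "loss_years": 2, "related_party_share": 0.8})
```

Here `quiet` is `None` because its score stays below the threshold, while `flagged` is a report whose `main_driver` field points to the fee-to-market ratio, mirroring the kind of risk lead described in the text.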

Figure 2: Pathway of Large Models Empowering Tax Risk Identification and Alert Systems.

Currently, the Hubei Provincial Tax Service is piloting the integration of large models into outbound payment management. While processing an outbound payment filing for one enterprise, the large model swiftly analyzed the contract documentation and identified that the stipulated technical service fees significantly exceeded prevailing market rates and that the payment methods were anomalous. Following risk stratification, the case was classified as high-risk and an alert was issued immediately. On receiving the alert, tax administrators promptly initiated an investigation, which revealed that the enterprise was evading tax by artificially inflating technical service fees to transfer profits offshore. This intervention prevented a loss of tax revenue and averted the associated tax risk.

3.3 Optimizing Taxpayer Service Pathways

Large models can transform taxpayer services by enabling intelligent consultation and personalized service delivery, thereby significantly enhancing taxpayer satisfaction. The "Smart Home Tax" system implemented by the Korean National Tax Service serves as a prime example. Leveraging large model technology, this system integrates data collection and analysis, natural language processing, personalized service customization, and continuous learning and optimization. It provides taxpayers with tailored tax advisory services, setting a significant precedent for optimizing taxpayer services and demonstrating the substantial potential of large models in this domain.

As illustrated in Figure 3, large models construct extensive tax knowledge graphs by integrating diverse tax policies, procedural guidelines, and frequently asked questions. When taxpayers submit inquiries through online platforms or mobile applications, the large model uses natural language processing (NLP) to interpret the query's intent and semantics. For instance, if a taxpayer asks, "What are the latest income tax incentives for small and micro-enterprises?", the large model can promptly retrieve the relevant policy information and respond in accessible language, potentially including details on eligibility criteria and filing procedures. Furthermore, through in-depth analysis of taxpayer data, large models offer personalized tax advisory services. For newly registered businesses, the large model proactively provides information on initial tax registration processes and applicable tax incentives based on industry and operational scale, helping them understand tax-related matters quickly. The large model also refines its recommendations based on taxpayers' past inquiries and behavioral preferences. To ensure the accuracy and effectiveness of tax advisory services, tax authorities regularly update the large model with new tax policies, legislative revisions, and practical case studies, enabling continuous learning and adaptation.
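The query-to-answer step can be illustrated with a deliberately simple retrieval sketch. Keyword overlap (Jaccard similarity) stands in for the semantic matching a large model would perform, and the knowledge base entries, placeholder answers, and stopword list are hypothetical.

```python
import re

# Hypothetical knowledge base; answers are placeholders, not real policy text.
KNOWLEDGE_BASE = [
    {"topic": "income tax incentives for small and micro-enterprises",
     "answer": "(placeholder) eligibility criteria and filing procedure"},
    {"topic": "initial tax registration for newly registered businesses",
     "answer": "(placeholder) registration checklist"},
    {"topic": "invoice issuance procedure",
     "answer": "(placeholder) invoice guidance"},
]
STOPWORDS = {"what", "are", "the", "for", "and", "is", "how", "do", "i", "a", "of", "to"}


def tokens(text):
    """Lower-case the text and keep its content words as a set."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def retrieve(query):
    """Return the entry whose topic overlaps most with the query, or None."""
    query_words = tokens(query)
    best, best_score = None, 0.0
    for entry in KNOWLEDGE_BASE:
        topic_words = tokens(entry["topic"])
        score = len(query_words & topic_words) / len(query_words | topic_words)
        if score > best_score:
            best, best_score = entry, score
    return best


hit = retrieve("What are the latest income tax incentives for small and micro-enterprises?")
miss = retrieve("zzz")
```

A query with no overlapping content words returns `None`, which is the point at which a real service would fall back to a human agent or ask the taxpayer to rephrase.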

Figure 3: Pathways for Optimizing Tax Service Delivery through Large Models.

4 OPTIMIZATION OF TAX ADMINISTRATION EFFICIENCY THROUGH LARGE MODELS

4.1 Enhancing Administrative Efficiency

Traditional tax administration processes involve the collection, organization, and analysis of vast datasets. Manual processing is not only time-consuming and labor-intensive but also prone to errors. Large model technology facilitates automation and intelligent support across the full process. In the data acquisition phase, large models can integrate with various governmental systems and enterprise financial software to acquire taxpayers' operational data, financial statements, and invoice information in real time and in batches, thereby eliminating data silos and preventing redundant data entry. During the selection of tax audit cases, large models employ machine learning algorithms to rapidly scan and filter tens of thousands of enterprise data points, accurately identifying high-risk entities. This significantly reduces case selection time and enables auditors to conduct targeted investigations. In the tax collection phase, automated calculations and intelligent comparisons help ensure the accuracy of tax calculations and expedite the collection process, leading to a substantial increase in administrative efficiency.

4.2 Strengthening the Scientific Basis of Decision-Making

The formulation of tax policies and the allocation of tax administration resources require precise data analysis and forecasting. Large models, with their robust data mining and deep analytical capabilities, can provide a solid foundation for decision-making. By comprehensively analyzing macroeconomic data, industry trends, and historical tax data, large models can forecast the impacts of various tax policy adjustments on fiscal revenue, economic growth, and enterprise development, thereby assisting tax authorities in formulating scientifically sound tax policies. For the allocation of tax administration resources, large models analyze the distribution of tax risks across different regions, industries, and enterprise sizes. This analysis gives tax authorities quantitative evidence for allocating human and material resources rationally, thereby avoiding waste and enhancing overall tax administration efficiency.
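One simple way to turn a risk distribution into a staffing plan is proportional allocation. The sketch below assumes that staff are assigned in proportion to the number of flagged cases per industry, with leftover posts going to the largest fractional remainders; the industry names and counts are hypothetical, and the rule itself is an illustrative assumption rather than a method described in the literature reviewed here.

```python
def allocate(total_staff, flagged_cases):
    """Split total_staff across groups in proportion to flagged case counts."""
    total_cases = sum(flagged_cases.values())
    exact = {g: total_staff * n / total_cases for g, n in flagged_cases.items()}
    plan = {g: int(share) for g, share in exact.items()}  # round down first
    leftover = total_staff - sum(plan.values())
    # Hand remaining posts to the groups with the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda g: exact[g] - plan[g], reverse=True)
    for group in by_remainder[:leftover]:
        plan[group] += 1
    return plan


# Hypothetical counts of high-risk cases identified per industry.
plan = allocate(12, {"manufacturing": 45, "retail": 30, "services": 25})
```

With these inputs the exact shares are 5.4, 3.6, and 3.0, so the one post left after rounding down goes to retail.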

4.3 Enhancing Taxpayer Service Satisfaction

Taxpayer service is a critical component of tax administration, and large models offer significant advantages in optimizing it. Using natural language processing technology, large models can function as intelligent virtual assistants that provide 24/7 support to taxpayers. These systems can address a wide array of tax-related inquiries, including policy interpretations, procedural guidance, and assistance with invoice generation, delivering prompt and accurate responses and thereby minimizing taxpayer wait times. Furthermore, large models facilitate personalized services by analyzing taxpayer data such as industry specifics, operational scale, and tax compliance history. This enables the tailored dissemination of relevant tax incentives and procedural reminders, helping taxpayers make full use of policy benefits and reduce tax liabilities, and improving overall satisfaction. This approach fosters a positive tax administration environment.

5 CONCLUSION

In conclusion, this study provides an in-depth analysis of the pathways and effectiveness of large models in empowering tax administration. With their robust language comprehension, learning, and data processing capabilities, large models offer novel approaches to tax administration in areas such as data processing, risk assessment, and the optimization of taxpayer services. This leads to significant improvements in administrative efficiency, the scientific basis of decision-making, and taxpayer service satisfaction, which is of great significance for promoting the intelligent and modern development of tax administration. Tax administration departments should continuously upgrade their technical infrastructure, strengthen data management, improve tax personnel's understanding of large models and their ability to apply them, and expand the application scenarios of large models, so as to achieve a qualitative leap in tax administration.

However, this study has certain limitations. It does not discuss the potential risks of applying large models in tax administration or the corresponding countermeasures, and it offers limited examination of specific practical cases of large model integration in different countries.

Looking ahead, with continuous technological innovation, the application prospects of large models in tax administration will become even broader. We anticipate more diversified research and practice that further explore the potential of large models, improve the tax administration system, and contribute to the development of taxation.

REFERENCES

Chen, F., Sun, K. 2024. Application of Natural Language Large Models in Tax Supervision: Taking the Review of Non-Resident Tax Contracts as an Example. Tax Economics Research, 29(03), 38-46.

Chen, J. 2024. Comparative Analysis of the Application of Artificial Intelligence in Tax Administration in EU Member States. International Taxation, (04), 49-54.

Di Marzo Serugendo, G., Cappelli, M. A., Falquet, G., Métral, C., Wade, A., Ghadfi, S., Cutting-Decelle, A.-F., Caselli, A., Cutting, G. 2024. Streamlining Tax and Administrative Document Management with AI-Powered Intelligent Document Management System. Information, 15(8), 461.

Gu, Y. H., Piao, X., Yin, H., Jin, D., Zheng, R., Yoo, S. J. 2022. Domain-Specific Language Model Pre-Training for Korean Tax Law Classification. IEEE Access, 10, 46342-46353.

Ni, J., Li, Y., Zhou, R. 2021. Strategies for Digital Transformation of Tax Administration with Artificial Intelligence. Tax Research, (04), 92-96.

Rahman, S., Khan, R., Sirazy, Md, Das, R. 2024. An Exploration of Artificial Intelligence Techniques for Optimizing Tax Compliance, Fraud Detection, and Revenue Collection in Modern Tax Administrations. 56-80.

Shenzhen Municipal Tax Bureau Research Group, Li, W., Wang, X. 2023. Reflections and Suggestions on the Application of Generative Artificial Intelligence Represented by ChatGPT in the Tax Field. Tax Research, (06), 5-9.

Yang, S., Yu, L. 2023. Opportunities and Challenges of Generative Artificial Intelligence Represented by ChatGPT for Tax Administration. Tax Research, (06), 16-20.

Zhong, Y., Wong, D., Lan, K. 2024. Tax Intelligent Decision-Making Language Model. IEEE Access, 12, 146202-146212.

Zhou, Y., Mao, Y., Chen, L. 2025. Modernization of Tax Collection and Administration: Innovative Application and Exploration of MM-LLMs. Journal of Southeast University (Philosophy and Social Science Edition), 27(01), 65-73+151.
