
Paper

Authors: Meng Wang 1; Jing Xie 1; Yang Li 2,1; Zhixiong Zhang 1,2 and Hanyu Li 1,2

Affiliations: 1 National Science Library, Chinese Academy of Science, Beijing, China; 2 University of Chinese Academy of Science, Beijing, China

Keyword(s): Text Classification, Semantic Feature Encoding, Large Language Models, Feature Embedding.

Abstract: Accurate identification of semantic features in scientific texts is crucial for enhancing text classification performance. This paper presents a large language model text classification method with embedded semantic feature encoding, which enhances the model's understanding of textual semantics through a dual semantic feature encoding mechanism. The method employs a dynamic window-based local-global feature extraction strategy to capture topical semantic features and utilizes hierarchical structural aggregation mechanisms to extract organizational semantic information from texts. To fully leverage the extracted semantic features, we design a feature replacement encoding strategy that embeds topical semantic features and structural semantic features into the [CLS] and [SEP] positions of large language models, respectively, achieving deep fusion between semantic features and internal model representations, thereby improving the accuracy and robustness of text classification. Experimental results demonstrate that the proposed semantic feature encoding enhancement method achieves significant performance improvements. On the DBPedia dataset, the semantically encoded SciBERT model achieves an F1-score of 91.07%, representing a 5.26% improvement over the original encoding approach. In the scientific literature value sentence identification task, Qwen3-14B combined with semantic feature encoding and QLoRA fine-tuning achieves an F1-score of 94.19%, showing a 14.64% improvement over the baseline model. Compared to traditional feature concatenation or simple fusion approaches, our feature replacement encoding strategy leverages semantic features at critical positions, significantly enhancing both classification precision and recall. Ablation experiments further validate the synergistic effects of topical semantic features and structural semantic features, confirming the effectiveness of the dual semantic feature encoding mechanism. The research findings highlight the advantages of semantic feature encoding in text classification tasks, providing an effective technical solution for intelligent analysis of scientific texts.
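
The feature replacement idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `replace_special_embeddings`, the placeholder token ids, and the toy two-dimensional vectors are all assumptions made for illustration, and the abstract does not say at which layer the replacement happens or how the topical and structural vectors are computed. The sketch only shows the general pattern of overwriting the embedding rows at the [CLS] and [SEP] positions with externally computed feature vectors of the same dimension.

```python
from typing import List, Sequence

# Placeholder ids (assumption): real values depend on the tokenizer in use.
CLS_ID = 101
SEP_ID = 102


def replace_special_embeddings(
    token_ids: Sequence[int],
    embeddings: Sequence[Sequence[float]],
    topical: Sequence[float],
    structural: Sequence[float],
    cls_id: int = CLS_ID,
    sep_id: int = SEP_ID,
) -> List[List[float]]:
    """Return a copy of `embeddings` in which every [CLS] row is replaced
    by `topical` and every [SEP] row is replaced by `structural`."""
    if len(token_ids) != len(embeddings):
        raise ValueError("token_ids and embeddings must have the same length")
    if not embeddings:
        return []
    dim = len(embeddings[0])
    if len(topical) != dim or len(structural) != dim:
        raise ValueError("feature vectors must match the embedding dimension")

    fused: List[List[float]] = []
    for tok, row in zip(token_ids, embeddings):
        if tok == cls_id:
            fused.append(list(topical))      # topical features at [CLS]
        elif tok == sep_id:
            fused.append(list(structural))   # structural features at [SEP]
        else:
            fused.append(list(row))          # ordinary tokens are unchanged
    return fused


# Toy usage: a four-token sequence with 2-dimensional embeddings.
ids = [CLS_ID, 7, 8, SEP_ID]
emb = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.0, 0.0]]
fused = replace_special_embeddings(ids, emb, topical=[1.0, 1.0], structural=[2.0, 2.0])
# fused == [[1.0, 1.0], [0.1, 0.2], [0.3, 0.4], [2.0, 2.0]]
```

Replacement keeps the sequence length and embedding dimension unchanged, which is one plausible reason to prefer it over concatenation; with a real model, the fused matrix would typically be passed in place of the looked-up token embeddings.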

CC BY-NC-ND 4.0

Paper citation in several formats:
Wang, M., Xie, J., Li, Y., Zhang, Z. and Li, H. (2025). Enhanced LLM Text Classification Method with Embedded Semantic Feature Encoding. In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR; ISBN ; ISSN 2184-3228, SciTePress, pages 87-97. DOI: 10.5220/0013697900004000

@conference{kdir25,
author={Meng Wang and Jing Xie and Yang Li and Zhixiong Zhang and Hanyu Li},
title={Enhanced LLM Text Classification Method with Embedded Semantic Feature Encoding},
booktitle={Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR},
year={2025},
pages={87-97},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013697900004000},
isbn={},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR
TI - Enhanced LLM Text Classification Method with Embedded Semantic Feature Encoding
SN -
IS - 2184-3228
AU - Wang, M.
AU - Xie, J.
AU - Li, Y.
AU - Zhang, Z.
AU - Li, H.
PY - 2025
SP - 87
EP - 97
DO - 10.5220/0013697900004000
PB - SciTePress