
Innovative Sentence Classification in Scientific Literature: A Two-Phase Approach with Time Mixing Attention and Mixture of Experts

Authors: Meng Wang 1; Mengting Zhang 1,2; Hanyu Li 1,2; Jing Xie 1; Zhixiong Zhang 1,2; Yang Li 1,2 and Gaihong Yu 1

Affiliations: 1 National Science Library, Chinese Academy of Sciences, Beijing 100190, China; 2 School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China

Keyword(s): Innovative Sentence Identification, Multi-Class Text Classification, Time Mixing Attention, Mixture of Experts, Generative Semantic Data Augmentation.

Abstract: Accurately classifying innovative sentences in scientific literature is essential for understanding research contributions. This paper proposes a two-phase classification framework that integrates a Time Mixing Attention (TMA) mechanism and a Mixture of Experts (MoE) system to enhance multi-class innovation classification. In the first phase, TMA improves long-range dependency modeling through temporal shift padding and sequence slice reorganization. The second phase employs an MoE-based approach to classify theoretical, methodological, and applied innovations. To mitigate class imbalance, a generative semantic data augmentation method is introduced, improving model performance across different innovation categories. Experimental results demonstrate that the proposed two-phase SciBERT+TMA model achieves the highest performance, with a macro-averaged F1-score of 90.8%, including 95.1% for theoretical innovation, 90.8% for methodological innovation, and 86.6% for applied innovation. Compared to the one-phase SciBERT+TMA model, the two-phase approach significantly improves precision and recall, highlighting the benefits of progressive classification refinement. In contrast, the best-performing LLM baseline, Ministral-8B-Instruct, achieves a macro-averaged F1-score of 85.2%, demonstrating the limitations of prompt-based inference in structured classification tasks. The results underscore the advantage of a domain-adapted approach in capturing fine-grained distinctions in innovation classification. The proposed framework provides a scalable solution for multi-class sentence classification and can be extended to broader academic classification tasks. Model weights and details are available at https://huggingface.co/wmsr22/Research Value Generation/tree/main.
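The abstract names the two core components, Time Mixing Attention (temporal shift padding plus sequence slice reorganization) and a Mixture-of-Experts classification head, but this page carries no implementation details. The sketch below is a minimal illustrative reading in PyTorch, not the authors' code: the per-channel gated shift, the soft expert routing, the expert count, the mean-pooling, and all layer shapes are assumptions layered on a SciBERT-sized (768-dimensional) encoder output.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TimeMixingLayer(nn.Module):
    """Hypothetical time-mixing block: pads the sequence by one step and
    mixes each token's state with its predecessor's through a learned
    per-channel gate (one plausible form of temporal shift padding)."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.mix = nn.Parameter(torch.full((hidden_size,), 0.5))
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden). Prepend one zero step and drop the
        # last, so every position also sees the previous token's state.
        shifted = F.pad(x, (0, 0, 1, 0))[:, :-1, :]
        gate = torch.sigmoid(self.mix)                 # per-channel weight in (0, 1)
        mixed = gate * x + (1.0 - gate) * shifted
        return self.proj(mixed)

class MoEClassifier(nn.Module):
    """Hypothetical MoE head: a softmax gate weights per-expert logits for
    the three innovation categories (theoretical, methodological, applied)."""
    def __init__(self, hidden_size: int, num_experts: int = 3, num_classes: int = 3):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.GELU(),
                          nn.Linear(hidden_size, num_classes))
            for _ in range(num_experts)
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, hidden) pooled sentence embedding, e.g. from SciBERT.
        weights = F.softmax(self.gate(h), dim=-1)                  # (batch, experts)
        outs = torch.stack([e(h) for e in self.experts], dim=1)    # (batch, experts, classes)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)           # (batch, classes)

# Toy usage: 4 sentences, 32 tokens, 768-dim hidden states.
h = torch.randn(4, 32, 768)
logits = MoEClassifier(768)(TimeMixingLayer(768)(h).mean(dim=1))   # shape (4, 3)

A soft gate over all experts is used here for simplicity; a production MoE might instead use top-1 routing with a load-balancing loss.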

CC BY-NC-ND 4.0


Paper citation in several formats:
Wang, M., Zhang, M., Li, H., Xie, J., Zhang, Z., Li, Y. and Yu, G. (2025). Innovative Sentence Classification in Scientific Literature: A Two-Phase Approach with Time Mixing Attention and Mixture of Experts. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0; ISSN 2184-285X, SciTePress, pages 382-389. DOI: 10.5220/0013513200003967

@conference{data25,
author={Meng Wang and Mengting Zhang and Hanyu Li and Jing Xie and Zhixiong Zhang and Yang Li and Gaihong Yu},
title={Innovative Sentence Classification in Scientific Literature: A Two-Phase Approach with Time Mixing Attention and Mixture of Experts},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={382-389},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013513200003967},
isbn={978-989-758-758-0},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Innovative Sentence Classification in Scientific Literature: A Two-Phase Approach with Time Mixing Attention and Mixture of Experts
SN - 978-989-758-758-0
IS - 2184-285X
AU - Wang, M.
AU - Zhang, M.
AU - Li, H.
AU - Xie, J.
AU - Zhang, Z.
AU - Li, Y.
AU - Yu, G.
PY - 2025
SP - 382
EP - 389
DO - 10.5220/0013513200003967
PB - SciTePress
ER -