Liu, C., Zhang, H., Zhao, K., et al. (2024). LLMEmbed:
Rethinking Lightweight LLM's Genuine Function in
Text Classification. arXiv preprint arXiv:2406.03725.
Devlin, J., Chang, M. W., Lee, K., et al. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171-4186.
Ezugwu, A. E., Ho, Y.-S., Egwuche, O. S., et al. (2024).
Classical Machine Learning: Seventy Years of
Algorithmic Learning Evolution. Data Intelligence.
https://www.sciengine.com/doi/10.3724/2096-
7004.di.2024.0051
Zhao, J., Lan, M., Niu, Z. Y., et al. (2015). Integrating word
embeddings and traditional NLP features to measure
textual entailment and semantic relatedness of sentence
pairs. Proceedings of the International Joint
Conference on Neural Networks, 1-7.
Pangakis, N., & Wolken, S. (2024). Knowledge distillation in automated annotation: Supervised text classification with LLM-generated training labels. arXiv preprint arXiv:2406.17633.
Kokkodis, M., Demsyn-Jones, R., & Raghavan, V. (2025).
Beyond the Hype: Embeddings vs. Prompting for
Multiclass Classification Tasks. arXiv preprint
arXiv:2504.04277.
Wang, M., Zhang, Z., Li, H., et al. (2024). An Improved
Meta-Knowledge Prompt Engineering Approach for
Generating Research Questions in Scientific Literature.
Proceedings of the 16th International Joint Conference
on Knowledge Discovery, Knowledge Engineering and
Knowledge Management - Volume 1: KDIR, 457-464.
Achiam, J., et al. (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774.
Touvron, H., Lavril, T., Izacard, G., et al. (2023). LLaMA:
Open and efficient foundation language models. arXiv
preprint arXiv:2302.13971.
Guo, Z., Jiao, K., Yao, X., et al. (2024). USTC-BUPT at
SemEval-2024 Task 8: Enhancing Machine-Generated
Text Detection via Domain Adversarial Neural
Networks and LLM Embeddings. Proceedings of the
18th International Workshop on Semantic Evaluation
(SemEval-2024), 1511-1522.
Li, X., Zhang, Z., Liu, Y., et al. (2022). A Study on the Method of Identifying Research Question Sentences in Scientific Articles. Library and Information Service, 67(9), 132-140.
Liu, P., & Cao, Y. (2022). A named entity recognition method for Chinese winter sports news based on RoBERTa-WWM. Proceedings of the 3rd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), 785-790.
Liu, X., Zhao, W., & Ma, H. (2022). Research on domain-specific knowledge graph based on the RoBERTa-wwm-ext pretraining model. Computational Intelligence and Neuroscience, 1-11.
Han, Y. (2023). Advancing Text Analytics: Instruction Fine-Tuning of QianWen-7B for Sentiment Classification. Proceedings of the 2023 4th International Conference on Big Data Economy and Information Management, 90-93.
Zhang, Y., Wang, M., Ren, C., et al. (2024). Pushing the limit of LLM capacity for text classification. arXiv preprint arXiv:2402.07470.
Chae, Y., & Davidson, T. (2024). Large Language Models for Text Classification: From Zero-Shot Learning to Instruction-Tuning. Sociological Methods & Research, 00491241251325243.
Fatemi, S., Hu, Y., & Mousavi, M. (2025). A Comparative Analysis of Instruction Fine-Tuning Large Language Models for Financial Text Classification. ACM Transactions on Management Information Systems, 16(1).
Peng, L., & Shang, J. (2024). Incubating text classifiers following user instruction with nothing but LLM. arXiv preprint arXiv:2404.10877.
Meguellati, E., Zeghina, A., Sadiq, S., et al. (2025). LLM-based Semantic Augmentation for Harmful Content Detection. arXiv preprint arXiv:2504.15548.
Guo, Y., Ovadje, A., Al-Garadi, M. A., et al. (2024). Evaluating large language models for health-related text classification tasks with public social media data. Journal of the American Medical Informatics Association, 31(10), 2181-2189.
Lehmann, J., Isele, R., Jakob, M., et al. (2015). DBpedia: A large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web, 6(2), 167-195.
Beltagy, I., Lo, K., & Cohan, A. (2019). SciBERT: A pretrained language model for scientific text. arXiv preprint arXiv:1903.10676.
Yang, A., Li, A., Yang, B., et al. (2025). Qwen3 technical report. arXiv preprint arXiv:2505.09388.
Meta AI (2025). The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation. https://ai.meta.com/blog/llama-4-multimodal-intelligence/.
Team GLM, Zeng, A., Xu, B., et al. (2024). ChatGLM: A family of large language models from GLM-130B to GLM-4 All Tools. arXiv preprint arXiv:2406.12793.
Loshchilov, I., & Hutter, F. (2019). Decoupled weight decay regularization. International Conference on Learning Representations (ICLR).
Dettmers, T., Pagnoni, A., Holtzman, A., et al. (2023). QLoRA: Efficient finetuning of quantized LLMs. Advances in Neural Information Processing Systems (NeurIPS).