An NLP-Based Framework Leveraging Email and Multimodal User Data
Neda Baghalizadeh-Moghadam, Frédéric Cuppens, Nora Boulahia-Cuppens
2025
Abstract
Traditional approaches to insider threat detection analyze activity logs to identify abnormal user behavior. In this paper, we investigate how messages exchanged between users can also contribute to detecting insider threats. We present an NLP-driven anomaly detection framework that applies feature engineering and prompt engineering across multimodal user activities, such as emails, HTTP requests, file access, and logon events. The framework uses Named Entity Recognition (NER), sentiment analysis, and prompt engineering to extract semantic, contextual, and behavioral insights that enhance anomaly detection. These enriched representations are processed by an Isolation Forest and a One-Class Support Vector Machine (One-Class SVM) for unsupervised detection of deviations from normal user behavior. Unlike most previous works, which focus solely on user log activity datasets, our method combines user log activity with email communication data. Experimental results on the CERT r4.2 dataset demonstrate that the proposed multimodal approach improves anomaly detection with high accuracy, greater precision, and reduced false alarm rates. Hence, our framework offers greater explainability and scalability in addressing sophisticated insider threats.
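As an illustration of the unsupervised detection step named in the abstract, the following is a minimal pure-Python sketch of the Isolation Forest idea applied to per-user feature vectors. It is not the authors' implementation. The four features (emails sent, negative-sentiment ratio, after-hours logons, files copied), the synthetic data, and all function names are assumptions made for this example, and the tree builder is simplified (it stops when the chosen feature is constant instead of trying another one).

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-aligned splits."""
    if len(rows) <= 1 or depth >= max_depth:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # chosen feature is constant here; stop (a simplification)
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def fit_forest(rows, n_trees, rng, subsample=256):
    """Build n_trees isolation trees; returns (trees, subsample size used)."""
    psi = min(subsample, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng) for _ in range(n_trees)]
    return trees, psi


def path_length(row, node):
    """Number of splits needed to reach a leaf, plus a correction for the leaf size."""
    depth = 0
    while "feature" in node:
        node = node["left"] if row[node["feature"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def anomaly_score(row, trees, psi):
    """Score in (0, 1]; values near 1 mean the row is isolated after few splits."""
    mean_path = sum(path_length(row, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(psi))


def make_demo_rows(rng, n_normal=128):
    """Synthetic rows: [emails_sent, negative_sentiment_ratio, after_hours_logons, files_copied]."""
    rows = [
        [max(0.0, rng.gauss(20, 3)), max(0.0, rng.gauss(0.1, 0.03)),
         max(0.0, rng.gauss(1, 0.5)), max(0.0, rng.gauss(2, 1))]
        for _ in range(n_normal)
    ]
    rows.append([120.0, 0.9, 12.0, 40.0])  # hypothetical anomalous user-day, placed last
    return rows


if __name__ == "__main__":
    rng = random.Random(0)
    rows = make_demo_rows(rng)
    trees, psi = fit_forest(rows, n_trees=100, rng=rng)
    scores = [anomaly_score(r, trees, psi) for r in rows]
    top = max(range(len(rows)), key=scores.__getitem__)
    print("most anomalous row index:", top, "score:", round(scores[top], 3))
```

Rows that are separated from the rest after only a few random splits receive scores close to 1, which is why the extreme synthetic row stands out. In practice one would more likely use an established library implementation, and the One-Class SVM mentioned in the abstract would be fitted on the same feature vectors as a second detector.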
Paper Citation
in Harvard Style
Baghalizadeh-Moghadam N., Cuppens F. and Boulahia-Cuppens N. (2025). An NLP-Based Framework Leveraging Email and Multimodal User Data. In Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT; ISBN 978-989-758-760-3, SciTePress, pages 168-178. DOI: 10.5220/0013524000003979
in Bibtex Style
@conference{secrypt25,
author={Neda Baghalizadeh-Moghadam and Frédéric Cuppens and Nora Boulahia-Cuppens},
title={An NLP-Based Framework Leveraging Email and Multimodal User Data},
booktitle={Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2025},
pages={168-178},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013524000003979},
isbn={978-989-758-760-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - An NLP-Based Framework Leveraging Email and Multimodal User Data
SN - 978-989-758-760-3
AU - Baghalizadeh-Moghadam N.
AU - Cuppens F.
AU - Boulahia-Cuppens N.
PY - 2025
SP - 168
EP - 178
DO - 10.5220/0013524000003979
PB - SciTePress