An NLP-Based Framework Leveraging Email and Multimodal User Data

Neda Baghalizadeh-Moghadam, Frédéric Cuppens, Nora Boulahia-Cuppens

2025

Abstract

Traditional approaches to insider threat detection analyze activity logs to detect abnormal user behavior. In this paper, we investigate how messages exchanged between users can also contribute to detecting insider threats. We present an NLP-driven anomaly detection framework that applies feature engineering and prompt engineering across multimodal user activities, such as emails, HTTP requests, file access, and logon events. The framework uses Named Entity Recognition (NER), sentiment analysis, and prompt engineering to extract semantic, contextual, and behavioral insights that enhance anomaly detection. These enriched representations are processed by an Isolation Forest and a One-Class Support Vector Machine (One-Class SVM) for unsupervised detection of deviations from normal user behavior. Unlike most previous work, which focuses solely on user activity logs, our method combines user activity logs with email communication data for insider threat detection. Experimental results on the CERT r4.2 dataset show that the proposed multimodal approach improves anomaly detection, with high accuracy, greater precision, and reduced false alarm rates. The framework also offers greater explainability and scalability in addressing sophisticated insider threats.
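
As a rough illustration of the detection stage described in the abstract, the following sketch scores per-user-day feature vectors with scikit-learn's Isolation Forest and One-Class SVM. It is not the authors' implementation. The feature names, the synthetic data (a stand-in for CERT r4.2), the hyperparameters, and the choice to fit the One-Class SVM only on rows assumed to be normal are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Hypothetical per-user-day features (illustrative names, not from the paper).
FEATURES = ["emails_sent", "email_sentiment", "after_hours_logons", "http_requests"]

rng = np.random.default_rng(0)

# Synthetic stand-in for baseline behaviour: 200 user-days around a common profile.
normal = rng.normal(
    loc=[20.0, 0.2, 1.0, 150.0],
    scale=[4.0, 0.1, 0.5, 25.0],
    size=(200, len(FEATURES)),
)
# One synthetic user-day that deviates strongly on every feature.
suspicious = np.array([[60.0, -0.8, 6.0, 400.0]])
X = np.vstack([normal, suspicious])

# Put all features on a comparable scale, using the baseline rows as reference.
scaler = StandardScaler().fit(normal)
X_scaled = scaler.transform(X)

# Isolation Forest: fitted on all rows without labels; lower score = more anomalous.
iso = IsolationForest(n_estimators=100, random_state=0).fit(X_scaled)
iso_scores = iso.score_samples(X_scaled)

# One-Class SVM: fitted on the rows assumed normal, then applied to every row.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_scaled[:200])
svm_labels = ocsvm.predict(X_scaled)  # -1 = anomaly, +1 = normal

most_anomalous = int(np.argmin(iso_scores))
print("Isolation Forest most anomalous row:", most_anomalous)
print("One-Class SVM label for that row:", int(svm_labels[most_anomalous]))
```

In a real pipeline, the columns would come from the NER, sentiment, and prompt-derived features the abstract mentions, and the thresholds (`nu`, the Isolation Forest contamination level) would need tuning against the target false alarm rate.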


Paper Citation


in Harvard Style

Baghalizadeh-Moghadam N., Cuppens F. and Boulahia-Cuppens N. (2025). An NLP-Based Framework Leveraging Email and Multimodal User Data. In Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT; ISBN 978-989-758-760-3, SciTePress, pages 168-178. DOI: 10.5220/0013524000003979


in Bibtex Style

@conference{secrypt25,
author={Neda Baghalizadeh-Moghadam and Frédéric Cuppens and Nora Boulahia-Cuppens},
title={An NLP-Based Framework Leveraging Email and Multimodal User Data},
booktitle={Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2025},
pages={168-178},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013524000003979},
isbn={978-989-758-760-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - An NLP-Based Framework Leveraging Email and Multimodal User Data
SN - 978-989-758-760-3
AU - Baghalizadeh-Moghadam N.
AU - Cuppens F.
AU - Boulahia-Cuppens N.
PY - 2025
SP - 168
EP - 178
DO - 10.5220/0013524000003979
PB - SciTePress