Large Language Models as Carriers of Hidden Messages

Jakub Hoscilowicz; Pawel Popiolek; Jan Rudkowski; Jedrzej Bieniasz; Artur Janicki

doi:10.5220/0013498800003979

2025

Abstract

Simple fine-tuning can embed hidden text into large language models (LLMs) so that the text is revealed only when the model is triggered by a specific query. Applications include LLM fingerprinting, where a unique identifier is embedded to verify licensing compliance, and steganography, where the LLM carries hidden messages disclosed through a trigger query. Our work demonstrates that embedding hidden text via fine-tuning, although seemingly secure due to the vast number of potential triggers, is vulnerable to extraction through analysis of the LLM’s output decoding process. We introduce an extraction attack called Unconditional Token Forcing (UTF), which iteratively feeds tokens from the LLM’s vocabulary to reveal sequences with high token probabilities, indicating hidden text candidates. We also present Unconditional Token Forcing Confusion (UTFC), a defense paradigm that makes hidden text resistant to all known extraction attacks without degrading the general performance of LLMs compared to standard fine-tuning. UTFC has both benign applications (improving LLM fingerprinting) and malign ones (using LLMs to create covert communication channels).
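The UTF idea described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the toy model `toy_next_token_probs`, the vocabulary `VOCAB`, the memorized sequence `HIDDEN`, the use of greedy decoding, and the mean-probability score are all assumptions made for illustration. The sketch feeds each vocabulary token to the model with no prompt, decodes a few tokens, and ranks the resulting sequences by how confident the model was.

```python
# Hypothetical toy vocabulary and memorized sequence (illustration only).
VOCAB = ["the", "cat", "sat", "key", "7f3a", "9c1d", "end"]
HIDDEN = ["key", "7f3a", "9c1d", "end"]


def toy_next_token_probs(prefix):
    """Stand-in for an LLM forward pass: returns {token: probability}.

    Assumption: a model fine-tuned on hidden text is near-certain about the
    next hidden token once the input matches the start of that text, and is
    roughly uniform otherwise.
    """
    n = len(VOCAB)
    if 0 < len(prefix) < len(HIDDEN) and prefix == HIDDEN[:len(prefix)]:
        target = HIDDEN[len(prefix)]
        rest = 0.02 / (n - 1)
        return {t: (0.98 if t == target else rest) for t in VOCAB}
    return {t: 1.0 / n for t in VOCAB}


def unconditional_token_forcing(next_probs, vocab, max_new_tokens=3):
    """Rank greedy continuations of every vocabulary token by mean probability."""
    candidates = []
    for first in vocab:
        seq, probs = [first], []
        for _ in range(max_new_tokens):
            dist = next_probs(seq)
            tok = max(dist, key=dist.get)  # greedy choice (an assumption)
            probs.append(dist[tok])
            seq.append(tok)
        # Score: average probability of the generated tokens (an assumption).
        candidates.append((sum(probs) / len(probs), seq))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates


ranked = unconditional_token_forcing(toy_next_token_probs, VOCAB)
for score, seq in ranked[:3]:
    print(round(score, 3), seq)
```

In this toy setting, the sequence that starts with the first token of the memorized text stands out because every generated token has an unusually high probability, which is the kind of signal the abstract describes as indicating a hidden text candidate. With a real LLM the scan would run over a much larger vocabulary and the scoring threshold would need tuning.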

Paper Citation

in Harvard Style

Hoscilowicz J., Popiolek P., Rudkowski J., Bieniasz J. and Janicki A. (2025). Large Language Models as Carriers of Hidden Messages. In Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT; ISBN 978-989-758-760-3, SciTePress, pages 363-371. DOI: 10.5220/0013498800003979

in Bibtex Style

@conference{secrypt25,
author={Jakub Hoscilowicz and Pawel Popiolek and Jan Rudkowski and Jedrzej Bieniasz and Artur Janicki},
title={Large Language Models as Carriers of Hidden Messages},
booktitle={Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2025},
pages={363-371},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013498800003979},
isbn={978-989-758-760-3},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 22nd International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - Large Language Models as Carriers of Hidden Messages
SN - 978-989-758-760-3
AU - Hoscilowicz J.
AU - Popiolek P.
AU - Rudkowski J.
AU - Bieniasz J.
AU - Janicki A.
PY - 2025
SP - 363
EP - 371
DO - 10.5220/0013498800003979
PB - SciTePress