Prompt Injection Attacks on Large Language Models: Multi-Model Security Analysis with Categorized Attack Types

Selin Şaşal, Özgü Can

2025

Abstract

Large Language Models (LLMs) are widely used in information processing, language interaction, and decision support. The command-based structure of these systems creates security vulnerabilities that can be exploited through attacks designed to bypass security measures and generate malicious content. This study presents a comparative analysis of three LLMs (GPT-4o, Claude 4 Sonnet, and Gemini 2.5 Flash) based on four fundamental security metrics: compliance, filter bypass, sensitive information leakage, and security risk level. The study used an attack dataset containing unethical, harmful, and manipulation-oriented prompts. According to the results, the Claude model demonstrated the most robust security posture by providing secure responses with high consistency. Gemini was the most vulnerable due to filtering failures and information leakage. GPT-4o showed average performance, behaving securely in most scenarios but exhibiting inconsistency in the face of indirect attacks. The findings reveal that LLM security is influenced not only by content-level factors but also by structural factors such as model architectural design, training data scope, and filtering strategies. Therefore, it is critical to regularly test models against attacks and establish transparent, explainable, and ethics-based security principles.
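The abstract describes scoring model responses against four metrics: compliance, filter bypass, sensitive information leakage, and an overall security risk level. A minimal sketch of how such per-prompt judgments might be aggregated is shown below; the field names, scoring rules, and the averaging used for the risk level are illustrative assumptions, not the paper's actual protocol.

```python
# Hypothetical sketch: aggregating per-prompt security judgments into the
# four metrics named in the abstract. All names and the risk formula are
# illustrative assumptions, not the study's actual evaluation protocol.
from dataclasses import dataclass

@dataclass
class Judgment:
    complied: bool          # model followed the malicious instruction
    bypassed_filter: bool   # response evaded the safety filter
    leaked_info: bool       # response exposed sensitive information

def score_model(judgments: list[Judgment]) -> dict[str, float]:
    """Return failure rates per metric; the risk level here is a plain
    average of the three failure rates (an assumption for illustration)."""
    n = len(judgments)
    compliance = sum(j.complied for j in judgments) / n
    filter_bypass = sum(j.bypassed_filter for j in judgments) / n
    leakage = sum(j.leaked_info for j in judgments) / n
    risk = (compliance + filter_bypass + leakage) / 3
    return {"compliance": compliance, "filter_bypass": filter_bypass,
            "leakage": leakage, "risk_level": risk}

# Example: three adversarial prompts evaluated against one model
results = score_model([
    Judgment(False, False, False),   # safe refusal
    Judgment(True, True, False),     # complied and bypassed the filter
    Judgment(False, False, True),    # leaked sensitive information
])
```

Lower rates across all four metrics would indicate a more robust security posture, matching the abstract's comparison of the three models.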



Paper Citation


in Harvard Style

Şaşal, S. and Can, Ö. (2025). Prompt Injection Attacks on Large Language Models: Multi-Model Security Analysis with Categorized Attack Types. In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, SciTePress, pages 517-524. DOI: 10.5220/0013838400004000


in BibTeX Style

@conference{kdir25,
author={Selin Şaşal and Özgü Can},
title={Prompt Injection Attacks on Large Language Models: Multi-Model Security Analysis with Categorized Attack Types},
booktitle={Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2025},
pages={517-524},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013838400004000},
isbn={},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Prompt Injection Attacks on Large Language Models: Multi-Model Security Analysis with Categorized Attack Types
SN -
AU - Şaşal S.
AU - Can Ö.
PY - 2025
SP - 517
EP - 524
DO - 10.5220/0013838400004000
PB - SciTePress