GenGradAttack: Efficient and Robust Targeted Adversarial Attacks Using Genetic Algorithms and Gradient-Based Fine-Tuning

Naman Agarwal; James Pope

doi:10.5220/0012314700003636

GenGradAttack: Efficient and Robust Targeted Adversarial Attacks Using Genetic Algorithms and Gradient-Based Fine-Tuning

Naman Agarwal, James Pope

2024

Abstract

Adversarial attacks pose a critical threat to the reliability of machine learning models, potentially undermining trust in practical applications. As machine learning models find deployment in vital domains like au-tonomous vehicles, healthcare, and finance, they become susceptible to adversarial examples—crafted inputs that induce erroneous high-confidence predictions. These attacks fall into two main categories: white-box, with full knowledge of model architecture, and black-box, with limited or no access to internal details. This paper introduces a novel approach for targeted adversarial attacks in black-box scenarios. By combining genetic algorithms and gradient-based fine-tuning, our method efficiently explores input space for perturbations without requiring access to internal model details. Subsequently, gradient-based fine-tuning optimizes these perturbations, aligning them with the target model’s decision boundary. This dual strategy aims to evolve perturbations that effectively mislead target models while minimizing queries, ensuring stealthy attacks. Results demonstrate the efficacy of GenGradAttack, achieving a remarkable 95.06% Adversarial Success Rate (ASR) on MNIST with a median query count of 556. In contrast, conventional GenAttack achieved 100% ASR but required significantly more queries. When applied to InceptionV3 and Ens4AdvInceptionV3 on ImageNet, GenGradAttack outperformed GenAttack with 100% and 96% ASR, respectively, and fewer median queries. These results highlight the efficiency and effectiveness of our approach in generating adversarial examples with reduced query counts, advancing our understanding of adversarial vulnerabilities in practical contexts.

Download

Paper Citation

in Harvard Style

Agarwal N. and Pope J. (2024). GenGradAttack: Efficient and Robust Targeted Adversarial Attacks Using Genetic Algorithms and Gradient-Based Fine-Tuning. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 202-209. DOI: 10.5220/0012314700003636

in Bibtex Style

@conference{icaart24,
author={Naman Agarwal and James Pope},
title={GenGradAttack: Efficient and Robust Targeted Adversarial Attacks Using Genetic Algorithms and Gradient-Based Fine-Tuning},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={202-209},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012314700003636},
isbn={978-989-758-680-4},
}

in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - GenGradAttack: Efficient and Robust Targeted Adversarial Attacks Using Genetic Algorithms and Gradient-Based Fine-Tuning
SN - 978-989-758-680-4
AU - Agarwal N.
AU - Pope J.
PY - 2024
SP - 202
EP - 209
DO - 10.5220/0012314700003636
PB - SciTePress