Title: Adaptive Perturbation for Adversarial Attack

Authors: Zheng Yuan, Jie Zhang, Zhaoyan Jiang, Liangliang Li, Shiguang Shan

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Publication date: August 2024

Venue rank: CCF A

DOI: 10.1109/TPAMI.2024.3367773

Abstract

In recent years, with the rapid development of neural networks, the security of deep learning models has attracted more and more attention, since these models are vulnerable to adversarial examples. Almost all existing gradient-based attack methods use the sign function during generation to meet the perturbation budget on the $L_\infty$ norm. However, we find that the sign function may be improper for generating adversarial examples, since it modifies the exact gradient direction. Instead of using the sign function, we propose to directly utilize the exact gradient direction with a scaling factor for generating adversarial perturbations, which improves the attack success rates of adversarial examples even with fewer perturbations. At the same time, we also theoretically prove that this method can achieve better black-box transferability. Moreover, considering that the best scaling factor varies across different images, we propose an adaptive scaling factor generator to seek an appropriate scaling factor for each image, which avoids the computational cost of manually searching for the scaling factor. Our method can be integrated with almost all existing gradient-based attack methods to further improve their attack success rates. Extensive experiments on the CIFAR10 and ImageNet datasets show that our method exhibits higher transferability and outperforms the state-of-the-art methods.
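The adaptive scaling factor generator mentioned at the end of the abstract is only described at a high level here. Below is a minimal PyTorch sketch of what a per-image scaling-factor predictor could look like; the architecture (a small convolutional encoder followed by a positive scalar head) and the name `ScaleFactorGenerator` are illustrative assumptions, not the authors' actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleFactorGenerator(nn.Module):
    """Hypothetical per-image scaling-factor predictor (illustrative only).

    The paper proposes an adaptive generator that outputs one scaling
    factor per image; the concrete layers below are assumptions.
    """
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: batch of images (B, C, H, W); output: (B, 1) positive scalars
        feat = self.encoder(x).flatten(1)
        # softplus keeps the predicted scaling factor strictly positive
        return F.softplus(self.head(feat))
```

A generator along these lines would be trained so that the predicted factor maximizes the attack objective for each image, which is what removes the manual search over scaling factors that the abstract mentions.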

Problem addressed

Almost all existing gradient-based attack methods use the sign function during generation to meet the perturbation budget on the $L_\infty$ norm. However, we find that the sign function may be improper for generating adversarial examples, since it modifies the exact gradient direction.
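For context, the sign-based update being criticized is the standard FGSM step and its iterative variant (generic formulations from the adversarial-attack literature, written in common notation rather than this paper's own):

$$
x^{adv} = x + \epsilon \cdot \mathrm{sign}\big(\nabla_x J(\theta, x, y)\big),
\qquad
x_{t+1}^{adv} = \mathrm{Clip}_{x,\epsilon}\Big\{ x_t^{adv} + \alpha \cdot \mathrm{sign}\big(\nabla_x J(\theta, x_t^{adv}, y)\big) \Big\}
$$

Because $\mathrm{sign}(\cdot)$ keeps only the per-pixel direction ($\pm 1$) and discards the relative magnitudes of the gradient components, the update direction generally deviates from the exact gradient direction, which is exactly the issue raised above.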

Proposed method

Instead of using the sign function, we propose to directly utilize the exact gradient direction with a scaling factor for generating adversarial perturbations, which improves the attack success rates of adversarial examples even with fewer perturbations.
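As a concrete illustration, here is a minimal sketch of one attack iteration that replaces `sign(grad)` with the exact gradient direction multiplied by a scaling factor, assuming a PyTorch classifier and cross-entropy loss. The normalization by the gradient's maximum absolute component and the function name `scaled_gradient_step` are my assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def scaled_gradient_step(model, x_adv, x_clean, y, eps, alpha, scale):
    """One attack iteration using the exact gradient direction instead of sign().

    Sketch only -- the paper's exact normalization/scaling scheme may differ.
    eps:   L_inf perturbation budget
    alpha: per-iteration step size (as in iterative sign-based attacks)
    scale: scaling factor; a scalar, or a tensor broadcastable to x
           (e.g. shape (B, 1, 1, 1) if predicted per image)
    """
    x_adv = x_adv.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)

    # Keep the exact gradient direction: normalize so the largest per-pixel
    # change is comparable to a sign step, then apply the scaling factor.
    g = grad / (grad.abs().amax(dim=(1, 2, 3), keepdim=True) + 1e-12)
    x_next = x_adv + scale * alpha * g

    # Project back into the eps-ball around the clean image and the valid range.
    x_next = torch.max(torch.min(x_next, x_clean + eps), x_clean - eps)
    return x_next.clamp(0.0, 1.0).detach()
```

Compared with the sign-based update shown earlier, the only change is that the per-pixel $\pm 1$ pattern is replaced by the (normalized) gradient itself, so the relative magnitudes of the gradient components are preserved; a factor predicted by something like the `ScaleFactorGenerator` sketch above could be passed in as `scale`.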


Summary

In short, the paper argues that the sign function used by almost all gradient-based attacks distorts the exact gradient direction, replaces it with the exact direction multiplied by an adaptively generated per-image scaling factor, theoretically shows that this yields better black-box transferability, and demonstrates on CIFAR10 and ImageNet that the resulting attacks outperform state-of-the-art methods even with fewer perturbations.