With the rapid development of deep neural networks, their performance has improved continuously, and in recent years they have come to deeply influence daily life. They are widely used for visual classification problems and have been extended to a variety of fundamental applications such as autonomous driving, face recognition, and malicious-code detection. Researchers have accordingly begun to investigate the security and reliability of deep learning techniques. Deep neural networks are susceptible to small perturbations that lead to completely incorrect outputs; clean examples carrying such crafted perturbations are called adversarial examples. Adversarial examples seriously threaten the security of AI applications, but they can also be used to measure the robustness of neural network models, to advance adversarial defense techniques, and to help researchers gain a more intuitive and in-depth understanding of the inner workings of neural networks. The generation of adversarial examples therefore deserves further research.

Most existing adversarial attack methods are white-box attacks carried out in the spatial domain on the theoretical basis of gradient attacks. However, the introduction of adversarial training has made it difficult for white-box attacks to work: adversarial training, in essence, obfuscates the gradients of the neural network model and thereby achieves a defensive effect. This paper identifies two difficulties in the field of adversarial attacks. First, the white-box setting, in which the attacker must know the complete structure and parameters of the network model, is difficult to realize in practical attack scenarios, while the image quality of adversarial examples generated by black-box methods is inferior to that of white-box methods. Second, many defense schemes have already been proposed against gradient attacks performed in the spatial domain.

To address these two difficulties, this paper proposes a black-box attack method combined with digital watermarking: a specific digital watermark image is embedded into a clean example by a multi-objective optimization algorithm to generate an adversarial example. Distributing the adversarial perturbation uniformly over the frequency domain ensures the imperceptibility of the adversarial example while keeping it robust against common model preprocessing defenses. Going further from the frequency-domain perspective, we propose a high-frequency black-box attack, which concentrates the adversarial perturbation in the high-frequency region, to which the human eye is insensitive, and improves the image quality of the adversarial example while maintaining its attack effect. Together, these methods address the poor image quality of adversarial examples generated by black-box attacks and their poor robustness under common defense operations.
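To make the embedding step concrete, the following is a minimal sketch, not the exact scheme of this work: it assumes additive embedding of a rotated watermark into the magnitude spectrum of the clean image's 2-D DFT, with the rotation angle theta and the embedding strength alpha as the parameters the optimizer will later tune; all function and variable names are illustrative.

```python
import numpy as np
from scipy.ndimage import rotate, zoom

def embed_watermark(clean: np.ndarray, watermark: np.ndarray,
                    theta: float, alpha: float) -> np.ndarray:
    """Embed a rotated watermark into the DFT magnitude of a grayscale image.

    theta : rotation angle of the watermark (degrees)
    alpha : embedding strength; larger values attack harder but are more visible
    (Illustrative scheme; the exact embedding used in this work may differ.)
    """
    # Rotate the watermark, then resize it to the host image's shape.
    wm = rotate(watermark, angle=theta, reshape=False, mode="constant")
    wm = zoom(wm, np.array(clean.shape) / np.array(wm.shape))

    # Move the clean image into the frequency domain.
    spectrum = np.fft.fftshift(np.fft.fft2(clean))
    magnitude, phase = np.abs(spectrum), np.angle(spectrum)

    # Additive embedding: spread the watermark energy uniformly over the
    # magnitude spectrum, keeping the phase intact.
    magnitude = magnitude + alpha * wm

    # Back to the spatial domain; the result is the adversarial candidate.
    adv = np.fft.ifft2(np.fft.ifftshift(magnitude * np.exp(1j * phase)))
    return np.clip(np.real(adv), 0.0, 255.0)
```

Because only (theta, alpha) vary per candidate, each individual in the subsequent population search is a two-dimensional gene, which keeps the query budget of the black-box attack small.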
Specifically, the work comprises a multi-objective optimized adversarial attack algorithm based on digital watermarking and a high-frequency black-box attack algorithm based on digital watermarking; the main research contributions are as follows.

(1) A multi-objective optimized adversarial attack algorithm based on digital watermarking. A digital watermark has two properties, transparency and robustness, i.e., it is imperceptible to the naked eye and survives defense operations, which matches our expectations for an adversarial example. We propose a non-dominated sorting genetic algorithm based on particle swarm optimization (NSGA-PSO) that continuously improves the attack capability and image quality of the adversarial example by dynamically adjusting the rotation angle and embedding strength of the digital watermark (the non-dominated ranking step is sketched after this summary). The digital-watermarking adversarial examples generated by this method achieve a higher attack success rate and better image quality while requiring fewer queries than other black-box attack methods, and they also remain strongly robust against common adversarial defenses such as random cropping and JPEG compression.

(2) A high-frequency black-box attack algorithm based on digital watermarking. In image-quality evaluation, the human eye is insensitive to changes in high-frequency information and more sensitive to changes in low-frequency information, so we move the adversarial perturbation into the high-frequency region of the frequency domain to reduce the perceptibility of the adversarial example and improve its image quality. The image is converted from the spatial domain to the frequency domain by the Fourier transform, and an improved objective function enhances the high-frequency components and suppresses the low-frequency ones, concentrating the adversarial perturbation in the high-frequency region and improving the image quality of the adversarial example while preserving its attack power. We further propose NSGA-PSO(r), a variant with an adaptive filter radius that dynamically adjusts both the threshold separating high and low frequencies and the embedding strength of the digital watermark, allowing the adversarial examples to evolve toward stronger attacks and better image quality (the radius-based frequency split is sketched below).
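As a concrete picture of the non-dominated ranking in contribution (1): each candidate is a gene (theta, alpha), scored on the two competing objectives of attack capability and image quality, and the search keeps the Pareto-optimal front. The dominance rule below is standard non-dominated sorting in the NSGA-II style; the objective placeholders are assumptions, not the exact fitness design of this work.

```python
import numpy as np

def dominates(a, b):
    """a dominates b iff a is no worse on every objective and better on one.
    Each fitness is (attack_score, image_quality), both to be maximized."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(fitness):
    """Return the indices of non-dominated individuals (the first front).
    fitness : list of (attack_score, image_quality) tuples, one per gene."""
    return [i for i in range(len(fitness))
            if not any(dominates(fitness[j], fitness[i])
                       for j in range(len(fitness)) if j != i)]

# Illustrative use: random genes with placeholder objective values.
rng = np.random.default_rng(0)
genes = [(rng.uniform(0, 360), rng.uniform(0.01, 0.5)) for _ in range(20)]
# In the real attack, attack_score would come from querying the target
# model, and image_quality from a metric such as PSNR or SSIM.
fits = [(rng.random(), rng.random()) for _ in genes]
elite = [genes[i] for i in pareto_front(fits)]
```

In each generation, the particle-swarm update can then steer genes toward members of this front, so attack success rate and image quality improve jointly rather than through a single weighted score.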
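For contribution (2), the separation of high and low frequencies can be pictured as a circular mask in the centered spectrum: coefficients within radius r of the center count as low frequency and are preserved, while the perturbation is confined to the region outside. This radius r is the threshold that NSGA-PSO(r) adapts. The sketch below uses assumed names and a hard circular mask; the actual objective-function weighting of the two bands may differ.

```python
import numpy as np

def frequency_split(image: np.ndarray, r: float):
    """Split a grayscale image into low- and high-frequency parts
    using a circular mask of radius r in the centered DFT spectrum."""
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Distance of every coefficient from the spectrum center.
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    low_mask = dist <= r  # inside the circle: low frequencies

    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)))
    high = image - low    # the residual carries the high frequencies
    return low, high

def confine_to_high(image: np.ndarray, perturbation: np.ndarray, r: float):
    """Keep only the high-frequency part of a perturbation, leaving the
    visible low-frequency content of the image unchanged."""
    _, high_part = frequency_split(perturbation, r)
    return np.clip(image + high_part, 0.0, 255.0)
```

A larger r pushes more of the perturbation into bands the eye barely notices but may weaken the attack, which is why adapting r per image, rather than fixing it, lets the examples evolve toward both stronger attacks and better image quality.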