Artificial intelligence models built on deep neural networks have been widely deployed in fields such as image recognition, autonomous driving, speech recognition, text recognition, and video processing, owing to their rapid development and excellent performance in recent years. However, deep neural network models are vulnerable to adversarial examples: an attacker adds mild perturbations to the input that are usually imperceptible, yet cause the model to output wrong results with high confidence. The discovery of adversarial examples has raised concerns about the security of deep neural network applications, and as these models move into real-world scenarios, their robustness has gradually become a research hotspot. Adversarial example attack and defense techniques are important means of evaluating and improving the robustness of deep neural networks, so studying them has both academic research significance and commercial application value.

This dissertation addresses the low attack success rate, low attack efficiency, poor defense generalization, and low robustness of existing adversarial example attack and defense techniques in deep neural networks. It first provides a comprehensive summary of existing adversarial example attack principles and defense methods. It then discusses attack and defense techniques for adversarial examples under different settings and situations, focusing on two of the most widely used and critical application areas of deep neural networks: automatic speech recognition systems and image classification tasks. For each, it analyzes the challenges in attack and defense research and gives corresponding solutions. The research contents and contributions of this dissertation are as follows:

(1) Existing minimum-perturbation white-box
adversarial example attacks must optimize two objectives simultaneously, which leads to low attack efficiency and excessive perturbation. Taking the automatic speech recognition task as the carrier, this dissertation proposes a fast targeted speech adversarial example attack algorithm based on a dynamic norm: during the attack, the perturbation budget is dynamically adjusted according to whether the sample is successfully attacked, and the attack step size is gradually reduced to avoid oscillation near the optimal solution. Experimental results show that the proposed attack achieves a 100% targeted attack success rate on mainstream automatic speech recognition systems within very few iterations, while adding only a very small perturbation.

(2) Existing black-box adversarial attacks are mainly divided into transfer-based attacks and query-based attacks; the former suffer from low success rates and the latter from low query efficiency. Taking the image classification task as the carrier, this dissertation combines transfer-based and query-based attacks and proposes a Bayesian-optimization black-box adversarial example attack method with transferable priors. The algorithm uses a generator to learn highly transferable perturbations as priors, and applies Bayesian optimization to search the generator's low-dimensional embedding space, making full use of both the perturbation priors and the knowledge gained from each query. In comparative experiments on three datasets and six models against current state-of-the-art black-box adversarial example attack methods, the proposed method achieves a 98% attack success rate with an average single-digit number of queries.

(3) Preprocessing-based active defense strategies are usually effective only against simple attacks or attacks included in the training set, and suffer from poor generalization ability and a weak defense
effect. In addition, preprocessing defense methods also reduce classification accuracy on clean samples. This dissertation designs a new image reconstruction network based on residual connections to defend against adversarial example attacks, and uses a perceptual loss to eliminate the error-amplification effect. To further remove adversarial perturbations, the method adds a randomization layer at the end of the reconstruction network. Experiments show that the proposed residual-connection image preprocessing defense barely affects the classification accuracy of clean samples and generalizes well: against ten mainstream adversarial example attacks on the Inception-ResNet-v2 model, its minimum defense accuracy exceeds 69.44%.

(4) Regularization-based adversarial training, a passive defense strategy, can maintain a certain robustness under white-box attacks, but suffers from robust overfitting and a trade-off between natural accuracy and robustness. This dissertation first empirically verifies the relationship between consistency regularization and robust overfitting, finding that consistency regularization can effectively alleviate robust overfitting in adversarial training. It then proposes an adversarial training method with consistency regularization against a teacher model maintained as an exponential moving average of the student. Experimental results on three different datasets show that the proposed defense effectively alleviates robust overfitting and improves robustness: robust accuracy is increased by 4% under attack and by 3.79% under the AA (AutoAttack) attack, while the robust generalization gap is reduced by 24.64%.
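The core mechanics of the EMA-teacher consistency regularization in contribution (4) can be sketched minimally as follows. This is an illustrative assumption only (toy dictionary-valued weights and a squared-error consistency term), not the dissertation's exact formulation:

```python
import numpy as np

def ema_update(teacher, student, decay=0.999):
    """Update teacher weights as an exponential moving average of student weights."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}

def consistency_loss(student_probs, teacher_probs):
    """Squared-error consistency term pulling student predictions toward the teacher's."""
    return float(np.mean((student_probs - teacher_probs) ** 2))

# Toy example: one weight tensor, one EMA step with decay 0.9.
student = {"w": np.array([1.0, 2.0])}
teacher = {"w": np.array([0.0, 0.0])}
teacher = ema_update(teacher, student, decay=0.9)
print(teacher["w"])  # -> [0.1 0.2]
print(consistency_loss(np.array([0.2, 0.8]), np.array([0.3, 0.7])))  # ~ 0.01
```

Because the teacher changes slowly, it provides stable prediction targets across training epochs, which is the intuition behind using it to suppress robust overfitting.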