With the continuous development of AI technology, it has been widely applied in fields such as medicine, the military, autonomous driving, and finance. However, due to the inherent vulnerability and lack of interpretability of deep learning models, they are susceptible to adversarial attacks: even small perturbations may cause a model to produce wrong outputs. Adversarial attacks can be divided into white-box attacks and black-box attacks according to how much information about the target model is known. The study of black-box attacks is the more practically relevant, because in real environments the available information is limited and the target model is often unknown. Current black-box attacks still have shortcomings: soft-label black-box attacks tend to produce perturbations that are too large and easily detected by the human eye, while hard-label black-box attacks require a large number of queries, leading to high resource consumption and cost. To address these problems, this paper completes the following work:

(1) A soft-label black-box attack method based on local interpretability is proposed. The method reduces the perturbation size when generating adversarial examples by adding perturbations only to the discriminative regions of a sample. First, a local interpretability method is used to interpret the black-box model and obtain the discriminative regions of the sample; then a natural evolution strategy, a derivative-free optimization method, is used to estimate the gradient of the sample, and a perturbation is added to these regions to form an adversarial example. The method is applied to several neural network models, and the experimental results show that its attack success rate exceeds 90% while the perturbation size is substantially smaller than that of other attack methods.

(2) A hard-label black-box attack method based on local search is proposed. The method transforms adversarial example generation into the problem of moving the original sample outside the decision boundary at minimum cost, and solves it with a local search algorithm. First, a search direction and step size are randomly generated to produce an initial solution, and a number of neighboring solutions are then generated around it; the neighboring solutions are scored with an evaluation function designed according to the objective, and the best of them is compared with the current solution. If it is better than the current solution, the current solution is updated; otherwise it is kept. This process is iterated until a solution meeting the requirements is found or a termination condition is reached. Compared with existing hard-label black-box attack algorithms, the proposed method significantly reduces the number of queries while maintaining a high attack success rate.

(3) Based on the above studies, this paper designs and implements an adversarial attack system for neural network models. The system is designed in detail from several aspects, including user registration and login, dataset and model upload, execution of evaluation tasks, and querying of evaluation results. The system can generate adversarial examples under both soft-label and hard-label constraints, thereby providing guidance for improving the robustness of the target model in a targeted manner.
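The gradient-estimation step described in (1) can be sketched as follows. This is a minimal illustration of NES-style finite-difference estimation restricted to a masked region; the function names, the binary-mask interface, and all parameter values are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def nes_gradient(loss_fn, x, mask, sigma=0.01, n_samples=2000):
    """Estimate the gradient of loss_fn at x with a natural evolution
    strategy (antithetic Gaussian sampling), perturbing only the
    discriminative region selected by the binary mask."""
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = np.random.randn(*x.shape) * mask   # noise only inside the region
        grad += loss_fn(x + sigma * u) * u     # forward sample
        grad -= loss_fn(x - sigma * u) * u     # antithetic sample
    return grad / (2 * n_samples * sigma)
```

Because the noise is zero outside the mask, the estimated gradient (and hence the added perturbation) is confined to the discriminative region, which is how the method keeps the overall perturbation small.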
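The local-search loop described in (2) can be sketched in the hard-label setting, where the attacker sees only the predicted label. The evaluation function here scores a candidate by its perturbation norm (infinite if it is not adversarial); the interface and all parameters are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def local_search_attack(predict, x_orig, orig_label, steps=200,
                        n_neighbors=10, step_size=0.1, rng=None):
    """Hard-label local search: find an adversarial initial solution,
    then repeatedly move to the best neighboring perturbation if it
    improves the evaluation function. `predict` returns only a label."""
    rng = rng or np.random.default_rng(0)

    def is_adversarial(x):
        return predict(x) != orig_label  # untargeted: any other label

    # Initial solution: random direction and step size, retried with a
    # growing scale until the sample crosses the decision boundary.
    scale = 1.0
    delta = scale * rng.standard_normal(x_orig.shape)
    while not is_adversarial(x_orig + delta):
        scale *= 1.5
        delta = scale * rng.standard_normal(x_orig.shape)

    def cost(d):
        # Evaluation function: perturbation size, infinite if the
        # candidate falls back inside the decision boundary.
        return np.linalg.norm(d) if is_adversarial(x_orig + d) else np.inf

    for _ in range(steps):
        neighbors = [delta + step_size * rng.standard_normal(delta.shape)
                     for _ in range(n_neighbors)]
        best = min(neighbors, key=cost)
        if cost(best) < cost(delta):   # update only on improvement
            delta = best
    return x_orig + delta
```

Each iteration costs `n_neighbors` queries plus the acceptance check, so the query budget is explicit, which is the quantity the proposed method aims to minimize.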