In recent years, with the rapid development of machine learning (ML) technology and the arrival of the era of big data, machine learning has been widely applied and has achieved great success in many domains such as autonomous driving, recommendation systems, face recognition, and natural language processing. More and more organizations are integrating ML technology to improve the quality of their products. However, recent studies have shown that ML models are susceptible to various kinds of attacks, which pose security threats to these models. Facing these security issues, this dissertation focuses on poisoning attacks in ML, where an attacker manipulates the learning process by injecting a small number of malicious samples into the training dataset, altering the behavior of the ML model and degrading its prediction performance. This dissertation studies the security risks of data and models in ML from the following four aspects: black-box targeted poisoning attacks, class-targeted clean-label poisoning attacks, targeted poisoning attacks in self-supervised learning, and attack-agnostic defense against poisoning attacks. The main contributions of this dissertation can be summarized as follows:

1. Existing targeted poisoning attacks largely depend on prior information about the target model and the training data. To overcome this limitation, this dissertation proposes a practical targeted poisoning attack based on machine unlearning that requires no prior information about the target model or the training data. The proposed attack leverages machine unlearning to prevent specific targets from being learned by the victim model, thereby altering the model's decision boundary, and uses gradient estimation to reduce the knowledge the attacker needs to mount a targeted poisoning attack. This work demonstrates that ML models still face serious data-security risks even when the attacker knows nothing about the training data or the target model.

2. A novel data poisoning attack is further put forward in a class-targeted manner, illustrating that ML models in real-world scenarios still face data-security risks. Most existing data poisoning techniques are designed for specific test samples, but in practice the attacker faces a stream of test data, which limits the capability of targeted poisoning attacks. To this end, a class-targeted clean-label poisoning attack is proposed. The attack alters the decision boundary of the target model by generating clean-label poisoned data in the feature space, and it exploits the feature information between different classes to preserve the classification accuracy on the other classes, thereby achieving a class-targeted poisoning attack.

3. Existing targeted poisoning attacks mainly focus on supervised or semi-supervised learning, leaving the potential security vulnerabilities of self-supervised learning largely unexplored. To address this gap, a feature-matching-based targeted poisoning attack on self-supervised learning is presented in this dissertation. This work does not need to compromise the integrity of the pre-training dataset; instead, it directly attacks the pre-trained encoder, and downstream classifiers trained on the poisoned encoder inherit the poisoning behavior. The key idea of this scheme is to maximize the similarity between the feature representation the poisoned encoder produces for the targeted sample and the average feature representation of the class selected by the attacker. Through transfer learning, the downstream classifier then inherits the poisoning behavior from the poisoned pre-trained encoder, realizing a self-supervised poisoning attack.
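As a rough illustration of this feature-matching idea, the following minimal sketch fine-tunes an encoder so that a targeted sample's representation is pulled toward the average representation of an attacker-chosen class, while representations of ordinary inputs stay close to those of the clean encoder. It assumes a PyTorch encoder that maps image batches to feature vectors; the function name, the data loaders, the cosine-similarity losses, and all hyperparameters are illustrative assumptions, not the dissertation's actual implementation.

```python
# Minimal sketch of a feature-matching poisoning objective for a pre-trained
# encoder. All names and hyperparameters are illustrative assumptions.
import copy
import torch
import torch.nn.functional as F

def poison_encoder(encoder, target_x, reference_loader, clean_loader,
                   epochs=10, lr=1e-4, utility_weight=1.0, device="cpu"):
    """Fine-tune `encoder` so the targeted sample's feature matches the average
    feature of the attacker-chosen class, while other features stay close to
    the clean encoder. `target_x` is a batched tensor, e.g. shape (1, C, H, W);
    the loaders are assumed to yield image batches; the encoder outputs (N, D)."""
    clean_encoder = copy.deepcopy(encoder).to(device).eval()   # frozen reference copy
    encoder = encoder.to(device).train()
    target_x = target_x.to(device)

    # Average feature of the attacker-chosen (reference) class, from the clean encoder.
    with torch.no_grad():
        feats = [clean_encoder(x.to(device)) for x in reference_loader]
        anchor = F.normalize(torch.cat(feats).mean(dim=0, keepdim=True), dim=1)

    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(epochs):
        for x in clean_loader:
            x = x.to(device)
            # Attack term: pull the target sample toward the class-average feature.
            target_feat = F.normalize(encoder(target_x), dim=1)
            attack_loss = 1.0 - (target_feat * anchor).sum(dim=1).mean()
            # Utility term: keep ordinary inputs close to the clean encoder's features.
            with torch.no_grad():
                ref = F.normalize(clean_encoder(x), dim=1)
            utility_loss = 1.0 - (F.normalize(encoder(x), dim=1) * ref).sum(dim=1).mean()
            loss = attack_loss + utility_weight * utility_loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder
```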
4. Existing defenses against poisoning attacks require knowing the type of poisoning attack. To address this issue, an attack-agnostic defense against data poisoning attacks is designed in this dissertation; it does not require knowledge of the attack type. The key idea of this defense scheme is to train a mimic model whose purpose is to imitate the prediction behavior of the target model, and the mimic model is generated with a generative adversarial network. By comparing the prediction differences between the mimic model and the target model, the defense can directly distinguish poisoned samples or models from clean ones, realizing an attack-agnostic defense scheme against various types of poisoning attacks.
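The prediction-comparison step of such a defense might look like the following minimal sketch. It assumes the mimic model has already been obtained (the dissertation derives it with a generative adversarial network, which is not reproduced here) and simply flags inputs on which the two models' predictive distributions diverge; the function name, the KL-divergence score, and the threshold are illustrative assumptions rather than the dissertation's actual procedure.

```python
# Minimal sketch of the prediction-comparison step in a mimic-model defense.
# The mimic model is assumed to be a second classifier with the same output space.
import torch
import torch.nn.functional as F

@torch.no_grad()
def flag_suspicious(target_model, mimic_model, x, threshold=0.5):
    """Return a boolean mask over the batch `x`: True where the target model's
    prediction diverges from the mimic model's, suggesting poisoning."""
    target_model.eval()
    mimic_model.eval()
    p_target = F.softmax(target_model(x), dim=1)
    p_mimic = F.softmax(mimic_model(x), dim=1)
    # Per-sample KL divergence between the two predictive distributions.
    kl = (p_target * (p_target.clamp_min(1e-12).log()
                      - p_mimic.clamp_min(1e-12).log())).sum(dim=1)
    # Disagreement on the predicted label is also treated as a strong signal.
    disagree = p_target.argmax(dim=1) != p_mimic.argmax(dim=1)
    return disagree | (kl > threshold)
```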