
Label-free Data Poisoning Attack Against Deep Unsupervised Domain Adaptation

Posted on: 2022-07-27    Degree: Master    Type: Thesis
Country: China    Candidate: W X Liu    Full Text: PDF
GTID: 2518306497492604    Subject: Cyberspace security

Abstract/Summary:
With the advance of big data technology, massive amounts of unlabeled data continue to accumulate across different domains. Deep unsupervised domain adaptation, which learns domain-invariant feature representations, has therefore gained increasing attention from artificial intelligence practitioners. The technique improves a model's ability to generalize across data domains by transferring knowledge from a labeled source domain to an unlabeled target domain. However, existing work shows that the security and availability of data-driven artificial intelligence are threatened by data poisoning attacks: attackers can manipulate models by injecting malicious samples into the training data. Yet existing poisoning attacks have only been demonstrated to be effective in supervised learning, while the vulnerability of deep unsupervised domain adaptation to poisoning attacks has not been explored.

We make the first attempt to analyze the vulnerability of deep unsupervised domain adaptation and propose unsupervised data poisoning attack schemes for both the white-box and the gray-box settings. By injecting poisoned data into the source domain, the attacker can mislead the domain confusion process without any supervision. Specifically, for the white-box setting we propose an unsupervised data poisoning attack based on bi-level optimization, called LFPA-w. The latent correlation between the source domain and the target domain is first exploited to assign pseudo-labels to the target domain data, and the correlation between the target data and these pseudo-labels is then destroyed through bi-level programming. For the gray-box setting, we propose a universal data poisoning attack based on ensemble models, called LFPA-g. By learning over multiple domain adaptation models, a unified poisoning optimization direction is obtained that is effective against all of them, realizing a cross-model data poisoning attack.

To improve the effect of the poisoning attacks and reduce their computational complexity, this thesis optimizes the whole cycle of initial point selection, gradient direction calculation, and gradient direction correction. The influence function is used to select the training samples that contribute most to the attack objective as the initial poisoning points; the -step reverse differentiation method is used to reduce the time complexity of the bi-level optimization; and, at the same time, gradient components that have a negative effect on the training loss are clipped. Finally, extensive experiments on real-world datasets demonstrate the effectiveness of the proposed LFPA schemes and the high vulnerability of domain adaptation to poisoning attacks.
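The bi-level structure behind an attack like LFPA-w can be illustrated on a toy problem. Everything below is an illustrative assumption rather than the thesis implementation: a 1-D least-squares model stands in for the domain adaptation network, a single poisoning point stands in for the poisoned source set, and finite differences stand in for differentiating through the inner training problem. The inner problem fits the victim model on source data plus the poison point; the outer problem moves the poison label to maximize the error on a pseudo-labeled target point.

```python
# Hypothetical bi-level poisoning sketch on a toy 1-D model (illustrative
# names and numbers; not the thesis implementation).

def fit(data):
    """Inner problem: closed-form 1-D least squares through the origin."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

def outer_loss(w, target):
    """Outer objective: squared error on the pseudo-labeled target point."""
    x_t, y_pseudo = target
    return (w * x_t - y_pseudo) ** 2

def poison_step(source, poison, target, lr=0.1, eps=1e-4):
    """One gradient-ascent step on the poison label; finite differences
    approximate the gradient through the inner training problem."""
    x_p, y_p = poison
    up = outer_loss(fit(source + [(x_p, y_p + eps)]), target)
    down = outer_loss(fit(source + [(x_p, y_p - eps)]), target)
    grad = (up - down) / (2 * eps)
    return (x_p, y_p + lr * grad)  # ascend: the attacker maximizes the loss

source = [(1.0, 1.0), (2.0, 2.0)]  # clean source data lies on y = x
target = (1.5, 1.5)                # target input with its pseudo-label
poison = (1.0, 1.2)                # initial poisoning point
for _ in range(50):
    poison = poison_step(source, poison, target)
```

In the real attack the inner problem is the training of a deep domain adaptation model, so the outer gradient is obtained by differentiating back through a limited number of training steps (the reverse differentiation mentioned above) rather than by finite differences.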
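The initial-point-selection step can also be sketched in miniature. Here a leave-one-out score is used as a crude stand-in for the influence function (which approximates the same quantity without retraining); the model, the data, and the function names are all illustrative assumptions. The training point whose removal changes the attacker's target loss the most is taken as the most promising seed for poisoning.

```python
# Hypothetical seed selection via a leave-one-out influence score
# (a stand-in for the influence function; illustrative names).

def fit(data):
    """1-D least squares through the origin."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

def target_loss(w, x_t, y_t):
    return (w * x_t - y_t) ** 2

def influence_ranking(train, x_t, y_t):
    """Rank points by |change in target loss when the point is left out|."""
    base = target_loss(fit(train), x_t, y_t)
    scores = []
    for i in range(len(train)):
        loo = train[:i] + train[i + 1:]
        scores.append((abs(target_loss(fit(loo), x_t, y_t) - base), i))
    return sorted(scores, reverse=True)

train = [(1.0, 1.0), (2.0, 2.4), (0.5, 0.4)]
x_t, y_t = 1.0, 1.0
best_seed = influence_ranking(train, x_t, y_t)[0][1]  # index of best seed
```

The actual influence function avoids the retraining loop by using a first-order Taylor approximation around the trained parameters, which is what makes the selection cheap enough for deep models.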
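For the gray-box scheme, the ensemble idea of a unified poisoning direction can be sketched as averaging the attack gradients computed on several surrogate models, followed by a simple correction step. The sign-agreement rule used here for "clipping gradient components that hurt the attack" is an illustrative assumption, not the correction rule from the thesis.

```python
# Hypothetical unified (ensemble-averaged) poisoning direction with a
# sign-agreement correction; the agreement rule is illustrative.

def unified_direction(per_model_grads):
    """Average the attack gradients computed on each surrogate model."""
    n = len(per_model_grads)
    dim = len(per_model_grads[0])
    return [sum(g[i] for g in per_model_grads) / n for i in range(dim)]

def correct_direction(avg, per_model_grads):
    """Zero out coordinates where the surrogates disagree on the sign,
    so the update is less likely to hurt the attack on any one model."""
    out = []
    for i, a in enumerate(avg):
        signs = {g[i] > 0 for g in per_model_grads}
        out.append(a if len(signs) == 1 else 0.0)
    return out

grads = [[0.5, -0.2, 0.1],   # gradient of the attack objective, model 1
         [0.3, 0.4, 0.2],    # model 2
         [0.4, -0.1, 0.3]]   # model 3
avg = unified_direction(grads)
step = correct_direction(avg, grads)  # coordinate 1 has mixed signs: zeroed
```

A perturbation built from such a consensus direction is the sense in which the attack is "universal": it is optimized to transfer across the ensemble rather than to any single model.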
Keywords/Search Tags:domain adaptation, unsupervised data poisoning, data availability, universal data poisoning