| With the development of artificial intelligence technology, AI security has become a research hotspot in the field of data security. On the one hand, growing computing power and the rapid growth of user data have enabled machine learning to develop quickly and perform well across a wide range of scenarios. On the other hand, more and more user data is exposed during model training, so model privacy and security are becoming increasingly important. A machine learning model trained on sensitive datasets may disclose user privacy and create risks for user data security. To address the security problems that user data raises in model training, a growing body of research has focused on machine learning security. The property inference attack is an attack on a model's training set in this field. It takes the model parameters as prior knowledge and trains an attack model to infer global attribute statistics of the training set. The attack applies to a variety of model structures and settings, such as support vector machines, fully connected neural networks, and federated learning. Its characteristics and broad applicability pose a serious threat to user privacy. According to the existing literature, no defense method against property inference attacks has been proposed. To protect data security and safeguard the legitimate rights and interests of data providers, this thesis proposes a defense method against property inference attacks and presents a solution for secure model training from the perspective of user privacy. The specific research work includes the following aspects: (1) A training method for structured datasets based on shadow datasets. This method uses a shadow dataset for model training to eliminate the hidden patterns that may leak user information in the model
parameters, while preserving what the model learns from the user dataset, thereby protecting the model. The method comprises five steps: flag model training, feature-space extraction, shadow dataset generation, defense model generation, and a trade-off between user privacy and model utility. Experiments show that when a model trained with this method is subjected to a property inference attack, the attributes inferred from the model differ from the real attributes of the user dataset, which protects user privacy, while the model remains efficient and usable. (2) A training method for unstructured datasets based on GAN. This method targets unstructured datasets such as images and uses a GAN to construct fake samples for defense training against property inference attacks. It comprises three steps: generative model training, unstructured shadow dataset generation, and defense model training. Experiments show that, for unstructured datasets, a model trained with this method can resist property inference attacks, protect users' data security, and maintain the robustness of the machine learning model. (3) Building on the above results, a machine learning model security training tool set is designed and implemented, providing a property inference attack detection service and a model security training service. The attack detection service helps users assess the likelihood of information disclosure when a model is subjected to a property inference attack. The model security training service provides users with reliable machine learning algorithms and carries out secure training of machine learning models. In summary, this thesis studies a secure model training method for structured datasets based on shadow datasets and a secure model training method for unstructured datasets based on GAN, and designs and implements a machine learning model security training tool set based on this research
result. The research in this thesis addresses the problem of user privacy disclosure caused by machine learning models and offers reference value for solutions to other machine learning security problems. |
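
The attack mechanism summarized in the abstract (model parameters used as prior knowledge, with an attack model trained to infer a global property of the training set) can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy data generator `sample_dataset`, the helper names `train_models` and `run_attack`, the use of small logistic regression models in place of SVMs or neural networks, and the two candidate property values (20% versus 80% of records carrying a hidden sensitive attribute). It shows only the attack side, not the proposed defenses.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, epochs=100, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    Returns the parameter list [w_1, ..., w_d, b].
    """
    d, n = len(X[0]), len(X)
    params = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            z = sum(w * v for w, v in zip(params, xi)) + params[d]
            err = sigmoid(z) - yi
            for j in range(d):
                grad[j] += err * xi[j]
            grad[d] += err
        params = [p - lr * g / n for p, g in zip(params, grad)]
    return params


def predict(params, x):
    z = sum(w * v for w, v in zip(params, x)) + params[len(x)]
    return 1 if z > 0 else 0


def sample_dataset(rng, frac, n=200):
    """Toy training set whose global property is `frac` (hypothetical data).

    A hidden attribute s (+1 with probability `frac`, else -1) shifts the
    label but is never given to the model as a feature.
    """
    X, y = [], []
    for _ in range(n):
        s = 1.0 if rng.random() < frac else -1.0
        x = rng.gauss(0.0, 1.0)
        X.append([x])
        y.append(1 if x + 1.5 * s + rng.gauss(0.0, 0.5) > 0 else 0)
    return X, y


def train_models(rng, count):
    """Train `count` models per property value; return (parameters, labels)."""
    feats, labels = [], []
    for label, frac in ((0, 0.2), (1, 0.8)):
        for _ in range(count):
            X, y = sample_dataset(rng, frac)
            feats.append(fit_logreg(X, y))
            labels.append(label)
    return feats, labels


def run_attack(seed=0):
    rng = random.Random(seed)
    # 1. The attacker trains shadow models on data with known property values.
    shadow_params, shadow_labels = train_models(rng, count=10)
    # 2. The attack model (a meta-classifier) maps parameters to the property.
    attack_model = fit_logreg(shadow_params, shadow_labels)
    # 3. Apply it to the parameters of unseen "victim" models.
    victim_params, victim_labels = train_models(rng, count=5)
    hits = sum(predict(attack_model, p) == t
               for p, t in zip(victim_params, victim_labels))
    return hits / len(victim_labels)


if __name__ == "__main__":
    print("attack accuracy:", run_attack())
```

In this toy setting the hidden attribute is never a model input, yet its proportion in the training set shifts the fitted parameters, which is the kind of leakage the attack model exploits. A defense in the spirit of the thesis would aim to make the inferred property differ from the true one while keeping the model useful.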