
Membership Inference Attacks On Social Media Health Data

Posted on: 2021-06-13  Degree: Master  Type: Thesis
Country: China  Candidate: Y T Li  Full Text: PDF
GTID: 2518306107468094  Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, with the widespread use of smartphones and tablets, the number of social media users has grown rapidly, and huge amounts of social media data have been generated. In this era of big data, extracting the value hidden in massive data sets has become a major research direction. Among data mining methods, machine learning is the most widely used and the most mature. By combining social media with machine learning, researchers have made great progress in public opinion analysis, disaster management, and marketing. However, while machine learning brings great convenience, it also faces serious security and privacy challenges.

This thesis studies the leakage of information about individual records in the training data set of a machine learning model. It focuses on the membership inference attack: given a data record and black-box access to a machine learning model, the attacker must judge whether the record belongs to the model's training data set. The thesis proposes the Vb SDG data synthesis algorithm, which generates synthetic data with the same format as, and a similar distribution to, the original training data. It also proposes the Gb MMC simulation model algorithm, which introduces a WGAN-GP network to steal the prediction ability of the target model, so that a machine learning model with similar predictive ability can be trained under black-box conditions. The thesis then examines how membership inference attacks perform on social media data and discusses the privacy breaches they may cause.

In the experiments, three real data sets (IMDB, Tweets, and Shop) and five classification models (XGBoost, logistic regression, SVM, random forest, and a neural network) were used to evaluate the data synthesis algorithm, the simulation model algorithm, and the membership inference attack. Compared with traditional data synthesis algorithms, the proposed Vb SDG algorithm produces higher-quality synthetic data. Compared with other simulation model algorithms, the proposed Gb MMC algorithm can steal the prediction ability of the target model under more stringent conditions: on the test data, the predictions of the simulation model and the target model agree 84.1% of the time on average and 93.1% of the time in the best case. For the overall membership inference attack, the best accuracy on the test data is 74% and the best precision is 86%. The experimental results show that the proposed membership inference attack is effective and accurate on social media data sets.
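The abstract reports three kinds of figures (prediction similarity, attack accuracy, and attack precision) without giving their formulas or the attack's decision rule. The sketch below shows one common way such quantities are computed. It is an illustration under stated assumptions, not the thesis's method: similarity is assumed to be the fraction of test records on which the two models predict the same label, and the attack is replaced by a simple confidence-threshold baseline. All function names and the threshold value are hypothetical.

```python
from typing import List, Sequence, Tuple


def prediction_agreement(target_labels: Sequence[int],
                         mimic_labels: Sequence[int]) -> float:
    """Fraction of records on which two models predict the same label.

    Assumed reading of the "similarity" figure in the abstract.
    """
    if len(target_labels) != len(mimic_labels) or len(target_labels) == 0:
        raise ValueError("need two non-empty label lists of equal length")
    same = sum(1 for t, m in zip(target_labels, mimic_labels) if t == m)
    return same / len(target_labels)


def threshold_attack(confidence_vectors: Sequence[Sequence[float]],
                     threshold: float = 0.9) -> List[int]:
    """Baseline attack (not the thesis's): guess 'member' (1) when the
    black-box model's highest class confidence reaches the threshold."""
    return [1 if max(vec) >= threshold else 0 for vec in confidence_vectors]


def accuracy_and_precision(is_member: Sequence[int],
                           guessed_member: Sequence[int]) -> Tuple[float, float]:
    """Accuracy and precision of membership guesses against ground truth."""
    if len(is_member) != len(guessed_member) or len(is_member) == 0:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(is_member, guessed_member))
    correct = sum(1 for y, g in pairs if y == g)
    true_pos = sum(1 for y, g in pairs if y == 1 and g == 1)
    guessed_pos = sum(1 for g in guessed_member if g == 1)
    accuracy = correct / len(pairs)
    # Precision is undefined when nothing is guessed 'member'; report 0.0.
    precision = true_pos / guessed_pos if guessed_pos else 0.0
    return accuracy, precision


if __name__ == "__main__":
    # Toy confidence vectors returned by a hypothetical black-box model.
    confidences = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90], [0.50, 0.50]]
    guesses = threshold_attack(confidences, threshold=0.9)
    print(guesses)
    print(accuracy_and_precision([1, 0, 0, 0], guesses))
    print(prediction_agreement([0, 1, 1, 0], [0, 1, 0, 0]))
```

Precision matters here because it measures how often a record flagged as a training member really is one, which is the case that constitutes a privacy breach.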
Keywords/Search Tags: Social media data, Membership inference attacks, Machine learning, Variational autoencoder, Generative adversarial network