
Research On Privacy Protection Method Of Multi-dimensional Sensitive Attribute Stream Data Publishing

Posted on: 2022-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: L Cheng
Full Text: PDF
GTID: 2518306575466864
Subject: Computer technology
Abstract/Summary:
Privacy protection is one of the most important safeguards in data publishing in the era of big data, and effective privacy protection methods help to avoid the leakage of sensitive data. Current privacy protection methods for data publishing fall mainly into two classes: methods for publishing data with multi-dimensional sensitive attributes, and methods for periodic and dynamic publishing of data with multiple sensitive attributes. With the rapid development of big data technology, there are more and more scenarios in which big data is collected and published in real time. Real-time data publishing is more complex and more diverse, which largely limits the generalization ability and practicality of traditional privacy protection models for multi-sensitive-attribute data publishing.

In response to these problems, this thesis proposes a privacy protection method, based on a sliding window model, for publishing real-time streaming data with multiple sensitive attributes. On the one hand, the sliding window model is used to solve the privacy protection problem in real-time data publishing while keeping the concealment rate and the degree of information loss low. On the other hand, the model is optimized to improve its accuracy and applicability in real-time data publishing scenarios. The main research work of this thesis is as follows:

1. A weighted optimization multi-dimensional bucket grouping algorithm based on the sliding window model is proposed. First, the real-time data stream is split into batches following the batch-processing idea of the sliding window model. Then, the data set to be published is divided into groups with similar attribute values according to the similarity of its quasi-identifiers, a maximum-capacity-first selection algorithm is used to construct weighted optimization multi-dimensional buckets, and the records to be published are mapped to the corresponding buckets to construct groups that satisfy the L-diversity model. Finally, the data is anonymized group by group and the anonymous table is published.

2. A weighted optimization multi-dimensional bucket grouping algorithm based on sensitivity values is proposed. First, to make the grouping more accurate, the sensitivity value of each sensitive attribute is taken into account when the weighted buckets are constructed: the sensitivity value is set according to how sensitive users consider each private attribute, and it takes part in the weighted calculation. Then, a selection algorithm that gives priority to the minimum number of data records is combined with the L-diversity model to map the records to the corresponding groups. Finally, the quasi-identifiers of the data to be published are anonymized and the anonymous table is published.

The experimental results show that the model proposed in this thesis achieves better results on the given data set. Compared with several commonly used privacy protection algorithms for data publishing, the model outperforms the baseline methods on indicators such as concealment rate and degree of information loss, and the published data also has better usability.
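The pipeline described in the abstract (batch the stream, bucket records by their sensitive values, form L-diverse groups, anonymize the quasi-identifiers, publish) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the function names, the non-overlapping batches, the plain largest-bucket-first greedy selection without weights or sensitivity values, the (min, max) range generalization, and the definition of concealment rate as the share of suppressed records per window are all assumptions made for illustration. Grouping by quasi-identifier similarity is omitted.

```python
from collections import defaultdict
from itertools import islice


def windows(stream, size):
    """Yield consecutive batches of up to `size` records from a stream
    (non-overlapping batches, a simplification of a sliding window)."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def group_window(batch, sensitive, l):
    """Greedily build groups of l records whose values differ on every
    sensitive attribute. Returns (groups, suppressed_records)."""
    if l < 1:
        raise ValueError("l must be at least 1")
    # One bucket per combination of sensitive values (a multi-dimensional bucket).
    buckets = defaultdict(list)
    for rec in batch:
        buckets[tuple(rec[a] for a in sensitive)].append(rec)

    groups = []
    while True:
        group = []
        seen = [set() for _ in sensitive]
        # Visit the fullest buckets first (assumed "maximum capacity first" heuristic).
        for key in sorted(buckets, key=lambda k: len(buckets[k]), reverse=True):
            if len(group) == l:
                break
            if buckets[key] and all(v not in s for v, s in zip(key, seen)):
                group.append(buckets[key].pop())
                for v, s in zip(key, seen):
                    s.add(v)
        if len(group) < l:
            # No further L-diverse group can be formed: suppress what is left.
            leftovers = group + [r for recs in buckets.values() for r in recs]
            return groups, leftovers
        groups.append(group)


def generalize(group, quasi_ids):
    """Replace each numeric quasi-identifier with the group's (min, max) range."""
    ranges = {q: (min(r[q] for r in group), max(r[q] for r in group))
              for q in quasi_ids}
    return [{**r, **ranges} for r in group]


def publish(stream, quasi_ids, sensitive, l, size):
    """Yield (anonymous_table, concealment_rate) for each window of the stream."""
    for batch in windows(stream, size):
        groups, suppressed = group_window(batch, sensitive, l)
        table = [row for g in groups for row in generalize(g, quasi_ids)]
        yield table, len(suppressed) / len(batch)
```

Calling `publish(records, ["age"], ["disease", "salary"], l=2, size=6)` on an iterable of dictionaries yields one anonymous table per window. Records that cannot be placed in an L-diverse group are dropped from the table and counted in the concealment rate, which is why a grouping strategy that leaves fewer leftovers lowers that rate.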
Keywords/Search Tags: privacy protection, data publishing, weighted bucket grouping, L-diversity, anonymity processing