Research On Privacy Protection Technology Based On Randomized Response In Association Rule Mining

Posted on: 2021-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: X P Liu
GTID: 2518306338485544
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, the rapid development of network and information technology has given rise to big data technology. When collecting and using massive data sets, people want to extract as much value as possible by finding hidden correlations in the data and applying them in production and everyday life. Data mining helps discover such latent associations and has been widely used in business analysis and medical research. However, its widespread use also brings problems. Because the mined data may contain individuals' private information, that information risks being leaked during the mining process. How to protect individual privacy and avoid leakage during data mining is therefore a hot issue in the field.

Randomized response is the mainstream perturbation mechanism among privacy protection techniques based on data distortion. Association rule mining based on randomized response has attracted wide academic attention because its model is simple, intuitive and easy to implement. Existing research on privacy-preserving data mining based on randomized response has two problems. First, the perturbed data set used for mining remains highly correlated with the original data set, which leads to unsatisfactory privacy protection. Second, traditional privacy-preserving association rule algorithms based on randomized response perturb the whole data set uniformly, without considering differences in sensitivity among attributes and among sensitive values. As a result, some low-sensitivity attribute values are overprotected and the availability of the mining data is reduced.

To solve these problems, this thesis first proposes an improved joint disturbance algorithm based on randomized response. Its joint disturbance strategy reduces the correlation between the perturbed data and the original data, which addresses the first problem and effectively improves the degree of privacy protection. Second, a personalized privacy protection algorithm is proposed to address the low availability of mining data caused by the uniform perturbation used in existing algorithms. Building on the conventional randomized response model, it classifies sensitive attributes and values by privacy level and applies a different perturbation to each class, improving the availability of the mining data while still meeting its security requirements. Finally, the thesis validates the proposed algorithms on data produced by the artificial data set generator developed by IBM Almaden Research Center, comparing them with existing algorithms in terms of mining accuracy and degree of privacy protection to demonstrate their effectiveness.
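
As a rough illustration of the conventional randomized response model that the abstract builds on, the following Python sketch perturbs a single binary item column and then recovers an estimate of its true support. It is not the thesis's joint disturbance or personalized algorithm. The function names `perturb` and `estimate_support`, the retention probability `p`, and the toy data are all assumptions introduced here for illustration. Each value is kept with probability `p` and flipped otherwise, so the observed support is p·π + (1 − p)·(1 − π) for true support π, and the estimator inverts that relation.

```python
import random


def perturb(bits, p, rng):
    """Keep each 0/1 value with probability p, flip it with probability 1 - p."""
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_support(perturbed, p):
    """Estimate the true fraction of 1s from perturbed data.

    Inverts observed = p * true + (1 - p) * (1 - true).
    """
    if p == 0.5:
        # At p = 0.5 the output is independent of the input.
        raise ValueError("support cannot be recovered when p = 0.5")
    observed = sum(perturbed) / len(perturbed)
    return (observed - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    rng = random.Random(0)                 # fixed seed for repeatability
    true_bits = [1] * 3000 + [0] * 7000    # hypothetical item, true support 0.30
    noisy = perturb(true_bits, p=0.8, rng=rng)
    print(round(estimate_support(noisy, p=0.8), 3))
```

A value of `p` close to 0.5 gives stronger privacy but a noisier support estimate, while a value close to 1 does the opposite. This is the trade-off between privacy protection and data availability that the abstract describes, and using the same `p` for every attribute corresponds to the uniform perturbation that the personalized algorithm is meant to improve on.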
Keywords/Search Tags: Privacy Preserving, Association Rule Mining, Randomized Response, Joint Disturbance, Privacy Classification