
Research Of Mining Algorithms Based On Privacy Preserving Association Rule

Posted on: 2019-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yin
Full Text: PDF
GTID: 2428330566974045
Subject: Computer technology
Abstract/Summary:
To answer the call of the "big data era", data mining is now widely applied in social life, business operations, technical research, and other fields. The rapid development of data storage technology and the spread of "Internet Plus" strongly support the collection and management of massive data. Data mining can uncover hidden knowledge in big data that has social and business value. However, sensitive data in the target dataset threatens users' privacy and data security. How to integrate privacy protection with data mining, so that sensitive data is protected while mining remains accurate, has therefore become a popular research direction in data mining.

This thesis studies privacy-preserving association rule mining algorithms, in particular the AOPAM algorithm under a centralized data distribution and improvements to it. First, the relevant background and the typical traditional algorithms are introduced and analyzed, followed by the evaluation criteria used in this field. Second, the thesis focuses on the AOPAM method, which is based on a partially hidden transition probability matrix. Because AOPAM's good protection performance comes at the cost of running time, two improvement strategies are proposed. One simplifies the inversion of the transition probability matrix using recursion and divide-and-conquer; the other speeds up the enumeration of item sets using set operation principles. Time efficiency improves because frequent item sets are constructed faster. Finally, experiments show that the improved algorithm achieves good time efficiency.

The main contributions of this thesis are as follows:

(1) Improved inversion of the transition probability matrix. When reconstructing the support of item sets, the original algorithm must invert a matrix of order 2^N × 2^N by elementary transformations, which leads to high time complexity and low efficiency. To address this, a modified method based on recursion and a divide-and-conquer strategy is proposed. It obtains the inverse matrix for N-item sets from that for (N-1)-item sets by recurrence, avoiding the costly elementary transformations on each matrix. The improved method is therefore more efficient when inverting high-order matrices.

(2) Improved counting of item set support. The original algorithm must scan the database frequently and run counting loops over 2^N cases for an N-th order matrix, which is a long and tedious process. To address this, a modified method based on the set inclusion-exclusion principle is proposed. Based on KuU, the proposed method reduces the number of database scans to N and simplifies the counting of item set support, which improves run-time efficiency.

(3) Experimental tests. The improved method is compared and analyzed in terms of efficiency, privacy, and identity error. The experiments demonstrate the effectiveness of the improved method.
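The two speed-ups described in the abstract can be illustrated with a minimal sketch. The abstract does not give AOPAM's actual matrix, so the sketch assumes a simpler MASK-style setting in which each item bit is kept independently with probability p and flipped otherwise. Under that assumption the 2^N × 2^N transition matrix is a Kronecker product of 2 × 2 blocks, and its inverse for N items follows from the inverse for N-1 items. The second function shows the inclusion-exclusion idea: the count of every exact presence/absence pattern is derived from subset supports alone. All function names and the parameter p are hypothetical and are not taken from the thesis.

```python
from fractions import Fraction
from itertools import combinations


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Plain matrix product, used only to check the inverse."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def forward_n(p, n):
    """Assumed transition matrix for n items: n-fold Kronecker power of the 2x2 block."""
    m1 = [[p, 1 - p], [1 - p, p]]
    return m1 if n == 1 else kron(m1, forward_n(p, n - 1))


def inverse_1(p):
    """Closed-form inverse of [[p, 1-p], [1-p, p]]; requires p != 1/2."""
    det = 2 * p - 1
    return [[p / det, (p - 1) / det], [(p - 1) / det, p / det]]


def inverse_n(p, n):
    """Inverse for n items built from the inverse for n-1 items.

    Uses (A kron B)^-1 = A^-1 kron B^-1, so no elimination is ever
    performed on the full 2^n x 2^n matrix.
    """
    if n == 1:
        return inverse_1(p)
    return kron(inverse_1(p), inverse_n(p, n - 1))


def exact_pattern_counts(items, support):
    """Counts of every exact presence/absence pattern via inclusion-exclusion.

    `support` maps each frozenset of items to the number of transactions
    containing all of them (the empty set maps to the total number of
    transactions). The result maps each subset S to the number of
    transactions that contain exactly S among `items`.
    """
    items = list(items)
    subsets = [frozenset(c)
               for r in range(len(items) + 1)
               for c in combinations(items, r)]
    return {
        s: sum((-1) ** (len(t) - len(s)) * support[t] for t in subsets if s <= t)
        for s in subsets
    }
```

With exact fractions, for example p = Fraction(9, 10), multiplying forward_n(p, n) by inverse_n(p, n) yields the identity matrix, which is a quick sanity check on the recursion. Whether AOPAM's partially hidden matrix has the same Kronecker structure is not stated in the abstract.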
Keywords/Search Tags:data mining, association rule, privacy preserving, time efficiency