
Research on Privacy-Preserving Algorithms for Association Rule Mining

Posted on: 2013-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: K P Zhang
GTID: 2248330362474071
Subject: Computer application technology
Abstract/Summary:
With the development of computer hardware, Internet, and multimedia technology, the volume of data that people hold has reached an enormous scale, and data mining makes it feasible to process and analyze this stored data to extract the knowledge hidden in it. As data mining has developed and spread, however, the risk of leaking personal private information during the mining process has grown steadily, and privacy-preserving data mining has therefore become an important topic in the field.

This thesis first summarizes the basic concepts of privacy-preserving data mining, the current state of research, and several related algorithms, and then takes privacy-preserving algorithms for association rule mining as its primary research target.

We analyze the MASK algorithm, which applies a randomized transformation to the data used in association rule mining. Compared with other related algorithms, MASK obtains relatively accurate results while preserving a high degree of privacy during mining, but its low time efficiency limits its practical application. The XMASK algorithm simplifies the calculation by using a recurrence between adjacent probability matrices, which reduces the exponential complexity of inverting the probability matrix when reconstructing the original support of itemsets from the distorted database. XMASK has been shown to perform better than MASK.

Building on the main idea behind XMASK's improvement, this thesis attempts a further optimization that exploits the properties of the Boolean database used in association rule mining, in order to reduce the cost of the algorithm more effectively. The improved algorithm stores the counts of all all-'1' itemsets in a dynamic hash table during mining. When reconstructing the original support of an n-itemset, it counts only the all-'1' combination in the distorted database; the counts of the other combinations are calculated from the intermediate results in the hash table. This reduces the number of queries against the distorted database and consequently improves the time efficiency of the algorithm. Theoretical analysis indicates that the improved algorithm has better time performance than the original MASK, at the cost of some additional space. Experiments also show that the runtime efficiency of the improved algorithm is better than that of both the original MASK and XMASK.
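
The hash-table idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes the usual MASK distortion model, in which each bit is kept with probability p and flipped otherwise (p must not be 0.5), and it derives the counts of the other bit combinations from cached all-'1' counts by inclusion-exclusion, which is one plausible reading of "using the intermediate results in the hash table". To avoid an explicit matrix inversion, the sketch relies on the n-item distortion matrix being a Kronecker power of the 2x2 one; that shortcut is a convenience of this sketch, not the XMASK recurrence. All function names are hypothetical.

```python
import random
from itertools import combinations, product


def distort(db, p, rng):
    """MASK-style distortion: keep each bit with probability p, flip it otherwise."""
    return [[bit if rng.random() < p else 1 - bit for bit in row] for row in db]


def all_ones_count(db, items):
    """Scan the database once, counting rows where every item in `items` is 1."""
    return sum(all(row[i] for i in items) for row in db)


def cached_ones_count(db, items, cache):
    """Look up an all-'1' count in the hash table; scan the database only on a miss."""
    key = frozenset(items)
    if key not in cache:
        cache[key] = all_ones_count(db, key)
    return cache[key]


def pattern_counts(db, items, cache):
    """Counts of all 2^n bit patterns over `items`, derived by inclusion-exclusion
    from the all-'1' counts of subsets instead of one scan per pattern."""
    items = tuple(items)
    counts = {}
    for pattern in product((0, 1), repeat=len(items)):
        ones = [it for it, b in zip(items, pattern) if b]
        zeros = [it for it, b in zip(items, pattern) if not b]
        total = 0
        for k in range(len(zeros) + 1):
            for extra in combinations(zeros, k):
                total += (-1) ** k * cached_ones_count(db, ones + list(extra), cache)
        counts[pattern] = total
    return counts


def estimate_support(distorted, items, p, cache):
    """Estimate the original support of `items` from the distorted database.

    Uses the all-'1' row of the inverse distortion matrix, whose entries are
    products of entries of the inverse 2x2 matrix (Kronecker structure).
    `cache` must belong to this one distorted database.
    """
    det = 2 * p - 1  # determinant of [[p, 1-p], [1-p, p]]; zero when p == 0.5
    inv_row = {1: p / det, 0: -(1 - p) / det}  # original bit 1 vs. distorted bit
    estimate = 0.0
    for pattern, count in pattern_counts(distorted, items, cache).items():
        weight = 1.0
        for bit in pattern:
            weight *= inv_row[bit]
        estimate += weight * count
    return estimate / len(distorted)


if __name__ == "__main__":
    rng = random.Random(7)
    original = [[rng.randint(0, 1) for _ in range(3)] for _ in range(5000)]
    distorted = distort(original, 0.9, rng)
    table = {}  # the "dynamic hash table" of all-'1' counts
    print("true support:     ", all_ones_count(original, (0, 1)) / len(original))
    print("estimated support:", estimate_support(distorted, (0, 1), 0.9, table))
```

Because the cache is keyed by itemset, the counts gathered for 1-itemsets and 2-itemsets are reused when a 3-itemset is processed, so only the new all-'1' combination requires a scan of the distorted database. The price is the extra memory for the table, which matches the time-for-space trade-off the abstract reports.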
Keywords/Search Tags: Data Mining, Association Rule, Privacy Preserving, Time Efficiency