Research On Clustering Algorithm Of Data Table Anonymity

Posted on: 2018-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: L J Wu
Full Text: PDF
GTID: 2348330518998934
Subject: Engineering
Abstract/Summary:
Data and information technology are developing rapidly, and the Internet has become the main platform for scientific research, business negotiation, information exchange, and data sharing. As more information is shared online, a great deal of private information is leaked as well. Internet resource sharing therefore has two sides: on the one hand, research institutes want relevant data at minimum cost; on the other hand, individuals do not want their data leaked on the Internet. Balancing these two interests and solving the information leakage problem has become a vital research direction in the field of privacy protection. The goal of privacy protection in data release is to minimize the risk of privacy leakage while maximizing data availability. To this end, many experts and scholars have studied privacy leakage on the Internet. Anonymization was proposed early on as a way to achieve privacy protection, and research on the k-anonymity rule in particular is extensive and mature. Based on the k-anonymity rule, this thesis seeks to improve the classical fixed-size microaggregation algorithm MDAV in order to obtain a better-performing algorithm. The main work of this thesis consists of the following two parts:

(1) The classical fixed-size microaggregation algorithm MDAV is computationally efficient, but after clustering the data are vulnerable to homogeneity attacks, because the elements within an equivalence class are extremely similar. The l-diversity rule is better at preventing homogeneity attacks. This thesis proposes a new (l,d,e)-MDAV algorithm by making MDAV satisfy the l-diversity rule and introducing an attribute differentiation e and a parameter d. Each anonymized cluster then not only contains differing elements but also has d attribute values whose difference value is greater than or equal to e. Experimental results show that (l,d,e)-MDAV maintains the high efficiency of MDAV while effectively reducing the risk of leakage.

(2) Because (l,d,e)-MDAV is derived from the k-anonymity rule, it is time-consuming and inefficient on complex data sets. The k-means algorithm is therefore introduced, and an MLDM-anonymity algorithm is proposed. When dealing with a large data set, it divides the set into small pieces, anonymizes each small data set with (l,d,e)-MDAV, and finally merges the pieces back into one large data set. Simulation results demonstrate that MLDM performs better when anonymizing large data sets.

In summary, this thesis discusses privacy protection in data release and proposes two improved algorithms to strengthen the protection of private information.
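
To make the homogeneity problem in part (1) concrete, the following is a minimal Python sketch of the classical MDAV baseline together with a distinct l-diversity check. It is not the (l,d,e)-MDAV algorithm of this thesis, whose treatment of d and e is not specified in the abstract. The function names (`mdav`, `is_l_diverse`), the sample records, and the sensitive values are all hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a list of numeric tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def farthest(records, candidates, point):
    """Index in candidates whose record is most distant from point."""
    return max(candidates, key=lambda i: dist(records[i], point))


def take_cluster(records, remaining, seed, k):
    """Remove seed and its k-1 nearest neighbours from remaining."""
    remaining.remove(seed)
    near = sorted(remaining, key=lambda i: dist(records[i], records[seed]))
    chosen = near[:k - 1]
    for i in chosen:
        remaining.remove(i)
    return [seed] + chosen


def mdav(records, k):
    """Classical MDAV: clusters (lists of indices) of size k to 2k-1."""
    if len(records) < k:
        raise ValueError("need at least k records")
    remaining = list(range(len(records)))
    clusters = []
    while len(remaining) >= 3 * k:
        c = centroid([records[i] for i in remaining])
        r = farthest(records, remaining, c)
        r_point = records[r]
        clusters.append(take_cluster(records, remaining, r, k))
        # Assumption: s is picked after r's cluster is removed, which
        # avoids s == r when many records are identical.
        s = farthest(records, remaining, r_point)
        clusters.append(take_cluster(records, remaining, s, k))
    if len(remaining) >= 2 * k:
        c = centroid([records[i] for i in remaining])
        r = farthest(records, remaining, c)
        clusters.append(take_cluster(records, remaining, r, k))
    clusters.append(remaining)  # the last k to 2k-1 records
    return clusters


def is_l_diverse(clusters, sensitive, l):
    """Distinct l-diversity: every cluster has >= l sensitive values."""
    return all(len({sensitive[i] for i in c}) >= l for c in clusters)


# Hypothetical quasi-identifiers (age, score) and sensitive attribute.
records = [(25, 50), (26, 52), (27, 51), (60, 90),
           (61, 92), (62, 91), (40, 70)]
sensitive = ["flu", "flu", "flu", "cancer", "cancer", "cancer", "flu"]

clusters = mdav(records, k=3)
print(clusters)
print(is_l_diverse(clusters, sensitive, l=2))
```

In this toy data, records that are close in the quasi-identifiers also share the same sensitive value, so the k-anonymous clusters fail the 2-diversity check. This is the kind of homogeneity that motivates adding an l-diversity constraint to MDAV.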
Keywords/Search Tags: Privacy Preservation, k-anonymity, l-diversity, Microaggregation