
Research On Microaggregation Of Density Clustering For Privacy Preserving Based On Grey Relational Analysis

Posted on: 2015-08-23 | Degree: Master | Type: Thesis
Country: China | Candidate: J J Liao | Full Text: PDF
GTID: 2308330461974627 | Subject: Information management and information systems
Abstract/Summary:
With the development of data mining technology and its wide application in everyday life, privacy disclosure caused by data publishing and sharing is becoming increasingly serious, and in some cases affects people's lives. At the same time, as attention to personal privacy grows, data mining faces new difficulties in data analysis and publication. Privacy-preserving algorithms have therefore become a research topic as active as data mining itself. Among these algorithms, k-anonymity requires that every record with sensitive attributes in the published table share its quasi-identifier values with at least k-1 other records, so that at least k records are identical on the quasi-identifiers. An attacker therefore cannot easily tell which of those k records belongs to a given individual, which reduces the risk of privacy disclosure to a certain degree. In recent years, microaggregation has been applied to achieve k-anonymity on data sets and has produced better privacy-preserving results. The basic idea of microaggregation is to divide the data into groups according to the similarity between tuples, with each group containing at least k tuples, then compute the centroid of each group and replace every tuple in the group with that centroid. This forms a set of equivalence groups and makes the data set k-anonymous. The approach to microaggregation taken in this study is as follows. First, the density-based clustering algorithm DBSCAN is applied to microaggregation in combination with the optimal k-partition conditions. Second, the grey relational degree is used to measure the similarity between tuples of the data set. Third, based on the properties of the grey relational degree, a grey relational density index is constructed and used as the weight when computing the group centroid.
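The group-then-replace step of microaggregation described above can be illustrated with a minimal sketch. It is not the thesis's algorithm: the grouping below simply sorts records on their first attribute and cuts them into runs of k, which stands in for the density-based clustering step, and the names `centroid` and `microaggregate` are hypothetical.

```python
from typing import List, Sequence


def centroid(group: Sequence[Sequence[float]]) -> List[float]:
    """Attribute-wise arithmetic mean of the records in one group."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def microaggregate(records: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Replace every record with the centroid of its group (size >= k).

    ASSUMPTION: groups are formed by sorting on the first attribute and
    cutting into runs of k records. This is a placeholder for a real
    similarity-based grouping such as the clustering step in the abstract.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")

    order = sorted(range(len(records)), key=lambda i: records[i][0])
    n_groups = len(records) // k
    groups = [order[g * k:(g + 1) * k] for g in range(n_groups)]
    # Leftover records join the last group so no group is smaller than k.
    groups[-1].extend(order[n_groups * k:])

    anonymized: List[List[float]] = [[] for _ in records]
    for members in groups:
        c = centroid([records[i] for i in members])
        for i in members:
            anonymized[i] = list(c)
    return anonymized


if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [10.0, 100.0], [11.0, 110.0]]
    for row in microaggregate(data, k=2):
        print(row)
```

After this step every published record is shared by at least k original records, which is the k-anonymity property on the aggregated attributes.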
The specific contents are as follows. First, DBSCAN density clustering is combined with the optimal k-partition conditions and applied to microaggregation, forming a microaggregation algorithm based on density clustering (DBAV). Second, the grey relational degree replaces the Euclidean distance as the measure of similarity between tuples in order to improve DBAV, yielding a new algorithm based on grey relational analysis (GDBAV). Third, the centroid computation takes the dispersion of the sensitive attribute values into account by assigning a weight to each tuple in the group: a density index derived from the grey relational degree serves as the weight of each tuple, and the group centroid is computed from the tuples and their corresponding weights. This weighted centroid calculation is used to improve GDBAV, forming a new microaggregation algorithm of density clustering based on grey relational analysis, (k, e)-GDBAV. Next, experiments are carried out to validate the proposed model and algorithms, analyzing their effectiveness in terms of the information loss and the privacy disclosure risk of the anonymized data set. Finally, the results of the study are summarized and directions for further research are pointed out.
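As a rough illustration of a grey-relational-weighted centroid, the sketch below uses the standard grey relational coefficient, (Δmin + ρ·Δmax) / (Δ + ρ·Δmax), averaged over attributes to give a grey relational degree between two tuples. Everything beyond that is an assumption, since the abstract does not give the formulas: the "density index" of a tuple is taken here to be its mean grey relational degree to the other tuples in the group, Δmin and Δmax are taken over all pairs in the group, ρ defaults to 0.5, and attributes are assumed to be already scaled to comparable ranges. The function names are hypothetical.

```python
from typing import List, Sequence


def _degree(delta_row: Sequence[float], d_min: float, d_max: float, rho: float) -> float:
    """Grey relational degree: mean grey relational coefficient over attributes."""
    if d_max == 0:  # all tuples identical, so they are maximally related
        return 1.0
    return sum((d_min + rho * d_max) / (d + rho * d_max) for d in delta_row) / len(delta_row)


def density_weights(group: Sequence[Sequence[float]], rho: float = 0.5) -> List[float]:
    """Normalized per-tuple weights.

    ASSUMPTION: a tuple's density index is its mean grey relational degree
    to the other tuples in the group; the thesis may define it differently.
    """
    n = len(group)
    if n == 1:
        return [1.0]
    # Absolute attribute differences for every unordered pair of tuples.
    deltas = {
        (i, j): [abs(a - b) for a, b in zip(group[i], group[j])]
        for i in range(n) for j in range(i + 1, n)
    }
    flat = [d for row in deltas.values() for d in row]
    d_min, d_max = min(flat), max(flat)

    density = []
    for i in range(n):
        degrees = [
            _degree(deltas[(min(i, j), max(i, j))], d_min, d_max, rho)
            for j in range(n) if j != i
        ]
        density.append(sum(degrees) / len(degrees))
    total = sum(density)
    return [d / total for d in density]


def weighted_centroid(group: Sequence[Sequence[float]], rho: float = 0.5) -> List[float]:
    """Group centroid in which each tuple contributes in proportion to its weight."""
    weights = density_weights(group, rho)
    return [sum(w * v for w, v in zip(weights, column)) for column in zip(*group)]


if __name__ == "__main__":
    group = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0]]
    print(density_weights(group))
    print(weighted_centroid(group))
```

Under these assumptions a tuple that lies far from the rest of its group receives a smaller weight, so the centroid is pulled toward the denser part of the group compared with the plain arithmetic mean.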
Keywords/Search Tags:privacy preserving, k-anonymity, DBSCAN, grey relational analysis, microaggregation