
Analyzing And Researching Based On Privacy-Preserving Clustering

Posted on: 2009-07-18 | Degree: Master | Type: Thesis
Country: China | Candidate: D J Jiang | Full Text: PDF
GTID: 2178360275951026 | Subject: Computer application technology
Abstract/Summary:
Privacy preservation is an important research direction in data mining. Its purpose is to obtain mining results with data mining tools without disclosing private data. Researchers have proposed a number of privacy-preserving algorithms covering both horizontally and vertically partitioned data, but only a few address privacy-preserving clustering. Those that do have high complexity and low efficiency, are mostly based on k-means, and either are limited to two parties or depend on a trusted third party, so their security and reliability can hardly meet users' needs. Methods for privacy-preserving data mining include secure multi-party computation and data distortion. Secure multi-party computation builds multi-party algorithms on top of secure protocols, and its aim is to construct protocols that are both secure and efficient; data distortion conceals the true data by changing the distribution of the original data. This thesis uses secure multi-party computation and data distortion to overcome the drawbacks mentioned above. The main contributions can be summarized as follows:
(1) Presents a hierarchical-k-means clustering algorithm that combines hierarchical clustering with k-means clustering and overcomes the drawback of k-means of choosing its initial cluster centers randomly.
(2) Presents a privacy-preserving hierarchical-k-means clustering algorithm on horizontally partitioned data for semi-honest parties. Several secure protocols are constructed (secure distance, secure clustering center, secure comparison, and standardization), with which a third party and the data parties jointly solve the problem of privacy-preserving hierarchical-k-means clustering in the semi-honest model. Theoretical argument and example analysis demonstrate that the algorithm is effective.
(3) Constructs a secure comparison protocol and presents privacy-preserving DBSCAN on horizontally partitioned data for the semi-honest model, in which a third party and the data parties jointly solve the problem of privacy-preserving DBSCAN. Theoretical argument and example analysis demonstrate that the algorithm is effective.
(4) Applies random distortion based on orthogonal transformation to the computation of inner products between attributes and to hierarchical-k-means clustering between data objects. Error is reduced by choosing a random matrix that meets certain conditions. Theoretical argument and experiments demonstrate that the method protects privacy effectively and keeps the difference in mining precision before and after distortion within a small range.
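The idea behind distortion by orthogonal transformation can be illustrated with a minimal sketch. It is a generic illustration, not the thesis's own construction: the abstract does not state which conditions the random matrix must satisfy, so none are modeled here. The helper names (`random_orthogonal`, `distort`, `distance`) and the sample records are hypothetical. The sketch builds a random orthogonal matrix Q by Gram-Schmidt orthonormalization and maps every record x to Qx. Because Q is orthogonal, inner products and Euclidean distances between records are unchanged up to floating-point error, which is why distance-based clustering can run on the distorted data.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def random_orthogonal(n, rng):
    """Return an n x n orthogonal matrix as a list of orthonormal rows.

    Hypothetical helper: modified Gram-Schmidt on random Gaussian vectors.
    """
    basis = []
    while len(basis) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the rows already accepted.
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-8:  # skip (nearly) dependent draws
            basis.append([x / norm for x in v])
    return basis


def distort(records, q):
    """Map each record x to Q x, hiding the original attribute values."""
    return [[dot(row, x) for row in q] for x in records]


def distance(u, v):
    """Euclidean distance between two records."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Illustrative data only: three records with three numeric attributes.
rng = random.Random(0)
records = [[1.0, 2.0, 3.0], [4.0, 0.0, -1.0], [2.5, 2.5, 0.5]]
q = random_orthogonal(3, rng)
distorted = distort(records, q)

# Pairwise distances before and after distortion agree up to rounding.
for i in range(len(records)):
    for j in range(i + 1, len(records)):
        before = distance(records[i], records[j])
        after = distance(distorted[i], distorted[j])
        print(i, j, abs(before - after) < 1e-9)
```

A data holder would publish only `distorted` and keep `q` secret. A plain orthogonal map by itself gives limited protection if an attacker knows some original records, which may be one reason a scheme would place additional conditions on the matrix.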
Keywords/Search Tags: data mining, privacy preserving, clustering, semi-honest model, malicious model, secure multi-party computation, data distortion