Analyzing And Researching Based On Privacy-Preserving Clustering

Posted on:2009-07-18

Degree:Master

Type:Thesis

Country:China

Candidate:D J Jiang


GTID:2178360275951026

Subject:Computer application technology

Abstract/Summary:

Privacy preservation is an important research direction in data mining. Its purpose is to obtain mining results with data mining tools without disclosing private data. Researchers have proposed a number of privacy-preserving algorithms for both horizontally and vertically partitioned data, but only a few of them address privacy-preserving clustering. Those that do have high complexity and low efficiency, are mostly based on k-means, and either are limited to two parties or depend on a trusted third party, so their security and reliability rarely meet users' needs.

The methods of privacy-preserving data mining include secure multi-party computation and data distortion. Secure multi-party computation builds multi-party algorithms on top of secure protocols, and its aim is to construct protocols that are both secure and efficient. Data distortion conceals the true data by changing the distribution of the original data. This thesis uses secure multi-party computation and data distortion to overcome the drawbacks mentioned above. The main contributions can be summarized as follows:

(1) A hierarchical-k-means clustering algorithm is presented that combines hierarchical clustering with k-means clustering, overcoming the drawback that k-means chooses its initial cluster centers randomly.

(2) A privacy-preserving hierarchical-k-means clustering algorithm on horizontally partitioned data is presented for the semi-honest model. Several secure protocols are constructed (secure distance, secure clustering center, secure comparison, and standardization), with which a third party and the data parties jointly solve the problem of privacy-preserving hierarchical-k-means clustering among semi-honest parties. Theoretical argument and example analysis demonstrate that the algorithm is effective.

(3) A secure comparison protocol is constructed, and a privacy-preserving DBSCAN algorithm on horizontally partitioned data is presented for the semi-honest model, in which a third party and the data parties jointly solve the problem of privacy-preserving DBSCAN among semi-honest parties. Theoretical argument and example analysis demonstrate that the algorithm is effective.

(4) Random distortion based on orthogonal transformation is applied to the computation of inner products between attributes and to hierarchical-k-means clustering of data objects. Error is reduced by choosing a random matrix that satisfies certain conditions. Theoretical argument and experiments demonstrate that the method protects privacy effectively and keeps the difference in mining precision before and after distortion within a small range.
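Contributions (1) and (4) can be illustrated with a small sketch. It is not the thesis's algorithm: the abstract does not give the linkage rule, the conditions placed on the random matrix, or any protocol details, so the centroid-linkage seeding, the Gram-Schmidt construction of the orthogonal matrix, the sample points, and all function names below are assumptions chosen for illustration. The sketch seeds k-means with centroids from an agglomerative pass, then checks that multiplying every data object by a random orthogonal matrix preserves inner products and therefore leaves the clustering unchanged.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroid(cluster):
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def hierarchical_seeds(points, k):
    """Assumed seeding rule: merge the two clusters with the closest
    centroids until k clusters remain, then return their centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given centers."""
    labels = []
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def hierarchical_kmeans(points, k):
    return kmeans(points, hierarchical_seeds(points, k))

def random_orthogonal(n, rng):
    """Orthonormalize random Gaussian rows (Gram-Schmidt) into an
    n x n orthogonal matrix; a hypothetical choice of random matrix."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in rows:
            c = dot(v, q)
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-8:  # discard nearly dependent draws
            rows.append([a / norm for a in v])
    return rows

def distort(points, q):
    """Replace each data object x by Qx."""
    return [[dot(row, p) for row in q] for p in points]

# Made-up sample data: two well-separated groups of 2-D objects.
points = [[1.0, 1.1], [1.3, 0.8], [0.9, 1.6],
          [8.0, 8.2], [8.5, 7.7], [7.6, 8.9]]
q = random_orthogonal(2, random.Random(0))
hidden = distort(points, q)

labels_plain = hierarchical_kmeans(points, 2)
labels_distorted = hierarchical_kmeans(hidden, 2)
inner_product_error = abs(dot(points[0], points[3]) - dot(hidden[0], hidden[3]))
print(labels_plain, labels_distorted, inner_product_error)
```

Because an orthogonal matrix Q satisfies (Qx)·(Qy) = x·y, distances between objects are unchanged up to floating-point rounding, which is why the two label lists agree on this data. How much privacy such a transformation actually provides depends on how the matrix is chosen and protected, which this sketch does not address.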

Keywords/Search Tags:

data mining, privacy preserving, clustering, semi-honest model, malicious model, secure multi-party computation, data distortion

Related items

1	Research Of Privacy Preserving Outlier Detection
2	Research On Outsourcing Privacy-Preserving Data Classification Method
3	Research On Applications And Protocols Of Secure Multi-party Computation
4	Research On Several Secure Multi-Party Computation Problems And Applications
5	Research On Data Privacy Preservation Technologies Using Secure Multi-Party Computation
6	Study On Privacy Preserving Classification Data Mining
7	Research On Privacy-Preserving Data Mining Algorithms
8	Researches On Privacy-Preserving Data Aggregation And Secure Two-Party Computation
9	Research On Distributed Data Mining Based On Privacy Preserving
10	Using Secure Enclaves For Efficient Multi-party Computation