
Research on a Privacy-Preserving Classification Algorithm for Horizontally Distributed Data

Posted on: 2018-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: R Tao
GTID: 2428330596954798
Subject: Software engineering
Abstract/Summary:
With the rapid growth of data volumes in the information age, traditional data mining methods have become inadequate for large-scale data, and distributed data mining is increasingly widely applied. However, sharing information in a distributed environment may reveal sensitive personal information and increase the risk of privacy disclosure. Privacy Preserving Data Mining (PPDM) has therefore become an increasingly important topic in the field of data mining. This thesis presents a PPDM framework based on the SVM classifier, namely PPNL-SVM (Privacy Preserving Non-Linear SVM). The framework protects data privacy while performing classification mining on horizontally distributed data. Specifically, the work is as follows:

(1) To protect data privacy in distributed data mining while ensuring classification accuracy and efficiency, the thesis proposes the PPNL-SVM framework for constructing a PPDM classification model. Under the conditions of horizontally distributed data and the semi-honest model of Secure Multi-party Computation, the framework is divided into three layers. The bottom layer protects data privacy by using the secure sum protocol and the Paillier homomorphic encryption scheme to encrypt the data center points, which are selected by the k-means clustering algorithm. The middle layer uses the Nyström approximation and matrix decomposition to reduce communication and computation costs. The top layer uses the cutting-plane technique to speed up training of the classification model. The PPNL-SVM framework does not rely on a trusted third party; all participants cooperate on an equal footing. The secure sum protocol and the Paillier homomorphic encryption scheme guarantee the security of the framework while still yielding effective classification results.

(2) A basic SVM classifier handles only two classes. To overcome this limitation, the PPNL-SVM framework is extended to multi-class classification of horizontally distributed data. In a one-to-many (one-versus-rest) scheme, a multi-class problem is decomposed into several binary classification problems, each of which is solved with the PPNL-SVM framework. Because the security and efficiency of the PPNL-SVM framework are guaranteed, those of the multi-class classification are guaranteed as well. The experimental results show that the PPNL-SVM framework not only protects data privacy effectively and improves classifier performance, but also effectively solves multi-class classification problems.
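As an illustration of the secure sum protocol named in the abstract, the following is a minimal single-process sketch of the classic ring-based variant under the semi-honest model. It is not the thesis's implementation. The function names `secure_sum` and `secure_sum_vectors`, the constant `MODULUS`, and the assumption that each party's data (for example, a cluster center) has already been encoded as non-negative integers are all hypothetical choices made for this example.

```python
import secrets

# Assumed public modulus; it must exceed any possible true sum.
MODULUS = 2**61 - 1


def secure_sum(private_values, modulus=MODULUS):
    """Simulate a ring-based secure sum.

    private_values[i] is the non-negative integer held by party i.
    Party 0 acts as the initiator and is the one that recovers the total.
    """
    if not private_values:
        raise ValueError("need at least one party")
    # Party 0 hides its value behind a random mask known only to itself.
    mask = secrets.randbelow(modulus)
    running = (mask + private_values[0]) % modulus
    # Every other party adds its value to the masked running total and
    # forwards it; the number it receives looks uniformly random.
    for value in private_values[1:]:
        running = (running + value) % modulus
    # The total comes back to party 0, which removes the mask.
    return (running - mask) % modulus


def secure_sum_vectors(private_vectors, modulus=MODULUS):
    """Apply secure_sum coordinate by coordinate to equal-length vectors."""
    return [secure_sum(list(column), modulus) for column in zip(*private_vectors)]


if __name__ == "__main__":
    print(secure_sum([3, 5, 7]))                           # 15
    print(secure_sum_vectors([[1, 2], [3, 4], [5, 6]]))    # [9, 12]
```

The sketch only conveys the masking idea. The protocol needs at least three parties to hide anything, because with two parties the initiator can subtract its own value from the total, and it offers no protection if a party's two ring neighbours collude.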
Keywords/Search Tags: Distributed Data Mining, Privacy Preserving, SVM Classification, Multi-Class Classification