
Research On Privacy Preserving Technology For Distributed Support Vector Machines

Posted on: 2022-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Huang
GTID: 2518306605967789
Subject: Communication and Information System
Abstract/Summary:
The support vector machine (SVM) is a widely used supervised learning method, mainly used to construct a generalized linear classifier for binary classification. It is widely applied in text classification, handwritten character recognition, and bioinformatics. Traditional SVMs rely on a centralized server to gather raw data and train the classifier. With the explosion of data, such a centralized server becomes impractical. In real-world settings the training sample set can be very large: the hardware of a single machine cannot operate directly on all the data, and it is difficult to guarantee the accuracy of the classification results. In addition, data sets are often distributed across different entities, and collaborative training among them raises privacy and security issues. There is therefore an urgent need to train an SVM classifier in a distributed and privacy-preserving manner through the collaboration of multiple data holders.

At present, this type of machine learning problem is generally solved with the alternating direction method of multipliers (ADMM). ADMM divides a large problem into multiple smaller subproblems that can be solved in parallel in a distributed manner, and obtains the final optimal solution through the exchange of parameters among the participating entities. However, simply keeping the original data local is not enough for privacy protection; the parameters exchanged during the ADMM iterations must also be protected. To this end, a new ADMM algorithm is needed that incorporates secure multi-party computation to achieve privacy protection. Secure multi-party computation allows multiple mutually distrusting participants to compute collaboratively and output the result, guaranteeing that each participant learns only the final classification result and not the state information of the other participants.

Therefore, to protect the privacy of the intermediate states during SVM training, this thesis focuses on a federated support vector machine scheme based on privacy-preserving ADMM that combines secure multi-party computation and secret sharing. The scheme is denoted FSVM. The main contributions are as follows:
(1) The system model for the proposed FSVM scheme is presented, and the general optimization problems for deriving SVM classifiers are formulated for data partitioned by examples and by features.
(2) In FSVM-C, an existing ADMM variant that allows time-varying matrices is combined with secret sharing to achieve privacy preservation more efficiently.
(3) For FSVM-S, the closed form of the ADMM iterations that converge to the optimal solution for constructing the SVM classifier is derived, and Shamir secret sharing is introduced to preserve the privacy of the enrolled participants.
(4) The FSVM scheme is implemented on the real-world MNIST and Breast-cancer datasets, and comprehensive experimental results verify the efficiency and effectiveness of both FSVM-S and FSVM-C.
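To illustrate how secret sharing can hide the parameters exchanged between participants, the following Python sketch computes the sum of several private integers using Shamir secret sharing. It is a minimal illustration under stated assumptions, not the FSVM protocol itself: the function names (`make_shares`, `reconstruct`, `secure_sum`), the field modulus `PRIME`, and the restriction to non-negative integer inputs are all hypothetical choices made for this example. A real ADMM deployment would also need a fixed-point encoding for real-valued parameters and authenticated channels between participants.

```python
import secrets

# Field modulus for the sketch (a Mersenne prime); an assumption, not from the thesis.
PRIME = 2**61 - 1


def make_shares(secret, threshold, num_parties):
    """Split `secret` into `num_parties` Shamir shares.

    Any `threshold` shares reconstruct the secret; fewer reveal nothing.
    """
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_parties + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def secure_sum(local_values, threshold):
    """Sum the parties' private values without revealing any single value.

    Each party shares its value; party k adds up the k-th share it receives
    from everyone. Because Shamir sharing is linear, the summed shares are
    valid shares of the total, which is then reconstructed.
    """
    n = len(local_values)
    all_shares = [make_shares(v, threshold, n) for v in local_values]
    summed = []
    for k in range(n):
        x = all_shares[0][k][0]
        y = sum(party_shares[k][1] for party_shares in all_shares) % PRIME
        summed.append((x, y))
    return reconstruct(summed[:threshold])


if __name__ == "__main__":
    print(secure_sum([12, 30, 8], threshold=2))  # prints 50
```

Only the summed shares are ever combined, so reconstruction yields the aggregate and not any individual input. This mirrors the goal stated above that each participant learns the final result but not the state of the others.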
Keywords/Search Tags: ADMM, Privacy Preserving, Distributed Support Vector Machines, Secure Multi-party Computation