Research On Privacy-preserving Support Vector Machine

Posted on: 2015-09-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Sun
Full Text: PDF
GTID: 1228330467450321
Subject: Strategy and management
Abstract/Summary:
The support vector machine (SVM), which is based on statistical learning theory, is one of the most powerful data mining algorithms. It embodies the structural risk minimization principle and deals effectively with classification problems. It applies statistical learning theory successfully in practice and shows outstanding performance. The SVM approach has been widely used in real-world applications such as text classification, handwriting recognition, image recognition and computer-aided medical diagnosis.

Recently, privacy preservation has attracted great interest as sensitive data are increasingly disclosed for commercial or legal reasons. A growing body of research seeks ways to obtain an SVM classifier without releasing private information; such methods are called privacy-preserving SVM, or PPSVM for short. In this work, we study in depth the PPSVM algorithm on vertically partitioned data for supervised and semi-supervised classification, and propose four improved variants:

1. P3SVM: To enhance the performance of the 1-norm PPSVM, we propose a new privacy-preserving proximal support vector machine (P3SVM) for classification of vertically partitioned data. It should be pointed out that P3SVM is not a direct extension of PPSVM from the 1-norm SVM to the proximal SVM. Instead of a completely random kernel, P3SVM uses a global random reduced kernel composed of local reduced kernels with Gaussian perturbations. This formulation leads to an extremely simple, fast and more accurate privacy-preserving classification algorithm that merely requires the solution of a single system of linear equations. In contrast, the ordinary 1-norm SVM needs to solve a linear program, which requires considerably longer computational time.

2. P3SVM-JLT: A new privacy-preserving proximal support vector machine based on the Johnson-Lindenstrauss (JL) transform, termed P3SVM-JLT, is presented for linear classification of vertically partitioned data. Each party generates a global proximal SVM classifier by sharing local random linear kernels based on the JL transform, and the shared kernels do not disclose any party's private data. This formulation yields an extremely simple, efficient and highly accurate privacy-preserving classification algorithm that merely requires solving a single system of linear equations. Furthermore, our theoretical results on the JL transform for vertically partitioned data support the promising performance of the proposed algorithm.

3. VP3SVM-JLT: Since P3SVM-JLT is subject to a same-dimension restriction, we propose a new privacy-preserving PSVM based on the JL transform in a vertically partitioned style. This method constructs a new global secure kernel that not only maintains the vertically partitioned style but also removes the same-dimension restriction, giving more flexibility.

4. P3S3VM: In many real applications, labeled training information is insufficient and the problem becomes one of semi-supervised classification. We therefore present a tri-training-based privacy-preserving PSVM for semi-supervised classification, termed P3S3VM. P3SVM, P3SVM-JLT and VP3SVM-JLT are each used as tri-training base classifiers to construct the classifier from both labeled and unlabeled samples. P3S3VM can effectively exploit potentially useful information in the unlabeled data and pass it on to the final classifier during semi-supervised learning, and it achieves better classification results.
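To make the "single system of linear equations" concrete, the following is a minimal sketch rather than the dissertation's algorithm. It uses the standard linear proximal SVM closed form, (I/nu + E^T E) z = E^T d with E = [A, -e], and a simple random-projection sharing step, which is an illustrative assumption. In that step each party multiplies its own feature block by a private Gaussian matrix and only the sum of these products is used for training. The function names, the parameter nu, the synthetic data and the way the local projections are combined are all hypothetical choices, and the sketch carries no privacy guarantee of its own.

```python
import numpy as np


def train_psvm(A, d, nu=1.0):
    """Linear proximal SVM: solve (I/nu + E^T E) z = E^T d with E = [A, -e].

    A is an (m, n) data matrix and d holds the +1/-1 labels.
    """
    m, n = A.shape
    E = np.hstack([A, -np.ones((m, 1))])
    z = np.linalg.solve(np.eye(n + 1) / nu + E.T @ E, E.T @ d)
    return z[:-1], z[-1]  # weight vector w and offset gamma


def predict_psvm(A, w, gamma):
    """Classify each row x of A by the sign of x.w - gamma."""
    return np.where(A @ w - gamma >= 0, 1, -1)


def local_projection(A_j, R_j):
    """What one party would share in this sketch (an assumption):
    its feature block times its own private random matrix."""
    return A_j @ R_j


def make_demo(seed=0, m=200, n=10, k=6):
    """Synthetic two-party example; all sizes are arbitrary."""
    rng = np.random.default_rng(seed)
    d = np.where(rng.random(m) < 0.5, 1.0, -1.0)
    A = rng.normal(size=(m, n)) + 2.0 * d[:, None]
    blocks = [A[:, :4], A[:, 4:]]  # vertically partitioned feature blocks
    # One private Gaussian matrix per party, scaled as in a JL-style projection.
    Rs = [rng.normal(size=(B.shape[1], k)) / np.sqrt(k) for B in blocks]
    # The sum of local projections equals A times the stacked random matrix.
    shared = sum(local_projection(B, R) for B, R in zip(blocks, Rs))
    w, gamma = train_psvm(shared, d, nu=1.0)
    acc = float(np.mean(predict_psvm(shared, w, gamma) == d))
    return shared, acc


if __name__ == "__main__":
    shared, acc = make_demo()
    print("shared matrix shape:", shared.shape)
    print("training accuracy:", acc)
```

Training here costs one call to `np.linalg.solve` on a (k+1) by (k+1) system, which is the property the abstract contrasts with the linear program needed by a 1-norm SVM. To classify a new point under the same assumptions, each party would project its own block of that point with the same private matrix before the projections are summed.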
Keywords/Search Tags:Proximal support vector machine, reduced support vector machine, privacy preservation, Johnson-Lindenstrauss transform, semi-supervised classification