
Collaborative Classification Based On Statistical Learning And Its Application To Privacy-preserving

Posted on: 2012-08-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Zhang
Full Text: PDF
GTID: 1118330368489482
Subject: Light Industry Information Technology and Engineering
Abstract/Summary:
Pattern classification often needs to handle various kinds of patterns, which makes it difficult to build a single effective classifier. Every object has both local and global features, and the two are connected yet differ in accessibility, availability, and accuracy. How to collaboratively utilize both local and global information has recently become an important research topic.

On the other hand, classification tasks often face data that are distributed across different parties. Traditional classifiers assume that all parties' data can be freely accessed and centralized at a data center; nowadays, privacy concerns may prevent the parties from directly sharing those data. How to train and classify effectively without disclosing private information has become an active research topic.

Motivated by these two topics, this thesis studies collaborative local-and-global learning and its applications in privacy preservation, in the following three parts.

First, on local and global learning, three classification machines are proposed:

(a) A novel large margin classifier, the Collaborative Classification Machine with Local and Global Information (C2M), inspired by the covariance matrix, which states the data direction globally. This model divides the whole global data into two independent models, and the final decision boundary is obtained by collaboratively combining the two hyperplanes learned from them. C2M can be solved individually as a Quadratic Programming (QP) problem; for a training set with N samples, the total training time complexity is O(N^3), faster than the O(N^4) of the existing Maxi-Min Margin Machine (M4). We also provide a geometrical interpretation and show that C2M can robustly utilize the global information of data sets with overlapping class margins, where M4 loses it.
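The idea of collaboratively combining two individually learned hyperplanes into one decision boundary can be illustrated with a minimal sketch. This is a toy illustration, not the thesis's actual C2M QP formulation: the data, the perceptron-style training of each hyperplane, and the margin-averaging rule are all assumptions made here for demonstration.

```python
import numpy as np

def train_hyperplane(X, y, lr=0.1, epochs=100):
    """Learn one hyperplane (w, b) with a simple perceptron rule."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified: push the plane
                w += lr * yi * xi
                b += lr * yi
    return w, b

def combined_decision(x, planes):
    """Collaborative decision: average the normalized margins of all planes."""
    margins = [(w @ x + b) / (np.linalg.norm(w) + 1e-12) for w, b in planes]
    return np.sign(np.mean(margins))

# Toy two-class data: two well-separated Gaussian clouds.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.3, (20, 2)),
               rng.normal([-2, -2], 0.3, (20, 2))])
y = np.array([1] * 20 + [-1] * 20)

# Two "independent models": here simply two interleaved halves of the data,
# each producing its own hyperplane; the final boundary combines both.
planes = [train_hyperplane(X[::2], y[::2]),
          train_hyperplane(X[1::2], y[1::2])]

print(combined_decision(np.array([2.0, 2.0]), planes))    # expected +1
print(combined_decision(np.array([-2.0, -2.0]), planes))  # expected -1
```

Each half of the data plays the role of one of the two independent models; the averaging of normalized margins stands in for the collaborative combination step.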
We also exploit the kernel trick and extend C2M to a nonlinear version. Moreover, we show that C2M can be transformed into the standard Support Vector Machine (SVM) model and thus solved by the speed-up algorithms widely used for SVMs, and we propose four indices to numerically evaluate the global covariance matrix's contribution to a classifier.

(b) To handle classification tasks with plenty of normal examples and very few abnormal examples, a Covariance Preserving Classifier for Novelty Detection (CP-ND) is proposed. In this model, the covariance of the normal examples is used to preserve the statistical distribution of the normal data while the margin between the decision hyperplane and the abnormal points is maximized; its dual problem can likewise be solved as a QP problem. The three parameters ν, ν1, and ν2 introduced by this classifier can be used to tune the training misclassification rate and the fraction of support vectors.

(c) Inspired by the typical local-and-global learning machine M4 and by the idea of Locality Preserving Projections (LPP), we propose a novel large margin classifier, the Generalized Locality Preserving Maxi-Min Margin Machine (GLPM). Its within-class matrices are constructed from the labeled training points in a supervised way and then used to build the classifier; these matrices preserve the intra-class manifold of the training set, just as the covariance matrices indicate the global projection direction in the M4 model. The connections among GLPM, M4, and LDA are also analyzed theoretically.

Second, we focus on speeding up decision functions based on Support Vectors (SVs): fewer SVs mean greater sparseness and higher classification speed.
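The supervised within-class construction that GLPM builds on can be illustrated with the standard within-class scatter matrix from discriminant analysis. This is a generic sketch only; the thesis's exact GLPM matrices may differ, for instance by weighting neighbors with LPP-style affinities.

```python
import numpy as np

def within_class_scatter(X, y):
    """S_w = sum over classes c of sum over x in c of (x - mu_c)(x - mu_c)^T,
    built only from labeled training points (supervised)."""
    d = X.shape[1]
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        diff = X[y == c] - X[y == c].mean(axis=0)
        S_w += diff.T @ diff  # accumulate intra-class spread of class c
    return S_w

# Tiny labeled training set: two classes in 2-D.
X = np.array([[1.0, 2.0], [1.2, 1.8], [3.0, 0.5], [3.2, 0.7]])
y = np.array([0, 0, 1, 1])
print(within_class_scatter(X, y))  # captures intra-class spread only
```

Because each class is centered on its own mean, the matrix reflects only intra-class structure, which is what lets such matrices preserve the class manifolds rather than the global spread.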
Based on the sparsity of SVs, we prove that, when clustering the original SVs, the minimal upper bound of the error between the original decision function and the fast decision function is achieved by K-means clustering the original SVs in input space. A new algorithm, the Fast Decision Algorithm of Support Vector Machine (FD-SVM), is then proposed: K-means clusters a dense SV set into a sparse one, the cluster centers serve as the new SVs, and a Quadratic Programming model is built to obtain the optimal coefficients of the new sparse SVs by minimizing the classification gap between SVM and FD-SVM.

Finally, inspired by the fact that the mean value and covariance matrix state a data set's location and direction globally, and that sharing this global information with others does not disclose one's own privacy, we propose a novel two-party privacy-preserving classification solution, the Collaborative Classification Mechanism for Label-Distributed Privacy-Preserving data (LP2M). This model collaboratively trains the decision boundary from two hyperplanes, each constructed individually from one party's own private information and the counter-party's global information. We show that LP2M can be transformed into the existing Minimax Probability Machine (MPM), SVM, and M4 models when the private data satisfy certain conditions, and we propose secure training and test algorithms. Moreover, to handle the more common horizontally partitioned data, LP2M is extended to HP2M.
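The privacy argument above, that a party can reveal its class mean and covariance without exposing individual records, can be sketched as follows. This is a simplified illustration, not the thesis's LP2M protocol: the MPM-style hyperplane w ∝ (Σ₊+Σ₋)⁻¹(μ₊−μ₋), and the bias that bisects the class means, are used here only as stand-ins for the collaboratively trained boundary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Party A holds only the positive-class records, Party B only the negative.
X_a = rng.normal([2, 2], 0.5, (50, 2))    # private to A
X_b = rng.normal([-2, -2], 0.5, (50, 2))  # private to B

def global_stats(X):
    """The only information a party shares: mean and covariance."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

mu_a, cov_a = global_stats(X_a)   # shared by A
mu_b, cov_b = global_stats(X_b)   # shared by B

# Either party can now build an MPM-style hyperplane from the shared
# global statistics alone -- no raw records cross the party boundary.
w = np.linalg.solve(cov_a + cov_b, mu_a - mu_b)
b = -w @ (mu_a + mu_b) / 2.0      # bias bisecting the two class means

def classify(x):
    return 1 if w @ x + b > 0 else -1

print(classify(np.array([2.0, 2.0])))    # expected +1
print(classify(np.array([-2.0, -2.0])))  # expected -1
```

Note that only `mu_*` and `cov_*` are exchanged; the raw arrays `X_a` and `X_b` never leave their owners, which is the intuition behind treating mean and covariance as shareable global information.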
Keywords/Search Tags: Pattern classification, Global and local learning, Collaborative learning, Locality preserving, Fast classification, Privacy preserving