
Research on Classification Methods Providing Differential Privacy Protection

Posted on: 2018-08-13
Degree: Master
Type: Thesis
Country: China
Candidate: F J Sun
Full Text: PDF
GTID: 2348330542490838
Subject: Software engineering
Abstract/Summary:
In the era of big data, extracting valuable information from data is a central concern of data researchers. Classification is an important field of data mining, and the support vector machine (SVM) is one of its most widely used methods. The SVM classification method has a solid theoretical basis, offers a novel approach to small-sample learning, and is one of the most efficient classification methods, but the traditional SVM still has limitations. First, the classification is too coarse: the target to be measured is generally assigned to one of two classes according to its attributes, with no quantitative analysis of the target, so the method cannot handle data that needs to be subdivided further. Second, it does not take into account the risk of leaking the samples in the training set: when the training set contains sensitive attributes, the privacy of the training set cannot be protected. This leads to two consequences. First, the classification model cannot quantitatively compare different targets to be measured. Second, leakage of the training set not only causes economic and property losses but also hinders the development of research work.

This thesis studies the SVM classification method in view of the above problems. The main research contents are as follows. To address the privacy of the training samples, the SVM classification method is combined with the differential privacy model. Two classification methods are proposed for different situations: the MDP-SVM classification method and the ODP-SVM classification method. The MDP-SVM classification method protects the privacy of the training set by adding a perturbation to the classification model; it is more efficient when the number of samples is low. The ODP-SVM classification method also protects the privacy of the training set by adding a perturbation; it offers higher availability when the number of samples is higher. In addition, both classification methods define a scoring function based on the classification model. With this scoring function, one can determine the category and the classification confidence of the target to be measured, compare the differences between targets in different categories, and subdivide the data.

The privacy and usability of the MDP-SVM and ODP-SVM classification methods are proved theoretically: both methods maintain high availability while solving the problem of privacy leakage of the training samples. The two proposed methods are compared with the existing SVM classification method on real data sets, which verifies their validity and allows the advantages of the two approaches to be compared.
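The abstract does not give the details of MDP-SVM, ODP-SVM, or the scoring function, so the following is only a minimal sketch of the general idea of perturbing a trained classification model and then scoring a target with it. Everything in it is an assumption made for illustration: the linear SVM trained by Pegasos-style subgradient descent, the Laplace mechanism as the perturbation, the caller-supplied `l1_sensitivity` value (which has to be justified separately for a real privacy guarantee), and the use of the signed decision value as the score. It should not be read as the thesis's own method.

```python
import random


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, lam=0.1, epochs=200, seed=0):
    """Pegasos-style subgradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. No bias term is used (an assumption of this sketch).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y[i] * dot(w, X[i])   # computed before the update
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularizer shrinkage
            if margin < 1.0:               # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_weights(w, l1_sensitivity, epsilon, rng):
    """Laplace mechanism: add Laplace(l1_sensitivity / epsilon) noise per coordinate.

    `l1_sensitivity` is a hypothetical, caller-supplied bound on how much the
    weights can change when one training sample changes.
    """
    scale = l1_sensitivity / epsilon
    return [wj + laplace_noise(scale, rng) for wj in w]


def score(w, x):
    """Return (label, confidence): the sign and magnitude of the decision value."""
    s = dot(w, x)
    return (1 if s >= 0 else -1), abs(s)


# Tiny illustrative data set (made up for this sketch).
X = [[1.0, 0.2], [0.8, 0.1], [-1.0, -0.3], [-0.7, 0.1]]
y = [1, 1, -1, -1]

w = train_linear_svm(X, y)
w_private = perturb_weights(w, l1_sensitivity=0.5, epsilon=1.0, rng=random.Random(42))
label, confidence = score(w_private, [0.9, 0.0])
```

A smaller `epsilon` means a larger noise scale, so the released weights protect the training set more strongly but classify less accurately. The confidence value allows two targets with the same label to be compared, which is the kind of finer-grained comparison the abstract attributes to its scoring function.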
Keywords/Search Tags: differential privacy, data mining, classification, support vector machine