
Research On The Differential Privacy Classification Algorithms

Posted on: 2018-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Shen
GTID: 2348330536487945
Subject: Software engineering
Abstract/Summary:
As information is exchanged and shared more widely, people pay increasing attention to privacy and security. Data mining algorithms usually focus only on extracting useful information, while the privacy protection of that information is often ignored. Combining differential privacy protection with data mining is therefore of considerable significance. This thesis mainly studies differentially private classification algorithms and, for incomplete data sets in particular, proposes methods for dealing with missing values.

Firstly, the problem of constructing a decision tree classifier with differential privacy protection is studied. Considering the weaknesses of the existing algorithms, a new differentially private random decision tree algorithm based on the exponential mechanism is proposed. For incomplete data sets, a method called WP (Weight Partition), which uses dynamic weight updates, is proposed so that the ID3 algorithm and the random forest decision tree classification algorithm can deal with missing values. The experimental results show that the proposed method improves the accuracy and practicability of the differential privacy classification algorithm while providing the same level of differential privacy protection.

Then, the problem of applying differential privacy to the Adaboost classifier algorithm on complete data sets is studied. During the construction of the weak classifiers, differential privacy noise is added to the algorithm with the privacy budget distributed in advance, and a DP-Adaboost algorithm is implemented. The experimental results show that, on complete data sets, the differential privacy Adaboost algorithm achieves better classification results than the differential privacy ID3 decision tree algorithm and the differential privacy random decision tree algorithm.

Finally, an upgrade of the DP-Adaboost algorithm is studied so that it can be adapted to incomplete data sets. For incomplete data sets, the privacy weight for each weak classifier is increased, and the differential privacy sensitivity is changed dynamically while the differential privacy noise is added. A differentially private Adaboost algorithm that deals with missing values is then implemented. The experimental results show that the upgraded DP-Adaboost algorithm has higher classification accuracy and is applicable to more data sets.
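For readers unfamiliar with the exponential mechanism mentioned in the abstract, the following is a minimal generic sketch in Python. It is not the thesis's algorithm: the abstract does not state the utility function, the sensitivity, or how the mechanism is used inside the random decision tree, so the function name `exponential_mechanism`, its parameters, and the toy scores below are all assumptions made for illustration. The sketch only shows the standard rule of sampling a candidate with probability proportional to exp(epsilon * utility / (2 * sensitivity)).

```python
import math
import random


def exponential_mechanism(candidates, utility, epsilon, sensitivity=1.0, rng=None):
    """Pick one candidate with probability proportional to
    exp(epsilon * utility(c) / (2 * sensitivity)).

    All names here are hypothetical; `utility` stands in for whatever
    score (for example a split-quality measure) an algorithm would use.
    """
    rng = rng or random.Random()
    scores = [utility(c) for c in candidates]
    # Subtracting the maximum score leaves the sampling distribution
    # unchanged but avoids overflow in exp().
    top = max(scores)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Toy usage: three hypothetical split attributes with made-up utility scores.
toy_scores = {"age": 0.2, "income": 0.9, "zip": 0.4}
chosen = exponential_mechanism(list(toy_scores), toy_scores.get, epsilon=1.0)
```

A smaller `epsilon` makes the choice closer to uniform (stronger privacy, less accuracy), while a larger `epsilon` concentrates the choice on the highest-scoring candidate.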
Keywords/Search Tags: Differential privacy, ID3 decision tree algorithm, Random decision tree algorithm, Adaboost classifier algorithm, Incomplete data sets