Research Of Classification Algorithms Under Differential Privacy Based On Mixed Noise Mechanism And Out-of-Bag Estimate

Posted on: 2021-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Chen
GTID: 2518306497466674
Subject: Computer Science and Technology
Abstract/Summary:
Classification under differential privacy has attracted considerable attention in the field of information security because it allows prediction while protecting data privacy, and it therefore has important applications in recommendation systems, transportation information protection, and other areas. However, the noise that differential privacy adds to protect privacy can seriously degrade the predictive accuracy of the classification algorithm. This thesis therefore studies decision tree and random forest algorithms, with the goal of improving classification accuracy while protecting data privacy.

First, to address the problem that a conventional decision tree under differential privacy generates excessive noise when the privacy budget is low, the analytical Gaussian mechanism is introduced and applied to the decision tree, and a threshold for the privacy budget is obtained experimentally. When the privacy budget is below this threshold, the analytical Gaussian mechanism adds a smaller amount of noise, which reduces the noise in the differentially private decision tree and improves its classification accuracy. On this basis, an improved decision tree under differential privacy with a mixed Gaussian mechanism is proposed, and comparative experiments verify the feasibility and effectiveness of the algorithm.

Second, to address the unsatisfactory accuracy of random forests under differential privacy when classifying high-dimensional data, the out-of-bag estimate under differential privacy is introduced into the random forest and used to compute decision tree weights and feature weights. The feature weights are used to exclude non-critical features at non-leaf nodes, which reduces noise and improves the accuracy of each decision tree. The decision tree weights are used to boost the ensemble when the trees are combined, which improves the classification accuracy of the whole random forest. On this basis, a random forest under differential privacy based on the out-of-bag estimate is proposed. Comparative experiments show that, compared with existing differentially private random forest algorithms, the proposed algorithm achieves better classification results and is also efficient.

Finally, the two improvements are combined: the improved decision tree under differential privacy with the mixed noise mechanism serves as the base classifier of the random forest, and the out-of-bag estimate under differential privacy is used to select features and to boost the ensemble when the trees are combined. This improves the random forest algorithm under differential privacy on high-dimensional data, and experiments verify that the resulting algorithm has better classification ability.
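
As a rough illustration of the mixed noise idea described above, the Python sketch below perturbs a single count query and switches between two noise sources according to the privacy budget. It is not the thesis's algorithm. The use of Laplace noise at or above the threshold, the function names, and the `threshold` parameter are all assumptions made for illustration; the thesis obtains its threshold experimentally and the abstract does not state its value. The Gaussian scale is calibrated with the exact privacy condition of the analytical Gaussian mechanism, taken here to be the one of Balle and Wang (2018), and solved by bisection. Note that the Gaussian branch provides (ε, δ)-differential privacy, whereas Laplace noise provides pure ε-differential privacy, so the two branches do not offer the same guarantee.

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(sigma, eps, sensitivity=1.0):
    """Smallest delta for which N(0, sigma^2) noise is (eps, delta)-DP."""
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return std_normal_cdf(a - b) - math.exp(eps) * std_normal_cdf(-a - b)


def analytic_gaussian_sigma(eps, delta, sensitivity=1.0, iters=100):
    """Bisect for (approximately) the smallest sigma that meets the delta target."""
    lo, hi = 1e-12 * sensitivity, sensitivity
    while gaussian_delta(hi, eps, sensitivity) > delta:
        hi *= 2.0  # grow the upper bound until it is feasible
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, eps, sensitivity) > delta:
            lo = mid  # too little noise: delta target violated
        else:
            hi = mid  # feasible: try a smaller sigma
    return hi  # hi always satisfies the delta target


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mixed_noisy_count(count, eps, delta, threshold, rng, sensitivity=1.0):
    """Hypothetical mixed mechanism: Gaussian below the threshold, else Laplace.

    `threshold` is an assumed, caller-supplied privacy budget cut-off.
    Returns the noisy count and the name of the branch that was used.
    """
    if eps < threshold:
        sigma = analytic_gaussian_sigma(eps, delta, sensitivity)
        return count + rng.gauss(0.0, sigma), "gaussian"
    return count + laplace_noise(sensitivity / eps, rng), "laplace"
```

A call such as `mixed_noisy_count(100, 0.05, 1e-5, 0.1, random.Random(0))` takes the Gaussian branch because 0.05 is below the assumed threshold of 0.1. In a decision tree, a count of this kind would correspond to, for example, a class count at a node; how the thesis allocates the budget across nodes is not described in the abstract and is not modelled here.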
Keywords/Search Tags: Differential Privacy, Decision Tree, Random Forest, Analytical Gaussian Mechanism, Out-of-Bag Estimate