
A Study Of Ensemble Classification Algorithms And Applications Based On Differential Privacy

Posted on: 2022-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Qiu
Full Text: PDF
GTID: 2518306500455984
Subject: Master of Engineering
Abstract/Summary:
The rapid development of big data and cloud computing has been a key factor in the success of machine learning in artificial intelligence and big data analytics. Machine learning is widely used in medical diagnostics, personalized recommendation, and other service areas, but the data sets involved usually contain private or sensitive information, such as medical diagnoses or e-commerce shopping records. Because a machine learning model can itself reveal individual information or sensitive attributes of its training data, for example through membership inference attacks, model extraction (theft), and model inversion attacks, privacy protection in machine learning has become one of the hot topics in information security. To address the disclosure of personal privacy when data models are published and analyzed, this thesis studies the model publishing mechanism and applications of the AdaBoost ensemble classification algorithm under differential privacy constraints. The differential privacy mechanism is introduced into the training of the ensemble learning model, protecting the individual information in the training data while preserving the utility of the ensemble model. The main work of this thesis is as follows:

(1) By studying ensemble learning models that improve the CART classification tree under differential privacy constraints, a differentially private AdaBoost ensemble classification algorithm, CART-DPs AdaBoost, is proposed. The algorithm introduces the Laplace noise mechanism into ensemble learning so that the model training process satisfies differential privacy. To protect data privacy while preserving the validity of the model, two randomization schemes, sample perturbation and feature perturbation, are introduced. The exponential mechanism and the Gini index are used to process the perturbed data features and construct CART boosting trees, and a decision-tree-based AdaBoost ensemble model is built under differential privacy protection. The influence of tree depth on the private model is further studied, and the effect of different privacy protection levels on the classification performance of the ensemble model is analyzed. Experimental results show that the proposed scheme achieves good classification accuracy while balancing the privacy and utility of the model.

(2) By studying the application of the ensemble classification model to personalized recommendation under differential privacy protection, an ensemble classification recommendation model satisfying differential privacy is proposed for classification-based recommendation on tag data sets with high-dimensional features, so that users' individual information is not disclosed by the recommendation model. To reflect personalized differences among users, a feature optimization strategy is introduced into ensemble learning to extract effective features and construct feature groups that characterize user profiles. The number of features is optimized with the Gini index, and the ReliefF algorithm is then used to extract the optimal features and construct differentiated feature groups. Finally, a CART boosting tree with Laplace perturbation is constructed in each group, and the label category of user preference is predicted by AdaBoost ensemble learning. Experimental results verify the effectiveness of the proposed differentially private recommendation model.
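
The abstract names two differential privacy building blocks without showing them: Laplace noise, and a Gini-scored selection step. As a rough illustration only, the sketch below shows how a Laplace-perturbed count and an exponential-mechanism choice of split feature might look. The exponential mechanism is the usual differential privacy primitive for choosing among scored candidates. This is not the thesis's CART-DPs AdaBoost algorithm: the function names, the categorical-feature split, the size-weighted Gini utility, and the sensitivity value passed in the usage lines are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_count(true_count, epsilon, rng):
    # A counting query changes by at most 1 when one record is added or
    # removed, so Laplace noise with scale 1/epsilon is used.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def gini_impurity(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_utility(rows, labels, feature):
    # Utility of splitting on a categorical feature: the negative
    # size-weighted Gini impurity of the children (higher is better).
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    return -sum(len(g) * gini_impurity(g) for g in groups.values())


def exponential_mechanism(candidates, utility, epsilon, sensitivity, rng):
    # Sample a candidate with probability proportional to
    # exp(epsilon * utility / (2 * sensitivity)). Subtracting the best
    # score only rescales the weights and avoids overflow.
    scores = [utility(c) for c in candidates]
    best = max(scores)
    weights = [math.exp(epsilon * (s - best) / (2.0 * sensitivity))
               for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data (hypothetical): "f0" separates the classes, "f1" does not.
    rows = [{"f0": 0, "f1": 0}, {"f0": 0, "f1": 1},
            {"f0": 1, "f1": 0}, {"f0": 1, "f1": 1}]
    labels = ["a", "a", "b", "b"]
    chosen = exponential_mechanism(
        ["f0", "f1"],
        lambda f: split_utility(rows, labels, f),
        epsilon=1.0,
        sensitivity=2.0,  # assumed bound; derive it for the utility in use
        rng=rng,
    )
    print("selected split feature:", chosen)
    print("noisy class count:", noisy_count(2, epsilon=1.0, rng=rng))
```

In a boosted ensemble, the total privacy budget would also have to be divided among the trees and among the levels of each tree. How the thesis allocates that budget is not stated in the abstract, so it is left out of this sketch.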
Keywords/Search Tags:Differential Privacy, Machine Learning, AdaBoost, CART Classification Tree, Personalized Recommendation