Research On Classification And Recommendation Algorithms Based On Differential Privacy

Posted on: 2022-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: D Z Sun
Full Text: PDF
GTID: 2518306731997809
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, information collection tools digitize everyone, and many human behaviors and characteristics can be represented as data in a vast information network. Data-driven applications give people a better service experience, but they also carry serious privacy risks. Privacy leaks have heightened public concern, and privacy security has become a hidden danger in social development. To create a healthy environment, the protection of personal privacy should be continuously strengthened. Privacy-preserving computation can balance privacy against benefit and resolve the tension between mining the value of data and protecting privacy. This thesis analyzes the commonly used privacy-preserving computation methods and, on that basis, introduces differential privacy. Differential privacy is an advanced theoretical framework in the field of privacy protection: it provides a provable and quantifiable protection mechanism, and it has therefore become a focus of research on privacy protection technology.

Addressing the privacy protection problems in classification and recommendation tasks, this thesis studies how to apply differential privacy to the corresponding algorithms in terms of privacy protection points, implementation mechanisms, and privacy budget allocation strategies. To balance classification accuracy against data privacy, the thesis applies differential privacy to the random forest algorithm. For recommendation, the thesis first proposes an algorithm based on explicit and implicit feedback and then applies differential privacy to it, with the goal of making effective recommendations while guaranteeing data privacy. The main contributions are summarized as follows:

(1) For the privacy security issues in classification algorithms, this thesis first analyzes the privacy protection points of decision trees and then proposes a random forest algorithm based on differential privacy. To achieve a dynamic balance between privacy budget and signal-to-noise ratio, the algorithm takes the relevance of the data set and the size of the data volume into account when allocating privacy budgets. When building a privacy-preserving decision tree, the algorithm uses the information gain of a branch node as the quality scoring function and selects splits with the exponential mechanism; leaf nodes are perturbed with the Laplace mechanism. To improve accuracy and stability, the algorithm adopts the Bagging strategy to construct multiple privacy-preserving decision trees and integrate them. Finally, a privacy analysis of the algorithm is given, and experiments show that the algorithm maintains classification accuracy while satisfying differential privacy.

(2) For collaborative filtering recommendation, this thesis proposes a new algorithm based on the fusion of explicit and implicit data. The algorithm has two steps. The first step handles implicit data: an implicit feedback training set is constructed through data conversion and negative sampling, and a model trained on it yields implicit feature vectors for users and items. The second step fuses explicit and implicit feedback: the explicit feedback training data is created first, and the implicit feature vectors from the first step are then integrated into the explicit-data model, whose parameters are obtained through training. Experiments show that the algorithm effectively improves recommendation accuracy.

(3) For the privacy security problem in recommendation algorithms, this thesis combines differential privacy with the algorithm proposed in work (2), making the algorithm satisfy differential privacy by adding mean perturbation and gradient perturbation during the solution process. Experimental results show that the algorithm achieves differential privacy protection at the cost of only a small loss in recommendation accuracy.
Keywords/Search Tags: Privacy security, Privacy protection calculation, Differential Privacy, Random Forest, Explicit Feedback, Implicit Feedback, Collaborative Filtering, Matrix Factorization
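
Illustrative sketch (not taken from the thesis): work (1) in the abstract scores candidate splits by information gain and selects one with the exponential mechanism, then perturbs leaf nodes with the Laplace mechanism. The Python below is a minimal sketch of those two steps only. Every function name, the toy data, the sensitivity value of 1.0 for the scoring function, and the even use of epsilon at each step are assumptions made for illustration; the abstract does not state the sensitivity bound or the budget allocation rule the thesis actually uses.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Information gain of splitting on the categorical feature at index `feature`."""
    n = len(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng):
    """Sample a candidate with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    top = max(scores)  # shift by the max score for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_leaf_label(labels, classes, epsilon, rng):
    """Laplace mechanism on class counts (assumed sensitivity 1),
    followed by a majority vote over the noisy counts."""
    counts = Counter(labels)
    noisy = {c: counts.get(c, 0) + laplace_noise(1.0 / epsilon, rng) for c in classes}
    return max(noisy, key=noisy.get)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
rng = random.Random(0)
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "hot"), ("rain", "mild")]
labels = ["no", "no", "yes", "yes"]
features = [0, 1]

scores = [information_gain(rows, labels, f) for f in features]
split = exponential_mechanism(features, scores, epsilon=1.0, sensitivity=1.0, rng=rng)
leaf = noisy_leaf_label(labels, ["yes", "no"], epsilon=1.0, rng=rng)
print("scores:", scores, "chosen split:", split, "noisy leaf label:", leaf)
```

With a small epsilon the chosen split and the leaf label are close to random; as epsilon grows, the exponential mechanism concentrates on the highest-gain feature and the noisy counts approach the true counts. That is the accuracy-versus-privacy trade-off the abstract refers to.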