Research On Privacy-preserving Classification Technology Based On Differential Privacy

Posted on:2023-12-15

Degree:Master

Type:Thesis

Country:China

Candidate:L W Guo

GTID:2556307061453784

Subject:Computer Science and Technology

Abstract/Summary:

As data sharing has become more convenient, data leakage incidents have occurred frequently in recent years, and society is paying increasing attention to data security. China enacted the Personal Information Protection Law and the Data Security Law in 2021. Data security problems are prominent in big data mining applications, and privacy protection in data mining has become a research hotspot. Classification is an important function in data mining: it predicts events and states that have not yet happened from historical data and thereby supports enterprise decision-making, and it is widely applied in big data scenarios. To overcome the insufficient classification accuracy of existing privacy-preserving decision tree construction methods and logistic regression mining methods, this thesis improves privacy-preserving classification technology based on differential privacy so that data security is protected while classification accuracy is maintained. The main work of the thesis includes:

(1) Existing methods add differential privacy noise to the query count values, which reduces the usability of the count values and the classification accuracy of the constructed decision trees. To solve this problem, DP-DTC, a privacy-preserving decision tree construction method based on differential privacy, is proposed. A classification gain matrix is designed to store the count values required to calculate the information gain, and a perturbation method for this matrix based on differential privacy is proposed to protect individual data privacy. A reconstruction method for the perturbed classification gain matrix is designed according to a consistency constraint to maintain the distribution of category labels. A reuse scheme for count values is designed to avoid redundant queries, which increases the privacy budget available to a single query. The adaptive privacy budget dividing method ADP-BD is designed to solve the problem that count values at deeper levels have lower signal-to-noise ratios. Together, these measures improve the classification accuracy of the constructed decision tree.

(2) To address the problem that privacy-preserving training data are not accurate enough to support logistic regression modeling in the data sharing scenario, the data generation model LRDG, based on a Generative Adversarial Network, is constructed. Its generator network weights are constrained by the average distance between data groups to maintain the classification accuracy of the logistic regression model trained on the generated data. A perturbation method for the LRDG generator based on differential privacy is designed to protect individual data privacy. Finally, DP-LRDR, a differential privacy data releasing method oriented to logistic regression, is proposed based on LRDG to achieve differential-privacy-preserving logistic regression modeling.

Theoretical analysis and experimental results show that the proposed methods can maintain classification accuracy while protecting individual data privacy.
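The general idea behind item (1), perturbing the count values that feed the information gain and dividing the privacy budget across tree levels, can be illustrated with a minimal sketch. This is not the thesis's DP-DTC or ADP-BD; the abstract does not give their details. The function names, the Laplace mechanism with sensitivity 1 (each record is assumed to fall into exactly one cell), the clamping step, and the geometric budget split are all assumptions chosen for illustration.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_counts(counts, epsilon, rng):
    """Add Laplace noise to counts[v][k] (attribute value v, class label k).

    Assumption: adding or removing one record changes one cell by 1,
    so the L1 sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    scale = 1.0 / epsilon
    return [[c + laplace_noise(scale, rng) for c in row] for row in counts]


def clamp_counts(counts):
    """Post-process noisy counts by clamping negatives to zero.

    A simple stand-in for a consistency step, not the thesis's
    reconstruction method.
    """
    return [[max(0.0, c) for c in row] for row in counts]


def entropy(class_counts):
    """Shannon entropy (base 2) of a vector of non-negative counts."""
    total = sum(class_counts)
    if total <= 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)


def information_gain(counts):
    """Information gain of splitting on the attribute described by counts."""
    class_totals = [sum(col) for col in zip(*counts)]
    total = sum(class_totals)
    if total <= 0:
        return 0.0
    conditional = sum(sum(row) / total * entropy(row) for row in counts)
    return entropy(class_totals) - conditional


def split_budget(total_epsilon, depth, ratio):
    """Hypothetical geometric split of the budget over tree levels.

    With ratio > 1, deeper levels (which hold smaller counts) receive a
    larger share. The shares sum to total_epsilon.
    """
    weights = [ratio ** level for level in range(depth)]
    weight_sum = sum(weights)
    return [total_epsilon * w / weight_sum for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)
    exact = [[40, 10], [5, 45]]  # illustrative counts, not data from the thesis
    budgets = split_budget(total_epsilon=1.0, depth=3, ratio=2.0)
    noisy = clamp_counts(perturb_counts(exact, budgets[0], rng))
    print("exact gain:", information_gain(exact))
    print("noisy gain:", information_gain(noisy))
```

The smaller the per-level budget, the larger the Laplace noise relative to the counts, which is why count values at deeper levels, where each node holds fewer records, suffer lower signal-to-noise ratios under a uniform split.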

Keywords/Search Tags:

Privacy-preserving, Differential privacy, Classification, Decision Tree, Logistic Regression, Generative Adversarial Network

Related items

1	Research And Implementation Of Privacy Preserving Algorithm For Government Big Data Based On Network Representation Learning
2	Research And Implementation Of Entity Recognition And Privacy Preserving Technology For Government Affairs Text Information
3	A Study On The Influencing Factors Of Marriage And Love Intention Of Generation Z Young People In Shijiazhuang City
4	Research On Credit Card Fraud Unbalanced Classification Based On Generative Adversarial Nets
5	Research On Privacy-preserving Hybrid Data Ambiguity
6	A cross-country examination of online privacy issues: From an adversarial paradigm toward a situational paradigm. A comparison of regulations, net users' concerns and practices, and Web sites' privacy statements in China, the Netherlands, Taiwan, and th
7	A Privacy-preserving Online Double Auction For Spectrum Allocation
8	Research On Restoration Technology Of Simulated Sketch Portraits Based On Deep Generative Adversarial Networks
9	Research And Implementation Of Privacy Preserving Methods For Government Data Sharing
10	Research On The Expected Old-age Care Mode And Its Influencing Factors Of The Elderly In China