Data Collection And Statistics With Local Differential Privacy Protection

Posted on:2021-03-25

Degree:Master

Type:Thesis

Country:China

Candidate:L Shu

Full Text:PDF

GTID:2428330605456879

Subject:Computer Science and Technology

Abstract/Summary:

To address the leakage of users' private information, researchers and institutions have proposed a variety of privacy protection techniques. Traditional anonymization-based techniques can no longer meet users' needs for protecting their own privacy. Differential privacy is currently recognized as the most rigorous and effective privacy protection technique. To guard against third parties obtaining personal information, local differential privacy protects user data on the user side, and it has become one of the most active research areas. This thesis describes in detail privacy protection systems, anonymization-based privacy protection, centralized differential privacy, local differential privacy, and the related theoretical concepts. It then studies the process and principles of the RAPPOR algorithm, which is based on local differential privacy, identifies its shortcomings, and proposes an improved algorithm. In RAPPOR, the client encodes its data with a Bloom filter and perturbs the encoded bits with randomized response; the data collector then decodes, corrects, and counts frequencies. Because the frequencies of the attribute values differ widely, the frequency estimates that RAPPOR produces for individual attribute values have large errors, and some low-frequency attribute values are lost entirely. In addition, the communication cost of transmitting the encoded user data is high, and the regression step in the algorithm introduces further error. To address these shortcomings, this thesis uses the K-means clustering algorithm to group the attribute values and then collects statistics separately for each group. A lossless data compression method is applied to the encoded 0/1 bit vectors to reduce the communication cost. The algorithm's workflow is also adjusted so that no regression step is required in the later stage. Finally, experiments are carried out on the Adult data set from the UCI repository. Data utility is compared using KL divergence and cosine similarity under different numbers of groups and different privacy budgets, and the reduction in communication cost is reported as a compression rate. The experimental results show that the improved KC-RAPPOR algorithm achieves higher statistical utility and lower communication cost. Figure [20], table [7], reference [53]...
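The RAPPOR pipeline summarized in the abstract (Bloom filter encoding on the client, randomized response perturbation, frequency correction at the collector) and the two utility metrics can be illustrated with a minimal Python sketch. It is not the thesis's KC-RAPPOR implementation: it models only a single round of randomized response, omits the K-means grouping, the lossless compression, and the decoding of bit counts back to attribute values, and every function name and parameter (`num_bits`, `num_hashes`, `f`) is an assumption chosen for illustration.

```python
import hashlib
import math
import random


def bloom_encode(value: str, num_bits: int = 32, num_hashes: int = 2) -> list[int]:
    """Client side: map a value to a Bloom filter bit vector (hypothetical parameters)."""
    bits = [0] * num_bits
    for i in range(num_hashes):
        # Derive the i-th hash by salting SHA-256 with the hash index.
        digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
        bits[int.from_bytes(digest[:8], "big") % num_bits] = 1
    return bits


def randomized_response(bits: list[int], f: float, rng: random.Random) -> list[int]:
    """Client side: each bit becomes 1 w.p. f/2, 0 w.p. f/2, and is kept w.p. 1 - f."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    noisy = []
    for b in bits:
        u = rng.random()
        if u < f / 2:
            noisy.append(1)
        elif u < f:
            noisy.append(0)
        else:
            noisy.append(b)
    return noisy


def estimate_bit_counts(reports: list[list[int]], f: float) -> list[float]:
    """Collector side: unbiased estimate of how many users truly set each bit.

    Since E[observed_j] = (1 - f) * true_j + (f / 2) * n, we invert that relation.
    """
    if not reports:
        raise ValueError("at least one report is required")
    if f >= 1.0:
        raise ValueError("f must be below 1 for the estimate to exist")
    n = len(reports)
    observed = [sum(r[j] for r in reports) for j in range(len(reports[0]))]
    return [(c - 0.5 * f * n) / (1.0 - f) for c in observed]


def kl_divergence(p: list[float], q: list[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions; q is floored at eps to avoid log(0)."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two frequency vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

In use, each client would call `bloom_encode` and then `randomized_response` before sending its report, and the collector would pass all reports to `estimate_bit_counts`. A larger `f` gives stronger privacy but noisier estimates, which is the trade-off the abstract's privacy-budget comparison explores; the estimated and true frequency distributions can then be compared with `kl_divergence` and `cosine_similarity`.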

Keywords/Search Tags:

Privacy protection, local differential privacy, lossless data compression, K-means algorithm

Related items

1	Research On Privacy Protection Method Of Sensor-cloud Based On Local Differential Privacy
2	Research On Data Privacy Protection Method Based On Differential Privacy Mechanism
3	Research On Trajectory Data Protection Method Based On Differential Privacy
4	Research On Differential Privacy Protection Based On Classified Data
5	Research And Application Of Data Clustering Privacy Protection Based On Local Differential Privacy
6	Research On Clustering Optimization Method Supporting Differential Privacy Protection
7	Research On Differential Privacy Protection Based On Clustering
8	Data Privacy Protection Based On Local Differential Privacy
9	Research On Trajectory Data Privacy Protection Approach For Location Based Services
10	Research On K-means++ Clustering Algorithm Based On Laplace Mechanism For Differential Privacy Protection