
Research On Privacy Preserving Methods In Data Sharing And Publishing

Posted on: 2020-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: S C Huang
Full Text: PDF
GTID: 2428330590996454
Subject: Information security
Abstract/Summary:
The 21st century is an era of highly developed information technology. Data not only pervades ordinary people's daily lives but is also widely used in various industries, providing a continuous source of power for the efficient operation and development of industry and society. Data sharing makes it possible for data to flow and improves how well data is utilized. However, publishing data directly during data sharing risks revealing users' privacy. Implementing data sharing efficiently and safely has therefore long been a topic of great concern, and it remains a difficult task. Over the past ten years, researchers have done a great deal of work on privacy protection in data sharing and data publishing. Data publishing has traditionally relied on anonymization-based privacy protection techniques, but current solutions still carry a risk of privacy leakage. To address this problem, this thesis proposes a coding scheme based on differential privacy that meets the requirements of data publishing applications. Theoretical analysis and experiments show that the proposed scheme further improves the level of protection of users' privacy.
The thesis covers three aspects: 1) the introduction and improvement of the bit vector coding scheme, 2) record linkage grouping based on bit vectors, and 3) histogram publishing based on bit vectors.
For the first aspect, the thesis introduces the Bit Vectors coding scheme (BV) in detail and then proposes an Improved Bit Vectors coding scheme (IBV) that is closely related to BV. Moreover, to address the privacy problem caused by random number leakage in the IBV scheme, a bit vector coding scheme based on differential privacy is proposed. Experiments on the validity of distance estimation show that the worst-case error of the proposed IBV scheme is lower than that of BV.
For the second aspect, the thesis studies the performance of the BV and IBV schemes on the record linkage problem. To speed up record linkage in practical applications, a data grouping scheme based on the idea of a binary tree is proposed, which effectively improves the efficiency of record linkage without degrading its quality. In the record linkage experiments, the improved bit vector technique achieves higher precision, recall, and F-score under the same correction factor. Grouping experiments verify the advantages of the grouping scheme in terms of efficiency and accuracy.
For the third aspect, the thesis applies the BV scheme to histogram publishing and mean publishing so that data can be published while protecting users' privacy. In addition, an algorithm for data publishing in the anonymous space is proposed on the basis of the differential-privacy-based coding scheme. The experiments lead to the following conclusions: in histogram publishing with the BV and IBV coding schemes, the IBV scheme has the smaller coding error across different data volumes; the larger the data volume, the more accurate the histogram estimate produced by the differential-privacy-based bit vector coding scheme; and the larger the privacy parameter epsilon, the smaller the estimation error of both the mean estimate and the histogram estimate.
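The relationship between the privacy parameter epsilon and the estimation error can be illustrated with a minimal sketch. The abstract does not specify the thesis's bit vector coding scheme, so the sketch below does not implement it; it uses the standard Laplace mechanism for histogram counts as a stand-in, assuming that neighbouring datasets differ by adding or removing one record (sensitivity 1). All function names, bin edges, and parameters here are hypothetical and chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram(values, bin_edges, epsilon, rng):
    """Return (true_counts, noisy_counts) for half-open bins [edge_i, edge_i+1).

    Stand-in technique: the standard Laplace mechanism, NOT the thesis's
    bit vector coding scheme.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    # Assumption: adding or removing one record changes one count by 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    scale = 1.0 / epsilon
    noisy = [c + laplace_noise(scale, rng) for c in counts]
    return counts, noisy


def average_error(values, bin_edges, epsilon, trials, rng):
    """Mean absolute per-bin error of the noisy histogram over several trials."""
    total = 0.0
    for _ in range(trials):
        true_counts, noisy = private_histogram(values, bin_edges, epsilon, rng)
        total += sum(abs(t - n) for t, n in zip(true_counts, noisy)) / len(true_counts)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.uniform(0, 100) for _ in range(1000)]  # synthetic data
    edges = [0, 20, 40, 60, 80, 101]
    for eps in (0.1, 1.0, 10.0):
        print(eps, average_error(data, edges, eps, 200, rng))
```

Under these assumptions the expected absolute error per bin equals the noise scale 1/epsilon, so the printed error shrinks as epsilon grows. That matches the direction of the trend reported in the abstract, though the thesis's own scheme and error measure may differ.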
Keywords/Search Tags: privacy protection, differential privacy, histogram publishing, record linkage