Research On Key Technologies Of Hierarchical Privacy Preservation In Data Collection And Publication

Posted on: 2022-10-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H N Song
GTID: 1488306350488904
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of the Internet and artificial intelligence technology, the value of data has received unprecedented attention. As a result, large amounts of data are being collected, published, and used for scientific research and decision-making analysis. In recent years, however, frequent privacy leakage incidents have led more and more users to express concern about the privacy of their personal information, to refuse to provide real individual data, and to oppose the opening and sharing of data. It is therefore very important to ensure that sensitive personal information is not disclosed during data collection and publication. In addition, because of the inherent characteristics of private data, different private data have markedly different, personalized privacy requirements, yet existing one-size-fits-all privacy preservation schemes tend to protect individual private data either insufficiently or excessively. To address these problems, this dissertation applies the idea of hierarchical privacy protection to data collection and publication, realizing personalized privacy protection while improving data availability as far as possible. This can accelerate the opening and sharing of big data and in turn promote the healthy and rapid development of the data industry. The contributions of this dissertation are as follows.

(1) For anonymous data publication, personalized enhanced p-sensitive k-anonymity models based on sensitivity levels are proposed to resist skewness attacks and sensitivity attacks, and two greedy clustering algorithms are designed to improve the utility of the anonymized data. Firstly, the dissertation takes into account the personalized privacy requirements of both different sensitive values and different identical sensitive groups, and sets personalized frequency constraints and personalized diversity constraints to avoid "one size fits all" parameter settings. This greatly reduces unnecessary information loss and improves the utility of the anonymized data as far as possible, while realizing personalized privacy protection. Secondly, the sensitivity of each sensitive value is measured using self-information, and hierarchical clustering is introduced to partition the sensitivity levels adaptively, so that the resulting quantitative partition is practical and no longer varies from person to person. Then, based on the enhanced models, two greedy clustering algorithms are designed, one global and one local, which effectively reduce information loss during data anonymization. Finally, simulation results show that the proposed personalized enhanced anonymity models provide better privacy preservation at the cost of a small amount of data utility and time overhead, and in particular strengthen resistance to skewness attacks and sensitivity attacks.

(2) For local data collection, a personalized randomized response (PRR) mechanism is proposed according to the personalized privacy requirements of different private data. The PRR technique introduces a weight for each sensitive value and designs the personalized randomized perturbation mechanism according to that weight, ensuring that sensitive values with different privacy requirements each reach their expected degree of privacy protection, while realizing personalized privacy preservation and improving data utility. The objective and subjective privacy leakage degrees of the mechanism are then analyzed, and the estimation error of the private-data distribution is derived. Theoretical analysis and numerical results show that, compared with the traditional randomized response model and for the same subjective privacy leakage degree, the proposed PRR mechanism achieves higher statistical accuracy while effectively realizing personalized privacy preservation.

(3) For data collection and data publishing, ciphertext-policy attribute-based encryption (CP-ABE) is integrated with local differential privacy (LDP) based on randomized response (RR) to build a multi-level privacy-preserving data collection and sharing scheme that resists collusion, offers relatively high data utility and low complexity, and gives data owners two layers of privacy protection. First, the RR-based LDP technique addresses privacy leakage at the source, providing the first line of defense on the local side during data collection. Second, in the CP-ABE-based access control, the data owner designs a hierarchical access control policy to strengthen control over different private data during data sharing, providing the second line of defense. Within hierarchical privacy preservation and data sharing, an RR-based random perturbation strategy for resisting collusion attacks (RCA) is designed. It is proved theoretically that, under the RCA strategy, users with different trust levels who collude can obtain only the same information as the user with the highest trust level among them. Theoretical and simulation results show that, compared with the existing scheme, the proposed integrated scheme reduces complexity by at least 50% while effectively improving statistical accuracy.
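For readers unfamiliar with the traditional randomized response model that the abstract uses as a baseline, the following is a minimal Python sketch of binary randomized response under epsilon-LDP. It is not the PRR or RCA mechanism of the dissertation, whose weights and perturbation rules are not given in the abstract. The function names (`keep_probability`, `randomize`, `estimate_proportion`) and the parameter `epsilon` are assumptions introduced only for this illustration.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit so that binary RR satisfies epsilon-LDP."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def randomize(bit: int, p: float, rng: random.Random) -> int:
    """Report the true bit with probability p, otherwise report the flipped bit."""
    return bit if rng.random() < p else 1 - bit


def estimate_proportion(reports: list, p: float) -> float:
    """Unbiased estimate of the true fraction of 1s from the perturbed reports.

    The expected observed fraction is pi*p + (1 - pi)*(1 - p), where pi is the
    true fraction; solving for pi gives the expression below (requires p != 0.5).
    """
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

In use, each respondent would call `randomize` locally on a sensitive bit before sending it, and the collector would call `estimate_proportion` on the received reports. A smaller `epsilon` moves `keep_probability` toward 0.5, which strengthens privacy but increases the variance of the estimate. This is the privacy and accuracy trade-off that a personalized scheme tunes per sensitive value.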
Keywords/Search Tags: Hierarchical privacy preservation, Anonymous data publication, Randomized response, Local differential privacy, Access control based on CP-ABE