Research On Personal Information De-identification Algorithm For Data Sharing Application

Posted on: 2021-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Liu
GTID: 2428330611450316
Subject: Computer technology
Abstract/Summary:
With the rapid development of mobile devices and the big data industry, both the variety and the scale of data are growing quickly. As data is opened and shared, the problem of protecting personal information becomes increasingly prominent. Personal information de-identification is one of the main methods for protecting personal information before data release. It cuts off the correlation between data attributes and data subjects so that attackers cannot re-identify personal information subjects from published data, which protects personal information from disclosure during data release and sharing. Research on personal information de-identification is therefore important for sharing data effectively and for extracting the value of big data. This thesis makes three contributions.

(1) K-anonymity model analysis based on information theory. First, an attack channel and a defense channel under the k-anonymity model are constructed using information theory, from the perspective of the attacker and the defender respectively. Second, based on the General Theory of Security, an ability limit function is constructed for the attacker and for the defender. Finally, the ability limit functions of both parties are analyzed experimentally; from this analysis a secure interval for k is obtained, and a method is given for quantifying the abilities of attacker and defender under the anonymity model.

(2) Personal information de-identification algorithm. First, the attributes of the data set are divided into identification attributes, semi-identification attributes (quasi-identifiers) and sensitive attributes. Second, identification attributes are processed with pseudonymization and suppression techniques to ensure their security; for semi-identification attributes and sensitive attributes, a rotation technique based on random numbers is used, which maintains the relationships between related attributes and improves the availability of the information while protecting personal information. Finally, the algorithm is applied to a data set for experimental analysis. The proposed algorithm preserves the longitudinal analysis results of any single attribute of the de-identified data, cuts off the connection between semi-identification attributes and sensitive attributes, and maintains the correlation of related attributes in the application scenario.

(3) Personal information de-identification effect detection algorithm. First, a privacy protection degree function and an information loss degree function are defined based on information theory. Second, a detection function for the de-identification effect is proposed that takes into account both the data privacy security measure and the data loss measure. By adjusting the weights of the privacy protection degree and the information loss degree, the function can be applied flexibly to different situations to measure the effect of a de-identification technique. Finally, the correctness and usefulness of the functions are illustrated by experimental analysis of generalization and suppression techniques.
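
The de-identification algorithm in item (2) can be illustrated with a minimal Python sketch. The abstract does not specify how the rotation works, so this sketch assumes it is a cyclic shift of column values by a randomly drawn offset, with related attributes shifted together as one group. The function names (`pseudonymize`, `rotate`, `deidentify`), the `linked_groups` parameter, the salted SHA-256 pseudonym and the sample records are all hypothetical choices made for illustration, not the thesis's actual method.

```python
import hashlib
import random


def pseudonymize(value, salt):
    """Replace an identifier with a truncated salted SHA-256 digest (assumed scheme)."""
    return hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()[:12]


def rotate(values, offset):
    """Cyclically shift a list by `offset` positions."""
    offset %= len(values)
    return values[-offset:] + values[:-offset]


def deidentify(records, pseudonym_cols, suppress_cols, linked_groups, salt, seed=None):
    """Return a de-identified copy of `records` (a list of dicts).

    pseudonym_cols: identification attributes replaced by pseudonyms.
    suppress_cols:  identification attributes replaced by "*".
    linked_groups:  lists of columns; columns in one group share an offset,
                    so their mutual relationship survives the rotation.
    """
    rng = random.Random(seed)
    n = len(records)
    out = [dict(r) for r in records]  # shallow copies; the input is not modified

    for row in out:
        for col in pseudonym_cols:
            row[col] = pseudonymize(row[col], salt)
        for col in suppress_cols:
            row[col] = "*"

    # Distinct offsets keep any two groups from staying aligned with each other.
    # rng.sample raises ValueError if there are more groups than records.
    offsets = rng.sample(range(n), len(linked_groups))
    for group, offset in zip(linked_groups, offsets):
        for col in group:
            shifted = rotate([r[col] for r in records], offset)
            for row, value in zip(out, shifted):
                row[col] = value
    return out


# Hypothetical sample data for demonstration only.
records = [
    {"name": "Alice", "phone": "555-0101", "age": 34, "zip": "10001", "diagnosis": "flu", "drug": "A"},
    {"name": "Bob", "phone": "555-0102", "age": 51, "zip": "10002", "diagnosis": "asthma", "drug": "B"},
    {"name": "Carol", "phone": "555-0103", "age": 27, "zip": "10003", "diagnosis": "diabetes", "drug": "C"},
    {"name": "Dave", "phone": "555-0104", "age": 45, "zip": "10004", "diagnosis": "migraine", "drug": "D"},
]

result = deidentify(
    records,
    pseudonym_cols=["name"],
    suppress_cols=["phone"],
    linked_groups=[["age", "zip"], ["diagnosis", "drug"]],
    salt="example-salt",
    seed=7,
)
```

Under these assumptions, a cyclic shift leaves the set of values in each column unchanged, so single-attribute statistics are unaffected; columns within a group keep their pairing, while columns from different groups no longer line up row by row.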
Keywords/Search Tags: Information Theory, General Security Theory, Personal Information, De-identification