Research On Low-loss Anonymity Algorithm Based On Sensitivity Stratification And New L-diversity

Posted on: 2021-03-31 | Degree: Master | Type: Thesis
Country: China | Candidate: Y M Wang | Full Text: PDF
GTID: 2428330602464604 | Subject: Engineering
Abstract/Summary:
The Internet is being applied ever more widely and now permeates every aspect of people's lives. Its popularization has driven the rapid development of data mining technology, and companies and research institutions hope to extract more valuable information from big data online. At the same time, more and more people rely on the Internet and often inadvertently put basic personal information (such as postal code, date of birth, and sex) online. The growth of data mining has also prompted more and more enterprises and organizations to collect data in multiple ways in search of potentially valuable information, but published data usually contains personal information; one example is the public release of medical records for the study of certain diseases. These published data tables usually contain sensitive attributes of individuals, and if the publisher does not take them seriously and does not anonymize them, the security and privacy of the individuals concerned, such as patients, may be compromised. Therefore, all information that can be directly linked to an individual should be deleted before release, and the other non-sensitive attributes should be generalized as necessary to protect against risks such as homogeneity and skewness attacks. This has given rise to research on anonymization for privacy-preserving data publishing, and more and more experts and scholars have introduced new anonymity models with greater security and validity.

Data analysis and data mining are an essential part of current scientific research. Many organizations are increasingly collecting and publishing data for analysis and research; if the data is not processed with a proper method, the result may be leakage of personal information or unnecessary loss of information. In recent years, more attention has been paid to protecting individual privacy when sensitive information is released. The existing k-anonymity model guards well against identity disclosure, but it does not offer enough privacy under background knowledge attacks. The l-diversity model improves on k-anonymity to block the homogeneity attack, but l-diversity and its refinements still have many problems. Because of its inherent defects, it does not handle numerical sensitive attributes effectively, and it does not consider the sensitivity of values, so too many highly sensitive values may fall into the same equivalence class, leaving the data vulnerable to skewness attacks. These problems also lead to a higher information loss rate and lower data availability, which can make research results based on the data inconclusive or even wrong.

This thesis focuses on improving both data availability and security. It selects a suitable clustering method to process the initial data set, so that the tuples in the same equivalence class are as similar as possible, and proposes an l-sensitivity-level anonymity model. Building on previous research, it improves the clustering-based l-diversity method by dividing sensitive values into sensitivity levels, analyzes the causes of attribute disclosure, fully considers the semantics of sensitive attributes, and gives a clearer definition of privacy protection. First, the sensitive attribute values are divided into levels according to their sensitivity; then the records are grouped according to the sensitive attribute. The experimental results show that the scheme is feasible: while providing stronger privacy protection, it keeps the degree of data generalization low, which further improves data availability.
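
The difference between k-anonymity, l-diversity, and a sensitivity-aware condition can be illustrated with a small Python sketch. This is not the thesis's algorithm or its formal definition, which the abstract does not give. The toy table, the `SENSITIVITY_LEVEL` mapping, and the function `is_level_diverse` (which simply requires each equivalence class to span at least l distinct sensitivity levels) are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy table: (zip_code, age_range, disease). The first two
# columns are generalized quasi-identifiers; the last is the sensitive attribute.
RECORDS = [
    ("130**", "20-29", "HIV"),
    ("130**", "20-29", "Cancer"),
    ("130**", "20-29", "Flu"),
    ("148**", "30-39", "HIV"),
    ("148**", "30-39", "Cancer"),
    ("148**", "30-39", "Hepatitis"),
]

# Assumed sensitivity levels (higher means more sensitive); illustrative only.
SENSITIVITY_LEVEL = {"Flu": 1, "Hepatitis": 2, "Cancer": 3, "HIV": 3}


def equivalence_classes(records):
    """Group sensitive values by their quasi-identifier tuple."""
    classes = defaultdict(list)
    for *quasi_identifiers, sensitive in records:
        classes[tuple(quasi_identifiers)].append(sensitive)
    return dict(classes)


def is_k_anonymous(records, k):
    """Every equivalence class contains at least k records."""
    return all(len(v) >= k for v in equivalence_classes(records).values())


def is_distinct_l_diverse(records, l):
    """Every equivalence class contains at least l distinct sensitive values."""
    return all(len(set(v)) >= l for v in equivalence_classes(records).values())


def is_level_diverse(records, l, levels):
    """Assumed simplification: every class spans at least l sensitivity levels."""
    return all(
        len({levels[s] for s in v}) >= l
        for v in equivalence_classes(records).values()
    )
```

In this toy table every class holds three records with three distinct diseases, so it is 3-anonymous and distinct 3-diverse, yet each class covers only two sensitivity levels because HIV and Cancer share the highest level. This is the kind of concentration of highly sensitive values that plain l-diversity does not detect.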
Keywords/Search Tags: l-diversity, k-anonymity, degree of sensitivity, information loss, clustering