
Study On Privacy-preservation Algorithm For Data Publishing

Posted on: 2010-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: J G Lu
Full Text: PDF
GTID: 2178360275974443
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer processing power, storage technology, and network technology, the degree to which information is held electronically has greatly increased. A high degree of information sharing brings people great convenience, but leakage of personal private information has also become common. In particular, privacy leakage caused by mining data held by different organizations has made people anxious about information sharing and unwilling to provide their own information. Although data release mechanisms usually take technical measures to hide users' sensitive attributes, linking several publicly available data sources often leads to unexpected leakage of private information. The goal of privacy protection in information sharing is to achieve effective sharing while ensuring that sensitive private information is not leaked. In recent years, research in this area has become an important direction in database security.

This thesis surveys the different anonymity models and techniques, points out the security vulnerabilities of the K-anonymity and L-diversity models and the shortcomings of commonly used anonymization techniques, and proposes a new data publishing algorithm that addresses the deficiencies of current data publishing algorithms. Details are as follows:

Current privacy-preserving data publishing techniques lose too much information during anonymization, and some current algorithms that use group-and-exchange (swapping) techniques risk leaking private information because the anonymity model they rely on has security flaws. To address these problems, this thesis proposes a new data publishing algorithm based on lossy linkage and the T-closeness anonymity model. The algorithm first produces equivalence groups that satisfy the T-closeness model and then uses exchange techniques to produce the user-oriented data release.

Generalization and the other techniques currently used to produce T-closeness equivalence groups suffer from high computational complexity, poor accuracy, and loss of generality. Because the distribution of sensitive attribute values within a T-closeness equivalence group must be consistent with their distribution over the whole table, this thesis uses a genetic algorithm to generate an optimal combination of sensitive attribute values. While producing the equivalence groups, the algorithm weighs execution time against data security and adopts a more flexible execution strategy.

The experimental results show that, compared with traditional methods, the algorithm effectively resists linking attacks, background knowledge attacks, attribute disclosure attacks, and similar attacks. Because it publishes real data, it retains more information, and the results of join queries are closer to the actual values. Thanks to the improved genetic algorithm, execution time is also kept within reasonable limits.
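
The T-closeness constraint that the equivalence groups must satisfy can be illustrated with a minimal Python sketch. It is not the genetic algorithm of the thesis. It only checks whether a candidate grouping keeps each group's sensitive-value distribution within a threshold t of the distribution over the whole table. It assumes a categorical sensitive attribute and uses the variational distance, which is the form the Earth Mover's Distance takes when all categories are equally far apart. The function names, the sample diagnoses, and the threshold 0.2 are all hypothetical.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each value in a non-empty list."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two categorical distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(groups, t):
    """Check each group's sensitive-value distribution against the whole table.

    `groups` is a list of non-empty lists of sensitive values (an assumption
    of this sketch); `t` is the closeness threshold.
    """
    all_values = [value for group in groups for value in group]
    overall = distribution(all_values)
    return all(
        variational_distance(distribution(group), overall) <= t
        for group in groups
    )


# Hypothetical sensitive values (diagnoses) for two candidate groupings
# of the same four records.
balanced = [["flu", "cancer"], ["flu", "cancer"]]
skewed = [["flu", "flu"], ["cancer", "cancer"]]

print(satisfies_t_closeness(balanced, t=0.2))
print(satisfies_t_closeness(skewed, t=0.2))
```

In the balanced grouping every group mirrors the table-wide distribution, so the check passes. In the skewed grouping each group holds a single diagnosis, which is exactly the attribute disclosure that T-closeness is meant to rule out. A search procedure such as a genetic algorithm could use a check of this kind as part of its fitness function.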
Keywords/Search Tags:data release, T-closeness anonymity, information loss, privacy protection, genetic algorithm