Study On Privacy Protection Model

Posted on: 2010-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: X L Shi
Full Text: PDF
GTID: 2178360278962396
Subject: Computer software and theory
Abstract/Summary:
With the development of information technology, the concept of privacy has changed considerably, and individuals pay increasing attention to protecting it. At the same time, information is growing so fast that people are immersed in data yet poor in knowledge. Data mining can discover the latent relationships behind large amounts of data, but its application also increases the risk of exposing private information. Data should therefore be anonymized before it is published in order to avoid leakage. This paper studies anonymization methods from a technical point of view. Its main tasks are listed below.

First, the paper studies existing technical means of protecting privacy, especially techniques applied before data is published, and identifies the two major research focuses in this area: anonymity principles and anonymization methods. The current principles and methods are then studied, and a comparative analysis summarizes their advantages and disadvantages.

Second, following an analysis of existing attacks against published data and of effective measures for protecting privacy, the paper examines in detail the procedure of the background knowledge attack and its important premises. The k-Anonymity and l-Diversity privacy principles are also studied in depth. Given the inadequacies of l-Diversity against the background knowledge attack, the paper improves on it by placing new requirements on the distribution of sensitive attribute values, and proposes the (a,d)-Diversity privacy protection model.

Third, the paper studies existing anonymization methods and analyzes their advantages and disadvantages. Combining generalization and clustering, it implements the (a,d)-Diversity model and describes the procedure and algorithm for producing published data that satisfies the (a,d)-Diversity principle.
In the implementation, practical application is taken into consideration: since different attributes differ in importance, they are treated differently, so the published data remains more useful in practice.

The published data is then evaluated in terms of both information loss and privacy disclosure risk. Existing models for evaluating information loss are studied and improved. In addition, since the risk of disclosure depends to some extent on the distribution of sensitive attribute values within each equivalence class, the paper proposes a privacy disclosure risk model for evaluating the security level of the published data.

Finally, the algorithm is implemented and, using the Adult database from the machine learning center of the University of California, the paper measures the time cost, information loss, and privacy disclosure risk of the results produced by the (a,d)-Diversity model. The experimental results suggest that, as information loss increases within a certain threshold, privacy is protected more effectively.
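
As an illustration of the baseline principles the abstract builds on, the following minimal Python sketch checks k-Anonymity and the simplest ("distinct") form of l-Diversity on a toy table, and shows one generalization step. It does not implement the (a,d)-Diversity model, whose definition the abstract does not give. The record layout, the attribute names (age, zip, disease), the sample data, and all function names are assumptions made for this example only.

```python
from collections import defaultdict

# Toy, already-generalized table (hypothetical data for illustration only).
RECORDS = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]
QUASI_IDENTIFIERS = ["age", "zip"]


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in groups.values()
    )


def generalize_age(age, width=10):
    """Generalize an exact age into a range of the given width, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


if __name__ == "__main__":
    print(is_k_anonymous(RECORDS, QUASI_IDENTIFIERS, 2))
    print(is_distinct_l_diverse(RECORDS, QUASI_IDENTIFIERS, "disease", 2))
    print(generalize_age(34))
```

In this toy table every equivalence class has two records, so the table is 2-anonymous, yet the first class contains only one sensitive value. An attacker who can place a person in that class learns the disease outright, which is the kind of weakness that motivates diversity requirements on sensitive attribute values.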
Keywords/Search Tags: Privacy Protection, Data Mining, Anonymity, Background Knowledge Attack