
Research On Privacy Protection Scheme Based On Random Perturbation

Posted on: 2016-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Zuo
Full Text: PDF
GTID: 2308330482953277
Subject: Communication and Information System
Abstract/Summary:
With the development of big data technology, privacy protection has become an important problem in data mining: the leakage of private data must be considered and prevented throughout the mining process. Traditional data mining techniques assume that the data set can be accessed directly, but this assumption often does not hold for private data. In practice, because of the need for privacy protection, the data that is available has already been processed for privacy before release. Data mining must therefore be carried out without knowing the exact values of the data. This thesis studies privacy protection for centralized data. The main work is as follows:

1. Common privacy-preserving data mining (PPDM) algorithms are analyzed and summarized. Existing techniques are classified from three perspectives: data collection, data mining, and privacy protection method. The implementation process of each kind of method is described in detail. Finally, the advantages and disadvantages of the various privacy protection algorithms are analyzed in terms of effectiveness, complexity, and extensibility.

2. A data mining privacy protection scheme based on DDPD is studied. For the classification problem in data matching methods, a relevance ranking algorithm based on the important attributes of samples is proposed. The scheme keeps the correlation between samples in the released data set consistent with that in the original data set. In the implementation, combining two rule-covering algorithms with ordered rule growth reduces the generalization error of the rules; analyzing how strongly each attribute set influences the mining result identifies the attributes that matter most for classification; and defining a weighted correlation coefficient to measure the association between samples improves the accuracy of similarity detection. Finally, the scheme is generalized from classification to other data mining applications.

3. For naive Bayesian classification, a privacy protection scheme based on LRDP is proposed. Unlike TRDP, the proposed scheme does not hide all of the data; it hides only part of the data, chosen according to given parameters and a given distribution. For nominal and continuous attributes, two partial-hiding algorithms are designed: a random mapping algorithm and a random selection algorithm for numerical data. The results show that, with appropriately chosen parameters, the scheme protects privacy while maintaining high reliability.
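
The idea of hiding only part of the data can be illustrated with a minimal sketch. This is not the LRDP algorithm itself, whose details are not given in this abstract. The function names, the hiding probability `p`, the uniform replacement over the attribute domain, and the Gaussian noise with standard deviation `scale` are all assumptions made for illustration.

```python
import random

def random_mapping(values, domain, p, rng):
    """Hypothetical partial hiding for a nominal attribute.

    Each value is replaced, with probability p, by a value drawn
    uniformly from `domain`; otherwise it is released unchanged.
    """
    return [rng.choice(domain) if rng.random() < p else v for v in values]

def random_selection(values, p, scale, rng):
    """Hypothetical partial hiding for a numeric attribute.

    Each value is selected with probability p and perturbed with
    zero-mean Gaussian noise; unselected values are released unchanged.
    """
    return [v + rng.gauss(0.0, scale) if rng.random() < p else v for v in values]

# Example data (made up for illustration only).
rng = random.Random(0)
color_domain = ["red", "green", "blue"]
colors = ["red", "green", "blue"] * 4
ages = [23.0, 35.0, 41.0, 58.0]

hidden_colors = random_mapping(colors, color_domain, p=0.3, rng=rng)
hidden_ages = random_selection(ages, p=0.5, scale=5.0, rng=rng)
```

In this sketch, `p` controls the trade-off the abstract describes: a larger `p` hides more records and gives stronger protection, while a smaller `p` leaves more original values for the classifier to learn from.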
Keywords/Search Tags:Important Properties, Correlation, Privacy Protection, Naive Bayes Classifier