Research And Technology In Data Mining Based On Privacy Protection

Posted on: 2015-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: E J Sun
Full Text: PDF
GTID: 2268330428461925
Subject: Computer application technology
Abstract/Summary:
With the development and application of computer technology, data mining has emerged as the times require. Most traditional data mining operates on the original data, so while knowledge is discovered, the privacy of much sensitive information is inevitably violated. A number of studies have shown that people are generally very concerned about the abuse of private data, and many countries and regions have enacted provisions to protect private data. Under this social and legal pressure, mechanisms that prevent privacy leaks must be in place while data mining is carried out.

The purpose of data mining based on privacy protection is to extract previously unknown yet valuable information and knowledge from a database without exposing private information (or at least some of the sensitive information). The main contents are as follows:

The Maximum Weight Matching Clustering Algorithm and the Weak Clustering based Privacy Protection Framework are applied to DNALA, an existing privacy protection algorithm for DNA sequence data sets. Through reasonable improvements to DNALA, a new privacy protection algorithm, DNALA-IA, is proposed, as follows:

(1) To address the low time efficiency of the sequence alignment process in DNALA, the original multiple sequence alignment method is replaced by the DNALA-DMA algorithm, which uses two pairwise (double) sequence alignment methods to calculate the distance matrix. This shortens the time required while guaranteeing the accuracy of the final result, so DNALA-DMA both protects privacy and reduces the loss of information.

(2) To address the low accuracy of the DNALA results and the fact that the DNALA clustering results cannot be updated in real time, the DNALA-CA algorithm is used. DNALA-CA uses the maximum weight matching clustering algorithm (MWMCA), which improves the accuracy of the clustering results while leaving the time complexity unchanged. For data streams, DNALA-CA is based on the weak clustering based privacy protection framework (WCPPF). The WCPPF algorithm is divided into an online algorithm and an offline algorithm; when the original sequences change, the online algorithm can be used to update the clustering results quickly and to track the changes in the data stream dynamically.

In this paper, the DNALA-DMA and DNALA-CA algorithms together form the DNALA-IA algorithm, which selects the appropriate algorithm in real time according to frequency to obtain the best clustering results and mine nuggets of knowledge!...
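
The two steps named in the abstract, a distance matrix built from pairwise alignments followed by clustering through maximum weight matching, can be illustrated with a small sketch. It is not the thesis's DNALA-DMA or DNALA-CA implementation, which the abstract does not specify. Plain Levenshtein edit distance is assumed as a stand-in for the pairwise alignment score, the matching weight is assumed to be "largest distance minus pair distance", and the matching is found by exhaustive search that is only practical for a handful of sequences. The names edit_distance, distance_matrix and max_weight_pairing are hypothetical.

```python
from functools import lru_cache
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed stand-in for a pairwise alignment score)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def distance_matrix(seqs):
    """Symmetric matrix of pairwise distances; no multiple alignment needed."""
    n = len(seqs)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = edit_distance(seqs[i], seqs[j])
    return dist


def max_weight_pairing(dist):
    """Pair up all sequences so the total weight is maximal.

    Assumption: weight(i, j) = max distance - dist[i][j], so close
    sequences get heavy edges. Exhaustive search, small even n only.
    """
    n = len(dist)
    if n == 0:
        return []
    if n % 2:
        raise ValueError("this sketch needs an even number of sequences")
    top = max(max(row) for row in dist)

    @lru_cache(maxsize=None)
    def best(remaining):
        if not remaining:
            return 0, ()
        first, rest = remaining[0], remaining[1:]
        best_score, best_pairs = -1, ()
        for k, partner in enumerate(rest):
            score, pairs = best(rest[:k] + rest[k + 1:])
            score += top - dist[first][partner]
            if score > best_score:
                best_score, best_pairs = score, ((first, partner),) + pairs
        return best_score, best_pairs

    return list(best(tuple(range(n)))[1])


# Toy data (made up for illustration): two pairs of near-identical sequences.
seqs = ["ACGTACGT", "ACGTACGA", "TTGCAAGC", "TTGCAAGG"]
dist = distance_matrix(seqs)
pairs = max_weight_pairing(dist)
print(pairs)
```

On this toy input the sequences that differ in a single base end up paired with each other. A polynomial-time matching algorithm would be needed for realistic data sizes; the exhaustive search here only keeps the example self-contained.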
Keywords/Search Tags: Privacy Protection, Data Mining, Cluster, Maximum Weight Matching