A Distributed Data Stream Mining Algorithm For Privacy-preserving

Posted on: 2011-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2178330332960281
Subject: Computer application technology
Abstract/Summary:
With the development of computer network technology, large-scale distributed computing, and data sharing, distributed data streams that must be analyzed and processed are increasingly common in practical applications such as financial risk analysis, wireless sensor networks, and network intrusion detection. Knowledge discovery in distributed data streams has become an active research field. However, data mining also poses a threat to privacy and information security while the knowledge is being acquired. How to preserve privacy during efficient data mining is therefore a new challenge for distributed data mining.

To address privacy disclosure in current distributed data stream mining applications, this thesis studies data stream mining and privacy protection, building on existing data mining techniques and models.

First, a privacy-preserving distributed data stream mining model is defined. The model places its main emphasis on recent data, and a secure transmission policy is proposed for the interconnected communication structure among the remote primary sites, the main site, and the coordinated site. The transmission policy protects privacy while also satisfying the real-time, high-speed characteristics of distributed data streams.

Second, a privacy-preserving distributed data stream mining algorithm is proposed. It uses random perturbation technology together with centralized data stream mining algorithms to find the closed frequent itemsets, which completely represent the frequent itemsets while being fewer in number. This method effectively protects the original sensitive data. The closed frequent itemsets and their subsets are updated incrementally to the main site through a high-security encryption protocol, which reduces the traffic load and achieves double privacy preservation for both the original data and the local rules.

Finally, the algorithm is analyzed through simulation experiments. The results show that the algorithm is feasible and effective. It adapts to the dynamic and distributed characteristics of data streams, achieves better privacy protection, and at the same time effectively reduces the traffic load.
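The two building blocks named in the abstract, random perturbation and closed frequent itemsets, can be illustrated with a minimal sketch. The abstract does not specify the thesis's actual perturbation scheme or mining algorithm, so everything below is an assumption for illustration: `perturb` uses a simple randomized bit flip over a known item universe, `frequent_itemsets` is a brute-force enumeration over one window of transactions rather than a stream algorithm, and all function and parameter names are hypothetical.

```python
import random
from itertools import combinations


def perturb(transaction, universe, keep_prob, rng):
    """Randomized bit-flip perturbation (an assumed scheme, not the thesis's).

    For every item in the universe, the true presence bit is kept with
    probability keep_prob and flipped otherwise.
    """
    out = set()
    for item in universe:
        present = item in transaction
        if rng.random() >= keep_prob:
            present = not present  # flip the bit
        if present:
            out.add(item)
    return frozenset(out)


def frequent_itemsets(transactions, min_support):
    """Brute-force support counting over one window of transactions.

    Returns a dict mapping each itemset (frozenset) whose absolute support
    is at least min_support to that support count.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                result[candidate] = count
                found_any = True
        if not found_any:
            # No frequent itemset of size k, so none of size k + 1 either.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support.

    Checking frequent supersets is enough: a superset with the same
    support as a frequent itemset is itself frequent.
    """
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < t and c == frequent[t] for t in frequent)
    }


# Example window of four transactions over items a, b, c, d.
window = [frozenset("abc"), frozenset("ab"), frozenset("abd"), frozenset("c")]
closed = closed_itemsets(frequent_itemsets(window, min_support=2))
```

For this window and a minimum support of 2, the frequent itemsets are {a}, {b}, {c}, and {a, b}, while the closed ones are only {a, b} (support 3) and {c} (support 2). The supports of {a} and {b} can be recovered from {a, b}, which is why a local site would need to transmit fewer itemsets. In a perturbed setting, each transaction would pass through `perturb` before mining, and the observed supports would then need a statistical correction that this sketch does not attempt.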
Keywords/Search Tags: Distributed Data Stream, Double Privacy-Preserving, Closed Frequent Itemset, Security Transmission Policy, Random Perturbation Technology