Multiple Data Stream Relation Analysis And Pattern Discovery With PCA/ICA

Posted on: 2010-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: J J Luo
GTID: 2178360302460947
Subject: Software engineering
Abstract/Summary:
In recent years, the spread of IT applications across many fields has greatly improved the collection and transmission of information. Many applications now generate large volumes of streaming data, including network-traffic monitoring, computer network security, financial applications, and environmental monitoring. Analyzing and mining such data has become an increasingly active research topic. Data stream mining aims to extract useful, previously unknown information and knowledge embedded in streaming data.

The major contribution of this thesis is a way to compact data streams, to analyze the correlations among streams, and to separate the core factors that influence the streams' trends. It relies on PCA and ICA, techniques for reducing the dimensionality of high-dimensional data and for separating signals from mixed data, respectively, the latter developed in the 1990s. At the same time, the unique characteristics of data streams must be taken into account.

This thesis introduces a new model based on principal component analysis, independent component analysis, and the multiple data stream model, which provides a new method for multiple data stream correlation analysis and pattern discovery. Because the principal components are orthogonal to each other, we develop a method that takes the cosine of each pair of vectors in the space whose axes are formed by those principal components. PCA reduces the dimension of the data streams and whitens them; on the whitened data, the independent components can then be separated.

Because PCA/ICA can separate independent components from complex mixed information, it provides a basis for multiple data stream correlation analysis, pattern discovery, and the discovery of hidden variables. The independent components and hidden variables are themselves useful data, as a voice mixture experiment illustrates. In addition, the hidden variables are highly abstract, which offers a way to predict stream trends and discover deeper patterns. Robustness and real-time performance are also discussed in the experiments.
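
The pipeline described in the abstract (PCA to reduce and whiten the streams, cosines between stream vectors in the principal-component space, then ICA on the whitened data) can be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation: the function names (pca_whiten, cosine_matrix, sym_decorrelate, fastica), the synthetic streams, the mixing matrix A, and the choice of a symmetric FastICA update with a tanh nonlinearity are all assumptions made for the example. It also works on a fixed batch of samples, whereas the thesis targets streaming data.

```python
import numpy as np

def pca_whiten(X, k):
    """X has shape (n_streams, n_samples). Keep the top-k principal components.

    Returns Z (k, n_samples), whose rows have identity covariance, and
    loadings (n_streams, k), each stream's coordinates on the PC axes.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]           # indices of the k largest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Z = (eigvecs / np.sqrt(eigvals)).T @ Xc         # project, then scale to unit variance
    loadings = eigvecs * np.sqrt(eigvals)
    return Z, loadings

def cosine_matrix(V):
    """Cosine of the angle between every pair of row vectors in V."""
    unit = V / np.linalg.norm(V, axis=1, keepdims=True)
    return unit @ unit.T

def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, which makes the rows of W orthonormal."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W

def fastica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with a tanh nonlinearity on whitened data Z (k, n)."""
    k, n = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = np.tanh(W @ Z)
        # Fixed-point update: E[g(y) z^T] - E[g'(y)] W, then re-orthogonalize.
        W_new = sym_decorrelate(Y @ Z.T / n - np.diag((1 - Y**2).mean(axis=1)) @ W)
        converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1)) < tol
        W = W_new
        if converged:
            break
    return W @ Z, W

# Hypothetical data: three observed streams driven by two hidden sources.
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S_true = np.vstack([np.sin(2 * np.pi * t), rng.uniform(-1, 1, t.size)])
A = np.array([[1.0, 0.5], [0.4, 1.0], [0.8, -0.6]])   # assumed mixing matrix
X = A @ S_true

Z, loadings = pca_whiten(X, k=2)        # reduce 3 streams to 2 whitened components
stream_cosines = cosine_matrix(loadings)  # pairwise stream similarity in PC space
S_est, W = fastica(Z)                   # estimated hidden variables
```

In this sketch the three streams have rank 2, so keeping k=2 components loses nothing and stream_cosines coincides with the ordinary correlation matrix of the streams; with fewer components than the rank it becomes an approximation. The estimated components in S_est are recovered only up to sign and ordering, which is a general property of ICA.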
Keywords/Search Tags: Data Stream, Principal Component Analysis, Independent Component Analysis, Multiple Data Stream Relations