
GPU-based General Processing Model For Data Stream

Posted on: 2012-02-05  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Zheng  Full Text: PDF
GTID: 2218330368488241  Subject: Computer application technology
Abstract/Summary:
The data stream is a new form of data. Many applications, such as sensor networks, real-time stock quotes, and network and communications monitoring, continuously produce large amounts of sequential data that evolve over time and form time-series data streams. Data mining is a powerful tool for analyzing multiple data streams in parallel. However, data streams are infinite, time-varying, continuous, high-speed and high-dimensional, and these characteristics mean that traditional data mining methods cannot be applied directly. To accommodate these particular properties, a new technology emerged: stream data mining, also known as data stream mining.
Stream data is difficult to process because of these properties. Stream data mining can handle data streams, but it faces unprecedented challenges. The main challenge is that "data-intensive" mining is constrained by limited space (memory) and time. The first fundamental question is how to minimize the memory consumed by the mining algorithm. The second is how to complete data processing in the shortest possible time so that the stream can be processed in real time. Neither issue currently has a good solution.
This paper focuses on applied research on GPU parallel computing in the field of data stream processing, especially the high-performance processing of high-dimensional time-series data streams. To ensure real-time, general-purpose processing of data streams under constrained computing resources, it combines GPU parallel computing with the CUDA architecture and proposes a GPU-based general processing model for data streams.
The model suits time-series data streams from a variety of applications. It covers preprocessing, load shedding, synopsis extraction and mining of data streams, and can carry out multiple processing tasks such as query processing, clustering, classification and frequent itemset mining. Taking k-means as an example, this paper presents the technical implementation of the core components. Finally, it describes the software architecture of the model, combining a visual description, represented by UML, with a formal description, represented by an ADL. Theoretical analysis and experimental validation show that the model offers good generality and high efficiency and reduces I/O cost, so it can be widely used in the field of data stream mining.
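The stages named in the abstract (load shedding, a bounded synopsis, and k-means mining) can be illustrated with a small CPU-only sketch. This is not the implementation from the thesis: the function names (`shed_load`, `assign`, `update`, `process_stream`), the sliding-window synopsis, and the keep-every-n shedding policy are all assumptions chosen for illustration. The `assign` step treats every point independently, which is the kind of work a CUDA kernel would typically map to one thread per point.

```python
from collections import deque
from typing import Deque, Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]


def shed_load(batch: Sequence[Point], keep_every: int) -> List[Point]:
    """Hypothetical load shedding: keep every `keep_every`-th point of a batch."""
    return list(batch[::keep_every])


def assign(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """k-means assignment step: label each point with its nearest centroid.

    Each point is handled independently of the others, so this loop is the
    data-parallel part (one GPU thread per point in a CUDA version).
    """
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update(points: Sequence[Point], labels: Sequence[int],
           centroids: Sequence[Point]) -> List[Point]:
    """k-means update step: move each centroid to the mean of its members."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep an empty cluster where it was
        else:
            new_centroids.append(
                tuple(sum(col) / len(members) for col in zip(*members)))
    return new_centroids


def process_stream(batches: Iterable[Sequence[Point]],
                   centroids: Sequence[Point],
                   window_size: int = 8,
                   keep_every: int = 1,
                   iters: int = 5) -> List[Point]:
    """Shed load, keep a sliding-window synopsis, and refine centroids per batch."""
    window: Deque[Point] = deque(maxlen=window_size)  # bounded memory
    current = list(centroids)
    for batch in batches:
        window.extend(shed_load(batch, keep_every))
        points = list(window)
        for _ in range(iters):
            current = update(points, assign(points, current), current)
    return current
```

The bounded `deque` addresses the memory constraint by discarding the oldest points, while `keep_every` trades accuracy for throughput when batches arrive faster than they can be processed.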
Keywords/Search Tags: Data Stream, GPU Parallel Computing, General Processing Model, Software Architecture, K-means