Research On Frequent Episode Mining Over Event Streams With Intervals

Posted on: 2011-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
GTID: 2248330395957875
Subject: Computer application technology
Abstract/Summary:
With the development of the information era, people must handle ever larger volumes of data, and the many rules and much of the knowledge hidden in that data need to be discovered. Data mining emerged to meet this need. Although data mining technology has matured in recent years, the wide use of edge devices such as RFID and sensors now produces large amounts of event stream data. Most traditional data mining techniques target static data and do not apply to such event streams.

Frequent episode mining is one of the most important topics in data mining. Recent work on frequent episode mining has focused on static event sequences and point events. In practice, however, processing event streams is often more meaningful than processing static data, and many events span time intervals, so they cannot be handled in the same way as point events. To address these problems, the thesis proposes algorithms that mine frequent episodes over event streams with time intervals, handling event streams and interval events at the same time.

Firstly, for interval events, a tuple-based method for representing the relationships among events is proposed. It can effectively distinguish all frequent episodes in the sliding window and avoids the missed episodes that occur in traditional mining. Because this method still has some problems, the thesis also proposes a matrix-based representation, which lets users clearly understand the internal relationships of the mined frequent episodes.

Secondly, the events encountered in daily life often have intervals: they last for some time before they end. The thesis therefore proposes an algorithm that combines Relation_Thread_Tree with a B+index, which can store all episodes in the sliding window and avoids scanning the data many times.

Thirdly, the thesis proposes a Linear_Linklist_Depth_First algorithm, which builds the linear linklist in depth-first order and uses a minimal-error-based pruning method to filter episodes in the sliding window, reducing time and space consumption.

Experimental results show that the proposed algorithm can quickly process newly arriving events and mine all frequent episodes in the sliding window.
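To make the setting concrete, the following is a minimal illustrative sketch of counting two-event episodes over interval events in a time-based sliding window. It is not the thesis's tuple-based representation, Relation_Thread_Tree, or Linear_Linklist_Depth_First algorithm, whose details the abstract does not give. The names `IntervalEvent`, `relation`, and `SlidingWindowMiner`, the simplified Allen-style relation labels, and the window rule (an event leaves the window once its start time is more than `width` before the newest start time) are all assumptions made for illustration. Events are assumed to arrive in order of start time and to satisfy start < end.

```python
from collections import Counter, deque
from typing import NamedTuple


class IntervalEvent(NamedTuple):
    kind: str   # event type, e.g. "A"
    start: int  # start time
    end: int    # end time (assumed > start)


def relation(a: IntervalEvent, b: IntervalEvent) -> str:
    """Temporal relation of a to b, assuming a.start <= b.start."""
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if a.start == b.start:
        if a.end == b.end:
            return "equals"
        return "starts" if a.end < b.end else "started-by"
    # From here on a.start < b.start < a.end.
    if b.end < a.end:
        return "contains"
    if b.end == a.end:
        return "finished-by"
    return "overlaps"


class SlidingWindowMiner:
    """Hypothetical miner for 2-event episodes (kind, relation, kind)."""

    def __init__(self, width: int, min_support: int):
        self.width = width
        self.min_support = min_support
        self.window = deque()    # events currently in the window
        self.counts = Counter()  # episode -> occurrences in the window

    def add(self, event: IntervalEvent) -> None:
        # Evict expired events. The evicted event is always the oldest,
        # so it is the first element of every pair it took part in.
        while self.window and self.window[0].start < event.start - self.width:
            old = self.window.popleft()
            for other in self.window:
                self.counts[(old.kind, relation(old, other), other.kind)] -= 1
        # Pair the new event with every event still in the window.
        for earlier in self.window:
            self.counts[(earlier.kind, relation(earlier, event), event.kind)] += 1
        self.window.append(event)

    def frequent(self) -> dict:
        return {ep: n for ep, n in self.counts.items() if n >= self.min_support}


miner = SlidingWindowMiner(width=20, min_support=2)
for ev in [IntervalEvent("A", 0, 5), IntervalEvent("B", 3, 8),
           IntervalEvent("A", 10, 15), IntervalEvent("B", 13, 18)]:
    miner.add(ev)
print(miner.frequent())  # {('A', 'overlaps', 'B'): 2}
```

The counts are updated incrementally as events arrive and expire, so the window is never rescanned from scratch, but each arrival still costs time linear in the window size. The index and pruning structures named in the abstract are presumably aimed at reducing that kind of cost.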
Keywords/Search Tags: Event stream, interval event, frequent episode mining, sliding window