High-utility Episodes Mining Over Event Sequences

Posted on: 2012-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Z Guo
GTID: 2268330425490461
Subject: Computer software and theory
Abstract/Summary:
In the information era, people must handle ever larger volumes of data. Much of the information and knowledge they need is hidden in that data and has to be extracted before it becomes useful, and data mining emerged to meet this need. Although data mining technology has matured in recent years, the widespread use of EDGE (Electronic Data Gather Equipment) such as RFID (Radio Frequency Identification) and sensors now produces large amounts of event stream data. Most traditional data mining techniques target static data and do not apply to event streams. Data mining in event sequences is therefore being applied widely.

Frequent episode mining is one of the most important topics in data mining. However, existing frequent episode discovery approaches assign equal significance to all episodes, whereas in real cases different episodes carry different weights. These approaches therefore fail to represent many real-world scenarios, and traditional mining approaches cannot be used to mine high-utility episodes in event sequences. This thesis therefore proposes approaches for mining high-utility episodes in event sequences.

Firstly, this thesis proposes a model for measuring the utility of an episode. The utility of an episode is defined as its support multiplied by its weight. The model avoids the shortcoming of ranking episodes by occurrence count alone, and with it some episodes that are more practical in certain scenarios can be mined.

Secondly, because the model considers utility, the most challenging problem in high-utility episode mining is that episode utility does not satisfy the downward closure property. This thesis therefore proposes two pruning strategies that are applied during mining to reduce the search space.

Thirdly, this thesis proposes a mining algorithm based on prefix projection. Experimental results show that the proposed algorithm can quickly mine high-utility episodes in the current window.
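
The utility definition above (support multiplied by weight) and the loss of downward closure can be illustrated with a small sketch. This is not the thesis's algorithm. It assumes, purely for illustration, that support is the number of fixed-width sliding windows containing the episode in order, and that an episode's weight is the sum of hypothetical per-event weights. The names `occurs_in`, `support`, `utility`, and the sample data are all invented here.

```python
from typing import Dict, Sequence, Tuple


def occurs_in(window: Sequence[str], episode: Tuple[str, ...]) -> bool:
    """Return True if `episode` appears in `window` as an ordered subsequence."""
    it = iter(window)
    # `event in it` advances the iterator, so order is respected.
    return all(event in it for event in episode)


def support(sequence: Sequence[str], episode: Tuple[str, ...], width: int) -> int:
    """Count sliding windows of `width` consecutive events containing `episode`.

    Assumption: a window-based support definition, chosen for illustration.
    """
    return sum(
        occurs_in(sequence[i:i + width], episode)
        for i in range(len(sequence) - width + 1)
    )


def utility(
    sequence: Sequence[str],
    episode: Tuple[str, ...],
    width: int,
    weights: Dict[str, float],
) -> float:
    """Utility = support * weight, as stated in the abstract."""
    # Assumption: episode weight is the sum of its event weights.
    weight = sum(weights[event] for event in episode)
    return support(sequence, episode, width) * weight


# Hypothetical event sequence and event weights.
sequence = ["A", "B", "A", "C", "A", "B"]
weights = {"A": 1, "B": 5, "C": 2}

u_a = utility(sequence, ("A",), 3, weights)       # sub-episode
u_ab = utility(sequence, ("A", "B"), 3, weights)  # super-episode
print(u_a, u_ab)
```

Under these assumptions the super-episode ("A", "B") has a higher utility than its sub-episode ("A",), even though its support is lower. A threshold that rejects ("A",) would therefore also discard a high-utility extension, which is why frequency-style pruning cannot be applied directly and dedicated pruning strategies are needed.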
Keywords/Search Tags: Event sequence, weight of episode, high-utility episode, time window, pruning strategy