Research On Mining Algorithms Of Top-K Closed Frequent Itemsets Based On Datastream

Posted on: 2011-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: L S Li
Full Text: PDF
GTID: 2178360305977425
Subject: Computer application technology
Abstract/Summary:
Data streams have become common in applications such as real-time network monitoring, log records, clickstreams and call detail records in telecommunications. These dynamic environments generate massive, continuous, constantly changing and unbounded stream data, so stream data mining has attracted a great deal of attention within the data mining field. To support in-depth analysis and to discover interesting patterns, trends and outliers in unbounded stream data, researchers have studied many aspects of stream data in detail. Because interesting association rules are generated from frequent itemsets, frequent itemsets and closed frequent itemsets have gradually received great attention.

Because data in a stream arrive continuously, this thesis analyzes the algorithms proposed by earlier researchers under the damped window mechanism; the algorithm also yields the approximate frequent closed itemsets. Experimental studies show that the algorithm is an efficient single-pass method for online mining of the top-k closed itemsets over a damped sliding window on a data stream. The thesis makes three contributions:

1. It analyzes the applications of data mining to data streams and the classic algorithms Moment and FP-stream for mining frequent itemsets in data streams, with particular attention to the TKC-DS algorithm designed by Hua-Fu Li for mining the top-k closed itemsets. This analysis clarifies the properties of mining in data streams.

2. It uses a sliding window (SW) mechanism in which the sliding window is divided into basic windows and each basic window is given a damping factor. With the damping factor applied within the sliding window, the algorithm for mining frequent closed itemsets can obtain accurate results. The proposed way of updating support counts lets the minimum support count be updated incrementally as the data stream evolves, so users do not have to set a support error bound. This also spares users from choosing a minimum support threshold arbitrarily.

3. It uses an improved method of updating windows and supports, and represents itemsets with bit vectors. Weights are assigned to single items, and the improved algorithm generates candidate itemsets from Bd-(X) and from the incomplete subsets (Td) already stored in the HTC.

Building on previous studies, this thesis proposes Top-k-FCI, an improved algorithm for mining frequent closed itemsets. The algorithm uses the damped sliding window mechanism built on basic windows, an incremental support update method for real-time pruning, and the candidate itemsets described above. As a result, the accuracy of the mining results and of the approximate results is greatly improved. Achieving accurate and precise results within the damped window is a central concern of this work, and it is what allows the top-k closed itemsets to be obtained. The work in this thesis can be applied in a wide range of applications.
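The damped sliding window idea in the abstract can be illustrated with a short Python sketch. It is not the Top-k-FCI or TKC-DS algorithm: it enumerates every subset of each transaction by brute force, which is only workable for very small transactions, and it does not use bit vectors, Bd-(X), or the HTC. The function names, the `decay` parameter, and the rule that a basic window `age` steps old has weight `decay ** age` are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations
import math


def damped_supports(basic_windows, decay=0.5):
    """Return the decayed support of every itemset seen in the sliding window.

    basic_windows is ordered from oldest to newest; each basic window is a
    list of transactions. A basic window that is `age` steps older than the
    newest one contributes with weight decay ** age (an assumed damping rule).
    """
    supports = defaultdict(float)
    newest = len(basic_windows) - 1
    for index, window in enumerate(basic_windows):
        weight = decay ** (newest - index)
        for transaction in window:
            items = sorted(set(transaction))
            # Brute-force enumeration of all non-empty subsets (illustration only).
            for size in range(1, len(items) + 1):
                for itemset in combinations(items, size):
                    supports[frozenset(itemset)] += weight
    return dict(supports)


def top_k_closed(supports, k):
    """Keep itemsets with no proper superset of equal support, then rank them."""
    closed = {
        itemset: support
        for itemset, support in supports.items()
        if not any(
            itemset < other and math.isclose(support, other_support)
            for other, other_support in supports.items()
        )
    }
    # Highest decayed support first; ties broken by the sorted item names.
    ranked = sorted(closed.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    return ranked[:k]


if __name__ == "__main__":
    windows = [
        [["a", "b"], ["a", "b", "c"]],  # older basic window
        [["a", "c"], ["a"]],            # newest basic window
    ]
    for itemset, support in top_k_closed(damped_supports(windows), k=2):
        print(sorted(itemset), support)
```

Because the ranking takes the k best closed itemsets directly, the caller supplies only k and never a minimum support threshold, which mirrors the motivation stated in the abstract. A real stream miner would update supports incrementally as basic windows arrive and expire instead of recomputing them from scratch.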
Keywords/Search Tags:data mining, datastream, closed frequent itemset, top-K