Research On Association Rules Mining Precision Over Data Stream

Posted on:2012-02-20

Degree:Master

Type:Thesis

Country:China

Candidate:G H Hao

Full Text:PDF

GTID:2218330338465397

Subject:E-commerce and information technology

Abstract/Summary:

In the digital age, with the development of telecommunications and the World Wide Web, the volume of data is growing extremely fast, and data streams have emerged as a result. Discovering useful information and knowledge in such data, like searching for treasure in a vast ocean, is a challenge we face. Mining frequent patterns in data streams is a task that has emerged in recent years. It is meaningful for industrial production and daily life, and it can be widely applied in telecommunications, facilities maintenance, securities trading, and other fields.

Data mining practitioners and researchers have put great effort into data stream mining and have proposed many new designs for mining procedures and algorithms. However, most of this research focuses only on the mining process and neglects the mining result. The aim of data mining is precise, credible, and useful information, yet, contrary to that expectation, the result of association rule mining over a data stream can only be approximate. The precision of the mining result should therefore be a key parameter of association rule mining.

This thesis presents a new method for mining frequent patterns in data streams, together with algorithms that ensure a precise result. It modifies the details of the mining method in three parts: data acquisition, data storage, and knowledge discovery, and it works to ensure the precision of the mining result in each of them.

First, sliding time windows divide the data stream into itemsets, drop the items that appear more than twice, and then sort each itemset according to the left-to-right order of the first-level child nodes in the FP-Atree. After that, the itemsets can be treated as transactions.

Second, data storage consists of the storage structure, the data update algorithm, and the computation of the maximum error. We propose a new data storage structure named FP-Atree. Unlike the FP-tree, it consists of a prefix tree only, without the head-node list and the head-node pointers. The data update algorithm divides the whole time span into time frames; after each frame, the nodes whose support is less than the maximum error are pruned.

Finally, the thesis proposes a polynomial strategy to estimate the value of the maximum error, and in Chapter 4 the minimum support threshold is modified. A proper value of the maximum error and the modified minimum support threshold are the two key parameters for increasing the precision of the mining result.
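
The storage and update steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's FP-Atree implementation: the names `Node`, `PrefixTree`, and `process_stream` are hypothetical, items are ordered lexicographically as a stand-in for the FP-Atree's first-level ordering, and the maximum error is passed in as a fixed number rather than estimated by the polynomial strategy. It only shows a plain prefix tree (no head-node list) whose low-support nodes are pruned at the end of every time frame.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One prefix-tree node: an item, its support count, and its children."""
    item: Optional[str] = None
    count: int = 0
    children: dict[str, Node] = field(default_factory=dict)


class PrefixTree:
    """A bare prefix tree, with no head-node list and no head-node pointers."""

    def __init__(self) -> None:
        self.root = Node()

    def insert(self, transaction: list[str]) -> None:
        # Assumption: lexicographic order replaces the FP-Atree item ordering.
        node = self.root
        for item in sorted(set(transaction)):
            child = node.children.get(item)
            if child is None:
                child = Node(item)
                node.children[item] = child
            child.count += 1
            node = child

    def prune(self, max_error: int) -> None:
        """Drop every node (and its subtree) whose count is below max_error."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            node.children = {
                item: child
                for item, child in node.children.items()
                if child.count >= max_error
            }
            stack.extend(node.children.values())

    def paths(self) -> dict[tuple[str, ...], int]:
        """Return each surviving prefix path with the count stored at its end."""
        result: dict[tuple[str, ...], int] = {}
        stack: list[tuple[Node, tuple[str, ...]]] = [(self.root, ())]
        while stack:
            node, prefix = stack.pop()
            for item, child in node.children.items():
                path = prefix + (item,)
                result[path] = child.count
                stack.append((child, path))
        return result


def process_stream(frames: list[list[list[str]]], max_error: int) -> PrefixTree:
    """Insert the transactions of each time frame, then prune after the frame."""
    tree = PrefixTree()
    for frame in frames:
        for transaction in frame:
            tree.insert(transaction)
        tree.prune(max_error)  # pruning happens once per time frame
    return tree
```

Because pruned counts are lost, the supports kept in such a tree are approximate, which is why the choice of the maximum error affects the precision of the result.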

Keywords/Search Tags:

Frequent Patterns, Maximum Error, Minimum Support Threshold, Mining Precision

Related items

1	Study On Frequent Pattern Mining Algorithms And Pruning Strategies
2	The Study Of Mining Algorithm Based On Weighted Multiple Minimum Supports
3	Research On Key Techniques Of Negative Frequent Patterns Mining Based On Multiple Minimum Supports
4	Research Frequent Pattern Mining Algorithm Based On Compact Pattern Tree And Multiple Minimum Support
5	The Techniques Research On Frequent Pattern Mining
6	Research On Mining And Querying Frequent Patterns Based On Simplified Frequent Pattern Tree
7	The Techniques Research On Frequent Pattern Mining
8	Research On Frequent-pattern Mining Technology And Its Application On Revenue Assurance Systems
9	Research On Techniques Of Mining Frequent XML Patterns
10	Research On Mining And Dynamic Maintenance Of Frequent Patterns