Research On Average High Utility Itemsets Mining Algorithm

Posted on:2014-01-25

Degree:Master

Type:Thesis

Country:China

Candidate:M Jiang

Full Text:PDF

GTID:2248330398950004

Subject:Computer application technology

Abstract/Summary:

Association rule mining, which provides users with rules derived from the internal correlations of data, is the most active field of data mining. Users can apply these rules to decision making, prediction, and other tasks in areas such as commerce, scientific research, and bioinformatics.

Traditional association rule mining aims to discover frequent itemsets, considering only the occurrence frequencies of items; rules are then generated from the frequent itemsets. However, the importance of individual items is not taken into account, so some infrequent but useful itemsets may go undiscovered. To resolve this problem, utility-based association rule mining was proposed. Utility is defined to measure the importance of individual items and can represent the preferences and interests of users.

Traditionally, the utility of an itemset increases with the length of the itemset. To eliminate the effect of length, average utility was proposed, defined as the total utility of an itemset divided by the length of that itemset. Existing methods for mining average high utility itemsets need to scan the dataset many times and generate a large number of candidate itemsets, which costs considerable time and space. This thesis aims to improve the efficiency of mining average high utility itemsets. The main work includes the following.

The merits and drawbacks of current average high utility itemset mining algorithms are analyzed, and a new algorithm called HAUI-Mine is proposed. HAUI-Mine scans the dataset twice to construct the HAUI-Tree and generates no candidate itemsets. During the mining process, conditional pattern trees are built recursively to generate average high utility itemsets. Experimental results show that on dense datasets or at lower thresholds, HAUI-Mine clearly outperforms HAUP-Mine.

An algorithm called ITR-Mine is proposed for mining average high utility itemsets from data streams. Algorithms for mining itemsets on traditional transactional datasets cannot be applied to data streams directly because of the characteristics of data streams. ITR-Mine incorporates a sliding window to form a method that scans the data only once and generates no candidate itemsets. When the window slides, the ITR-Tree can be modified easily and efficiently without being completely reconstructed.
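
The definition of average utility used in the abstract (total utility of an itemset divided by its length) can be illustrated with a minimal Python sketch. This is not HAUI-Mine or ITR-Mine: it is a naive enumeration of all itemsets, shown only to make the measure concrete. The toy transactions, the per-unit profit table, and all function names are assumptions made for illustration, as is the common convention that an item's utility in a transaction is its quantity times its unit profit.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"A": 1, "B": 2},
    {"A": 2, "C": 1},
    {"B": 1, "C": 3},
    {"A": 1, "B": 1, "C": 1},
]
# Hypothetical per-unit profit of each item (its "importance").
profit = {"A": 5, "B": 2, "C": 1}


def itemset_utility(itemset, transactions, profit):
    """Sum quantity * profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def average_utility(itemset, transactions, profit):
    """Total utility of the itemset divided by the number of items in it."""
    return itemset_utility(itemset, transactions, profit) / len(itemset)


def high_average_utility_itemsets(transactions, profit, min_avg_utility):
    """Naive enumeration (exponential in the number of items), for illustration only."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            au = average_utility(itemset, transactions, profit)
            if au >= min_avg_utility:
                result[itemset] = au
    return result


found = high_average_utility_itemsets(transactions, profit, min_avg_utility=8)
for itemset, au in found.items():
    print(itemset, au)
```

On this toy data, the itemset (A, B) has total utility 16 and average utility 8, so it meets a threshold of 8, while (A, B, C) has total utility 8 and average utility of about 2.67 and does not. The enumeration visits every candidate itemset, which is exactly the cost that tree-based methods such as those described in the abstract aim to avoid.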

Keywords/Search Tags:

Data Mining, Association Rules, Average High Utility Itemsets

Related items

1	Research On High Average-utility Itemsets Mining Algorithm
2	Research On High Average-utility Itemsets Mining Algorithm
3	High-utility Association Rule Mining
4	An Efficient Algorithm For Discovering High Utility Itemsets With Negative Item Values In Large Databases
5	Research On Improved High Utility Itemset Mining Algorithms
6	Research On High Utility Itemsets Mining Algorithm
7	Research Of High Utility Itemsets Mining Methods
8	Research On Frequent And High Utility Itemset Mining Algorithms In Association Rule Mining
9	Research On Frequent And High Utility Itemsets Mining Algorithm With Multiple Minimum Utility Thresholds
10	Research On Segmentation And High-Efficiency Itemsets For Data Flow