
Research on Frequent and High Utility Itemsets Mining Algorithm with Multiple Minimum Utility Thresholds

Posted on: 2019-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: R R Lv
GTID: 2428330572469121
Subject: Computer technology
Abstract/Summary:
High utility itemsets mining has been one of the most active topics in data mining in recent years. It addresses a limitation of traditional frequent itemsets mining, which considers only how often an itemset occurs and ignores item quantities and unit utilities. However, most high utility itemsets mining algorithms apply a single utility threshold and ignore the differences between items, which is inappropriate and unfair in practice. High utility itemsets mining with multiple minimum utility thresholds was introduced in response, and this thesis first aims to overcome the shortcomings of the existing multiple minimum utility threshold mining algorithms. In addition, frequent itemsets mining and high utility itemsets mining are both of great significance in data mining, yet most existing algorithms use either support constraints to mine frequent itemsets or utility constraints to mine high utility itemsets. Considering the two constraints separately has its own limitations: high-support itemsets may not be high-utility, and high-utility itemsets may not be high-support. To address these two problems, this thesis does the following work:

(1) It describes the research status of frequent itemsets mining and high utility itemsets mining, analyzes the existing algorithms for both tasks, and summarizes their advantages and disadvantages.

(2) It analyzes the existing multiple minimum utility threshold mining algorithms. The existing algorithm for mining high utility itemsets with multiple minimum utility thresholds (MHUI) often repeats calculations, and the itemsets it mines are not necessarily frequent. This thesis develops two new fast mining algorithms, SFMHUI and FMHUI. When calculating the minimum utility threshold of an itemset, the FMHUI algorithm reuses the previous calculation result, avoiding duplicate comparisons between items; in addition, it defines the EMMU-table, a table of the minimum utility thresholds of the extensions of items, to quickly calculate the minimum utility threshold of an extension, which improves efficiency. The SFMHUI algorithm adds support constraints to the FMHUI algorithm, so that the mined itemsets are both high-utility and frequent. Both algorithms combine four pruning properties to prune the search space and improve mining efficiency. Finally, the simulation results show that the FMHUI algorithm is more efficient than the latest multiple minimum utility threshold mining algorithm (the MHUI algorithm), and verify the efficiency and feasibility of the SFMHUI algorithm.

(3) In real life, many items fall into classes, and the existing high utility itemsets mining algorithms are not suitable for mining high utility itemsets from such databases. A single utility threshold ignores the differences between items, which is inappropriate and unfair. However, when a database contains many items but few classes of items, assigning a separate minimum utility threshold to every item, as multiple minimum utility threshold mining algorithms do, is also inappropriate. To solve this problem, this thesis proposes CMFHUI, a multiple minimum utility threshold mining algorithm based on similar items. The CMFHUI algorithm mines high utility itemsets by assigning a minimum utility threshold to each class and adds support constraints on this basis, so that the mined itemsets are both high-utility and frequent. An improved algorithm, CMFHUI+, is then proposed to further improve mining efficiency. Finally, the effectiveness and feasibility of the two algorithms are validated by simulation on a public database.

In general, this thesis combines the theories of frequent itemsets mining and high utility itemsets mining with multiple minimum utility thresholds and proposes improved algorithms, which are verified by simulation experiments.
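
The interplay of support, utility and per-item thresholds described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's FMHUI or SFMHUI implementation: the toy database, the unit utilities, the per-item thresholds and all function names are hypothetical, and the sketch assumes that the minimum utility threshold of an itemset is the smallest threshold among its items, a definition that is common in the multiple-threshold literature but is not stated in the abstract. The EMMU-table and the four pruning properties are not reproduced; the search below is brute force.

```python
# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"b": 3},
]
unit_utility = {"a": 1, "b": 2, "c": 10}     # assumed unit profit per item
item_threshold = {"a": 5, "b": 8, "c": 15}   # assumed per-item minimum utility thresholds


def support(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def utility(itemset):
    """Sum of quantity * unit utility over the transactions containing the itemset."""
    return sum(
        sum(t[i] * unit_utility[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def extend_threshold(prefix_threshold, new_item):
    """Threshold of prefix + {new_item}, reusing the prefix's threshold.

    Assumption: an itemset's threshold is the minimum of its items' thresholds,
    so one comparison per extension suffices instead of rescanning all items.
    """
    return min(prefix_threshold, item_threshold[new_item])


def mine(min_support):
    """Enumerate all itemsets depth-first (no pruning) and keep those that are
    both frequent and high-utility under their own threshold."""
    items = sorted(unit_utility)
    results = {}

    def grow(prefix, prefix_threshold, start):
        for k in range(start, len(items)):
            itemset = prefix + (items[k],)
            threshold = extend_threshold(prefix_threshold, items[k])
            sup, util = support(itemset), utility(itemset)
            if sup >= min_support and util >= threshold:
                results[itemset] = (sup, util, threshold)
            grow(itemset, threshold, k + 1)

    grow((), float("inf"), 0)
    return results


if __name__ == "__main__":
    for itemset, (sup, util, threshold) in mine(min_support=2).items():
        print(itemset, "support:", sup, "utility:", util, "threshold:", threshold)
```

With these toy values, `mine(2)` keeps {b}, {c}, {a, b} and {a, c}. The itemset {a} is frequent (support 3) but its utility of 4 is below its threshold of 5, while {b, c} meets its utility threshold (12 against 8) but occurs in only one transaction, which shows why the two constraints select different itemsets when applied separately.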
Keywords/Search Tags:data mining, high utility itemsets mining, frequent itemsets mining, multiple minimum utility thresholds, frequent and high utility itemsets, algorithm