
Research On Improved High Utility Itemset Mining Algorithms

Posted on: 2020-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: S L Yin
Full Text: PDF
GTID: 2428330599460561
Subject: Engineering
Abstract/Summary:
In recent years, the extensive application of database technology has led companies, governments and scientific organizations to accumulate large amounts of data. Analyzing and understanding these data in order to support future decision-making has therefore become very meaningful. Data mining is a discipline for analyzing data and mining the potential value and interesting patterns it contains. High utility itemset mining is one of the data mining techniques that can discover relationships within data. In the field of business services, a utility value represents the profit of a combination of commodities, and high utility itemset mining finds the itemsets with larger utility values in the data. For this reason, high utility itemset mining has attracted increasing attention and research in recent years.

Firstly, to address the low efficiency of traditional mining algorithms, a new high utility itemset mining algorithm named TreeHUIMiner is proposed, which combines the structures of the prefix tree and the utility list. Most traditional high utility itemset mining algorithms have complex pruning strategies, so only a limited number of high utility itemsets can be mined in a given time. In the new algorithm, the prefix tree guides the mining of candidate high utility itemsets, and the utility list is used to calculate the utility values of these candidates. From these utility values, the final high utility itemsets are obtained. The new algorithm has no complex pruning strategy, so it can mine more high utility itemsets in a given time.

Then, to address the problem that high utility itemset mining algorithms based on bio-inspired heuristics can mine only a small number of high utility itemsets, a new algorithm based on improved particle swarm optimization, named HUIM-MBPSO, is proposed. The new algorithm changes the way the population's optimal values are generated during particle swarm optimization: the optimal values of the next generation are chosen by roulette selection, with a certain probability, from the high utility itemsets of the current generation. This change increases the diversity of the population and enables the new algorithm to mine more high utility itemsets.

Finally, two groups of experiments are carried out on each of the two new algorithms. The results of the first two groups show that TreeHUIMiner is more efficient than the traditional algorithms, which verifies its validity in mining high utility itemsets. The results of the last two groups show that HUIM-MBPSO can mine more high utility itemsets within a specified number of iterations than the traditional bio-inspired heuristic algorithms, which verifies its effectiveness in mining high utility itemsets.
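The two ideas named in the abstract, computing the utility of an itemset and choosing among high utility itemsets by roulette selection, can be illustrated with a minimal Python sketch. This is not the thesis's TreeHUIMiner or HUIM-MBPSO implementation: the transaction data, the profit table, the candidate list, the threshold, and the function names `itemset_utility` and `roulette_select` are all hypothetical. The sketch also assumes the usual definition of utility as quantity times unit profit, and it scans transactions directly instead of building a prefix tree or utility lists.

```python
import random
from typing import Dict, FrozenSet, Iterable, List, Sequence, Tuple

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
]

# Hypothetical unit profit of each item.
PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1, "d": 10}


def itemset_utility(
    itemset: Iterable[str],
    transactions: Sequence[Dict[str, int]],
    profit: Dict[str, int],
) -> int:
    """Sum quantity * profit of the itemset over every transaction that
    contains all of its items (assumed standard utility definition)."""
    items = list(itemset)
    total = 0
    for tx in transactions:
        if all(item in tx for item in items):
            total += sum(tx[item] * profit[item] for item in items)
    return total


def roulette_select(
    candidates: Sequence[Tuple[FrozenSet[str], int]],
    rng: random.Random,
) -> FrozenSet[str]:
    """Pick one itemset with probability proportional to its utility."""
    total = sum(utility for _, utility in candidates)
    threshold = rng.uniform(0, total)
    running = 0.0
    for itemset, utility in candidates:
        running += utility
        if running >= threshold:
            return itemset
    return candidates[-1][0]  # guard against floating-point rounding


# Hypothetical candidates and minimum utility threshold.
CANDIDATES = [frozenset(s) for s in ({"a"}, {"d"}, {"a", "c"}, {"b", "c"}, {"c", "d"})]
MIN_UTIL = 12

# Keep only candidates whose utility reaches the threshold.
HIGH_UTILITY = [
    (c, itemset_utility(c, TRANSACTIONS, PROFIT))
    for c in CANDIDATES
    if itemset_utility(c, TRANSACTIONS, PROFIT) >= MIN_UTIL
]

# Choose a guiding itemset for the next generation by roulette selection.
next_guide = roulette_select(HIGH_UTILITY, random.Random(0))
```

Because the selection is proportional to utility and not always the single best itemset, lower-utility itemsets still have a chance of guiding the next generation, which is the diversity effect the abstract attributes to roulette selection.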
Keywords/Search Tags: Data mining, Association rules, High utility itemsets, Prefix tree, Particle swarm optimization