
A Study Of Algorithm For Mining Frequent Itemset Based On Profit Constraint

Posted on: 2020-04-20, Degree: Master, Type: Thesis
Country: China, Candidate: L F Wu, Full Text: PDF
GTID: 2428330623463604, Subject: Computer technology
Abstract/Summary:
With the development of database and computer technology, both are now used in many domains. Enterprise resource planning systems, which support enterprise management and operation, generate massive amounts of data, far more than people can process directly. Against this background, the concept of data mining was proposed. Association mining is an important branch of data mining; its purpose is to discover the associations between items, and the patterns satisfying certain criteria, hidden in massive data. Since the concept of association rule mining was introduced, the technique has developed considerably. The classical Apriori algorithm has been studied extensively, many improved algorithms have been developed, and the technique has been applied in many domains.

Product profit is one of the key performance indicators that enterprises pay close attention to. However, the classical association rule mining algorithm Apriori and its improved variants mainly concentrate on mining Boolean association rules. During mining, these algorithms focus on the frequency of items and do not consider the profit and quantity of items in transactions. On the one hand, a large number of associations are mined, which makes it hard for users to find actionable knowledge. On the other hand, associations mined under the support framework may not be the high-profit associations that users are interested in. This thesis proves that the downward closure property does not hold for itemset mining under a profit constraint, so Apriori and its improved algorithms have certain drawbacks for this task.

This thesis defines the relevant concepts for the task of mining frequent itemsets under a profit constraint. In addition, an evaluation method and measure for the mined frequent itemsets are designed and validated. After a close study of the characteristics of profit-based frequent itemsets, a pruning theory, the expected transaction count, is defined and proved. It is also proved that expected frequent itemsets satisfy the downward closure property, which makes the expected transaction count a good method for pruning candidate itemsets. Based on these concepts and the pruning theory, a novel algorithm is designed for mining frequent itemsets under a profit constraint. The mining process has two main steps: first, mine frequent itemsets, pruning with the expected transaction count; then validate the mined frequent itemsets with profit effectiveness.

To improve performance, a parallel algorithm is designed that splits the mining into tasks while guaranteeing the completeness of the mining result. The parallel algorithm is implemented on the SAP parallel framework. The non-parallel algorithm and the Apriori algorithm are also implemented for testing and comparison. Several groups of experiments are performed and the results are analyzed. The results show that, compared with Apriori, the profit-constrained frequent itemset mining algorithm improves the average profit of the mined itemsets and effectively reduces the number of mined itemsets. The results also show that the parallel algorithm improves mining performance consistently.
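
The claim that downward closure fails under a profit constraint can be illustrated with a small sketch. The abstract does not give the thesis's exact profit definition, so the code below assumes a common utility-style measure: the profit of an itemset is the unit profit times quantity of its items, summed over the transactions that contain the whole itemset. The data, the names `transactions`, `unit_profit`, `support`, and `total_profit`, and the threshold `MIN_PROFIT` are all hypothetical, and the thesis's expected transaction count is not reproduced here.

```python
# Hypothetical toy data: each transaction maps item -> quantity purchased.
transactions = [
    {"A": 1, "B": 2},
    {"A": 1, "B": 3},
    {"A": 1},
    {"A": 1},
]

# Hypothetical per-unit profit of each item.
unit_profit = {"A": 1.0, "B": 5.0}

# Hypothetical profit threshold an itemset must reach.
MIN_PROFIT = 20.0


def contains(transaction, itemset):
    """True if the transaction holds every item of the itemset."""
    return all(item in transaction for item in itemset)


def support(itemset, transactions):
    """Number of transactions that contain the whole itemset."""
    return sum(1 for t in transactions if contains(t, itemset))


def total_profit(itemset, transactions, unit_profit):
    """Assumed profit measure (not necessarily the thesis's definition):
    profit of the itemset's items, summed over supporting transactions."""
    return sum(
        sum(unit_profit[item] * t[item] for item in itemset)
        for t in transactions
        if contains(t, itemset)
    )


subset, superset = ("A",), ("A", "B")

# Support never grows when items are added, so support-based pruning is safe.
assert support(subset, transactions) >= support(superset, transactions)

# Profit can grow when items are added: the superset meets the threshold
# while its subset does not, so pruning the subset would lose the superset.
subset_ok = total_profit(subset, transactions, unit_profit) >= MIN_PROFIT
superset_ok = total_profit(superset, transactions, unit_profit) >= MIN_PROFIT
print(subset_ok, superset_ok)
```

Under these assumptions the subset fails the profit threshold while its superset passes it, which is why an Apriori-style prune on profit alone would be unsafe and why a separate measure that does satisfy downward closure is needed for pruning.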
Keywords/Search Tags: Profit Constraint, Frequent Itemset, Parallel Mining