Improvement And Application Research Of High Utility Pattern Mining Algorithm Over Data Stream

Posted on: 2019-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Xie
GTID: 2428330596466429
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of data streams, a growing body of research targets them, including high utility pattern mining. Most high utility pattern mining algorithms are built on two data structures: a global head table and a utility tree. When these algorithms are applied to a data stream, their global head tables become redundant, which hurts efficiency. In addition, most high utility pattern mining algorithms handle only non-negative utilities. When negative utilities are present, the primary itemset utility-estimation method becomes invalid and the algorithms become inefficient, which limits their applicability. To mine high utility patterns over data streams efficiently and to make high utility pattern mining more widely applicable, this thesis studies high utility pattern mining algorithms over data streams. The main work is summarized as follows:

(1) To address the redundancy of the global head table, this thesis proposes a strategy that compresses the global head table effectively. Based on this strategy, it also proposes a high utility pattern mining algorithm over data streams called IHUM-UT (Improved High Utility pattern Mining based on Utility Tree). Removing irrelevant items from the global head table reduces the time spent traversing it, which improves the efficiency of IHUM-UT. The experimental results show that IHUM-UT is efficient while producing the same mining results.

(2) When negative utilities are present, most high utility pattern mining algorithms are inefficient because the primary itemset utility-estimation method is invalid. To improve their efficiency, this thesis proposes a new itemset utility-estimation method called FEU (Forward Estimated Utility), which applies to both non-negative and negative utilities. Like the traditional transaction-weighted utility (TWU) method, FEU has the downward closure property. In addition, the utility that FEU estimates for an itemset is closer to the itemset's real utility, so the method excludes unpromising itemsets efficiently. The experimental results show that, under the same parameters, FEU excludes more unpromising itemsets than the transaction-weighted utility method.

(3) To show how high utility pattern mining works in an application, this thesis designs and implements a real-time prototype system for guiding commodity sales. The system mines high utility patterns by using IHUM-UT and FEU to analyze a stream of shopping lists. For each commodity, it obtains the high utility pattern support count within a statistical cycle and uses that count to reflect the commodity's sales status. Based on the sales status of each commodity, the system presents sales advice to merchants in real time, with the goal of maximizing profit.
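
The role of the traditional transaction-weighted utility (TWU) estimate, and why it stops being valid once negative utilities appear, can be illustrated with a small sketch. This is not the FEU method of the thesis (the abstract does not give its definition); it only shows the standard TWU baseline. The toy databases, item names, and function names below are hypothetical and chosen purely for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its utility in that
# transaction (for example, quantity * unit profit). All values are made up.
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 2},
]

# Same idea, but item "b" is sold at a loss (negative utility).
NEGATIVE_TRANSACTIONS = [
    {"a": 5, "b": -3},
]


def transaction_utility(tx):
    """Total utility of one transaction."""
    return sum(tx.values())


def utility(itemset, transactions):
    """Real utility: sum of the itemset's own item utilities over the
    transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(
        sum(tx[i] for i in items)
        for tx in transactions
        if items.issubset(tx)
    )


def twu(itemset, transactions):
    """Transaction-weighted utility: sum of the full transaction utilities
    over the transactions that contain the itemset."""
    items = set(itemset)
    return sum(
        transaction_utility(tx) for tx in transactions if items.issubset(tx)
    )


def twu_is_upper_bound(transactions):
    """Check that TWU(X) >= utility(X) for every itemset X in the database."""
    all_items = sorted(set().union(*transactions))
    return all(
        twu(x, transactions) >= utility(x, transactions)
        for size in range(1, len(all_items) + 1)
        for x in combinations(all_items, size)
    )


if __name__ == "__main__":
    print(utility(("a",), TRANSACTIONS), twu(("a",), TRANSACTIONS))
    print(twu_is_upper_bound(TRANSACTIONS))
    print(twu_is_upper_bound(NEGATIVE_TRANSACTIONS))
```

With non-negative utilities, TWU never falls below the real utility and never grows when items are added to an itemset, so an itemset whose TWU is below the threshold can be pruned together with all of its supersets. In the negative-utility database, the real utility of {a} is 5 while its TWU is only 2, so pruning on TWU could discard a genuinely high utility itemset. This is the kind of failure that motivates an estimate such as FEU.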
Keywords/Search Tags: Data Stream, High Utility Pattern, Head Table Compression, Itemset Utility-Estimation Method