Improvement Research Of Dynamic Mining High Utility Itemsets Algorithm

Posted on: 2017-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: N Ge
Full Text: PDF
GTID: 2308330485989384
Subject: Software engineering
Abstract/Summary:
With the widespread application of database and network technology, the volume of stored data has grown sharply, and the resulting data is no longer static but changes as it gradually accumulates. Unlike traditional utility mining, the transaction set changes over time and newly updated data matters more than older data, so utility mining faces higher demands and new challenges: how to account for these changes properly and quickly mine genuinely useful information and knowledge. This thesis studies the problem of dynamic utility mining. Traditional incremental updating algorithms iterate once per level, scan the database multiple times, and generate huge candidate sets, so their overhead is large; they also handle only inserted transactions and cannot effectively deal with deleted or modified transactions. To improve the performance of dynamic high utility itemset mining, this thesis carries out the following research: (1) It analyzes the strengths and weaknesses of current high utility itemset mining algorithms. Because existing incremental utility mining cannot effectively handle deletion and modification, the proposed approach builds on the pre-large concept: it first computes the transaction-weighted utility of each item in the changed transaction set, then divides the items of the original transaction set into large, pre-large, and small items, and the items of the changed transaction set into negative, zero, and positive items. The theory of the utility thresholds is then derived. (2) It proposes the PreHU-tree algorithm, based on the concept of pre-large itemsets, to mine high utility itemsets from the changed transaction set. The algorithm applies a transaction utility safety threshold and pre-large itemsets to reduce the number of database rescans.
In addition, on the basis of the prelarge-tree structure, it adds a linked list of prefix itemsets that records the sum of their transaction-weighted utility, forming a new data structure called the PreHU-tree. The PreHU-tree structure eliminates many useless candidate itemsets that would otherwise be generated level by level from 1-itemsets to n-itemsets. The algorithm combines the support of the itemsets in the prefix itemset linked list with the external utility of items to mine high utility itemsets in a dynamic database, which avoids generating non-high-utility itemsets, and it prunes branches using the utility threshold, which effectively reduces the search space. (3) The PreHU-tree algorithm searches the updated PreHU-tree from bottom to top, looking up the transaction-weighted utility prefix itemset linked list for entries that contain subsets of the prefix itemsets. This is very efficient when mining short transactions, but when transactions contain many items it generates a large number of subsets. The thesis therefore proposes a hybrid algorithm for dynamic high utility itemset mining called Mix-PreHU. The PreHU-tree algorithm is used to mine short transactions, while long transactions are mined with an optimized method that generates subsets by itemset intersection; the transaction-weighted utility prefix itemset linked list effectively avoids generating the same subset repeatedly. Experiments show that this effectively improves the overall efficiency of the algorithm when dynamically mining long transactions. Finally, based on its conclusions, the thesis analyzes the shortcomings of the proposed methods and proposes a clear direction for further research.
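The item partitioning described in point (1) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the representation of a transaction as an item-to-quantity dictionary, the use of the ratio of TWU to total utility as the test against an upper and a lower threshold, and the reading of negative/zero/positive as the sign of the net TWU change are all assumptions made for illustration. The safety threshold and the PreHU-tree itself are not shown.

```python
from collections import defaultdict


def transaction_utility(transaction, unit_profit):
    """Utility of one transaction: sum of quantity * unit profit (external utility)."""
    return sum(qty * unit_profit[item] for item, qty in transaction.items())


def item_twu(transactions, unit_profit):
    """Transaction-weighted utility (TWU) of each single item:
    the total utility of all transactions that contain the item."""
    twu = defaultdict(int)
    for transaction in transactions:
        tu = transaction_utility(transaction, unit_profit)
        for item in transaction:
            twu[item] += tu
    return dict(twu)


def classify_original(twu, total_utility, upper, lower):
    """Label items of the original transaction set (assumed rule):
    large if TWU ratio >= upper, pre-large if lower <= ratio < upper, else small."""
    labels = {}
    for item, value in twu.items():
        ratio = value / total_utility
        if ratio >= upper:
            labels[item] = "large"
        elif ratio >= lower:
            labels[item] = "pre-large"
        else:
            labels[item] = "small"
    return labels


def classify_change(twu_inserted, twu_deleted):
    """Label items of the changed transaction set by the sign of their
    net TWU change (assumed reading of negative / zero / positive items)."""
    labels = {}
    for item in set(twu_inserted) | set(twu_deleted):
        delta = twu_inserted.get(item, 0) - twu_deleted.get(item, 0)
        if delta > 0:
            labels[item] = "positive"
        elif delta < 0:
            labels[item] = "negative"
        else:
            labels[item] = "zero"
    return labels


if __name__ == "__main__":
    # Hypothetical toy data: unit profits and three original transactions.
    unit_profit = {"a": 5, "b": 2, "c": 1}
    original = [{"a": 1, "b": 2}, {"b": 1, "c": 3}, {"a": 2, "c": 1}]
    total = sum(transaction_utility(t, unit_profit) for t in original)
    twu = item_twu(original, unit_profit)
    print(classify_original(twu, total, upper=0.7, lower=0.6))

    # Hypothetical update: two inserted transactions and one deleted transaction.
    inserted = [{"c": 5}, {"a": 1}]
    deleted = [{"b": 1, "c": 3}]
    print(classify_change(item_twu(inserted, unit_profit),
                          item_twu(deleted, unit_profit)))
```

Combining the two labelings is what lets a pre-large approach skip a rescan in many cases: for example, an item that is large in the original data and positive in the update stays large without revisiting the original database.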
Keywords/Search Tags: utility mining, pre-large item, dynamic mining, PreHU-tree, Mix-PreHU