Research On High Utility Pattern Mining Method For Big Data

Posted on:2017-03-17

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Zhang

Full Text:PDF

GTID:2308330482990773

Subject:Computer technology

Abstract/Summary:

As every sector places growing emphasis on data and information technology, the data being generated is becoming more comprehensive and its volume is growing rapidly. Industry also requires that newly generated data be mined and analyzed in a timely manner, which makes high utility pattern mining techniques increasingly important. Because big data is massive, real-time, and dynamic, mining algorithms must achieve higher time and space efficiency. Although pattern mining technology has made some progress, the efficiency of mining algorithms remains one of the main research focuses in the field of data mining.

This thesis studies high utility pattern mining methods. Based on the characteristics of big data and the problems that big data mining algorithms typically face, it presents a big-data-oriented high utility pattern mining algorithm. The algorithm uses a sliding window to maintain the portion of the data stream currently of interest, and defines a data structure and a table structure to hold the data in the current window, so that the high utility itemsets of the current window can be mined from these structures without losing the integrity of data that affects the next window.

High utility itemset mining addresses the limitations of frequent itemset mining by introducing measures of interestingness that reflect the significance of an itemset beyond its frequency of occurrence. Among such algorithms, level-wise candidate generation-and-test approaches suffer from an immense candidate pool and require several database scans, while methods based on pattern growth tend to consume large amounts of memory to store conditional trees. We propose an efficient algorithm, called Index High Utility Itemsets Mine (IHUI-Mine), for mining high utility itemsets.
The subsume index, which has been employed to mine frequent itemsets, is extended in IHUI-Mine to the discovery of high utility itemsets. In addition to the enumeration and search strategies inherited from the subsume index, we introduce a new property to specifically accelerate the computation of transaction-weighted utilization for high utility itemsets. Furthermore, given that bitmaps are used for database representation, the real utility of candidates can be verified from the recorded transactions rather than by resorting to the entire database. The computational complexity of IHUI-Mine is analyzed, and tests conducted on publicly available synthetic and real datasets further demonstrate that the proposed algorithm outperforms existing state-of-the-art algorithms.
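
The notions of transaction-weighted utilization (TWU) and bitmap-based verification mentioned in the abstract can be illustrated with a small Python sketch. This is not IHUI-Mine: it does not use the subsume index or the new property the thesis introduces, and it enumerates candidates by brute force. The toy database, the profit table, and all function names (`build_bitmaps`, `covering_tids`, `twu`, `utility`, `mine`) are assumptions made up for illustration. It only shows the standard idea that TWU is an upper bound on utility, so candidates with low TWU can be skipped, and that a per-item bitmap lets the real utility be computed from just the transactions that contain the candidate.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def transaction_utility(t):
    """Total utility of one transaction (sum of quantity * unit profit)."""
    return sum(qty * profit[item] for item, qty in t.items())


def build_bitmaps(db):
    """Map each item to an int whose bit k is set if transaction k contains it."""
    bitmaps = {}
    for k, t in enumerate(db):
        for item in t:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << k)
    return bitmaps


def covering_tids(itemset, bitmaps, n):
    """Indices of transactions containing every item, via bitwise AND."""
    bits = (1 << n) - 1
    for item in itemset:
        bits &= bitmaps.get(item, 0)
    return [k for k in range(n) if (bits >> k) & 1]


def twu(itemset, db, bitmaps):
    """Transaction-weighted utilization: an upper bound on the itemset's utility."""
    tids = covering_tids(itemset, bitmaps, len(db))
    return sum(transaction_utility(db[k]) for k in tids)


def utility(itemset, db, bitmaps):
    """Real utility, computed only from the transactions recorded in the bitmaps."""
    tids = covering_tids(itemset, bitmaps, len(db))
    return sum(db[k][item] * profit[item] for k in tids for item in itemset)


def mine(db, min_util):
    """Brute-force enumeration with TWU pruning (illustrative only)."""
    bitmaps = build_bitmaps(db)
    # Items whose TWU is below the threshold cannot appear in any result.
    items = sorted(i for i in bitmaps if twu((i,), db, bitmaps) >= min_util)
    result = {}
    for size in range(1, len(items) + 1):
        for cand in combinations(items, size):
            if twu(cand, db, bitmaps) < min_util:
                continue  # pruned: real utility cannot reach min_util
            u = utility(cand, db, bitmaps)
            if u >= min_util:
                result[cand] = u
    return result


high_utility = mine(transactions, min_util=15)
```

With this toy data and a minimum utility of 15, `high_utility` contains `("a",)`, `("a", "b")`, `("a", "c")`, and `("b", "d")`. The itemset `("a", "d")` is skipped without computing its real utility, because its TWU of 11 is already below the threshold.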

Keywords/Search Tags:

Big Data, Hadoop, MapReduce Framework, Frequent Pattern Mining, High Utility Itemset

Related items

1	Research On Novel Methods In Utility Pattern Mining
2	Multi-Relational Frequent Pattern Mining Algorithm And Its Application Research
3	Research Of High Frequent-utility Itemset Mining
4	Research On Frequent And Closed High Utility Itemset Mining Algorithm Based On Spark
5	Research On Privacy Preserving Approaches For Frequent Itemset Mining And High-Utility Itemset Mining
6	Study On Mining Closed Frequent Itemset Based On Hadoop
7	Research On Frequent And High-utility Itemset Mining Algorithms Over Data Stream
8	Research On Key Technologies Of High Utility Itemset Mining
9	Research On Algorithms And Their Performance For Frequent Itemset And High Utility Itemset Mining
10	Research On Frequent-High Utility Itemset Mining Based On Multi-Objective Evolutionary Computation