Research On Association Rules Mining Algorithm In Big Data Background

Posted on: 2019-03-26 | Degree: Master | Type: Thesis
Country: China | Candidate: G Q Deng | Full Text: PDF
GTID: 2428330572995098 | Subject: Software engineering
Abstract/Summary:
As society enters the information age, massive volumes of information and data have become a defining feature of the era. Mining the information hidden in that data is therefore increasingly important. Association rule mining helps to determine the relationships between objects in a database and plays an important role in many decision systems. However, big data is still not well understood, and neither is association rule mining, so the area offers broad research prospects both in China and abroad. This thesis studies association rule mining algorithms. The main work is as follows.

First, the thesis summarizes and classifies existing association rule mining algorithms and the research on improving them, analyzes the open problems of existing association rule mining, and outlines directions for future research.

Second, a weighted association rule mining algorithm based on matrix compression is proposed. The database is scanned once and converted into a 0-1 matrix, which avoids repeated scans of the database. The matrix is then compressed according to the relevant properties, which reduces the amount of computation the algorithm performs. Because items differ in importance, a weighted approach is adopted. The algorithm can also find high-order frequent itemsets directly during mining. Experimental results show that the algorithm effectively improves the efficiency of association rule mining.

Third, a parallel association rule mining algorithm based on heuristic search is proposed. Bitmap ranking is used to improve the efficiency of searching for maximal frequent itemsets. A greedy mechanism is introduced to ensure the optimality of each phase of the algorithm. Heuristic search is used to ensure that maximal frequent itemsets are mined efficiently and reliably. The algorithm is executed in parallel on the Spark platform to improve its efficiency further. Experiments show that this algorithm achieves better efficiency and accuracy in mining association rules.
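The matrix-based idea in the second contribution can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the function names (`build_matrix`, `compress`, `frequent_itemsets`), the two compression rules (drop item columns below the minimum count, then drop transaction rows with fewer than k remaining items), and the brute-force enumeration of k-itemsets are all assumptions chosen for illustration. Item weights are left out because the abstract does not state the weighting scheme.

```python
from itertools import combinations


def build_matrix(transactions):
    """Single pass over the data: rows are transactions, columns are items."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


def compress(items, matrix, min_count, k):
    """Assumed compression rules (illustrative, not taken from the thesis):
    1. drop item columns that occur in fewer than min_count transactions;
    2. drop transaction rows left with fewer than k items, since such a
       row cannot contain any k-itemset.
    """
    col_counts = [sum(row[j] for row in matrix) for j in range(len(items))]
    keep = [j for j, count in enumerate(col_counts) if count >= min_count]
    kept_items = [items[j] for j in keep]
    rows = [[row[j] for j in keep] for row in matrix]
    rows = [row for row in rows if sum(row) >= k]
    return kept_items, rows


def frequent_itemsets(transactions, min_count, k):
    """Return {k-itemset: support count} computed on the compressed matrix."""
    items, matrix = build_matrix(transactions)
    items, matrix = compress(items, matrix, min_count, k)
    result = {}
    for combo in combinations(range(len(items)), k):
        # A row supports the itemset when every chosen column holds a 1.
        count = sum(all(row[j] for j in combo) for row in matrix)
        if count >= min_count:
            result[tuple(items[j] for j in combo)] = count
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c"}]
    print(frequent_itemsets(data, min_count=2, k=2))
    print(frequent_itemsets(data, min_count=2, k=3))
```

In this toy data set, item `d` occurs only once, so its column is removed, and the transaction `{"b", "d"}` then has a single remaining item and is removed as well. Passing `k=3` requests 3-itemsets directly from the compressed matrix without first generating 2-itemsets, which loosely mirrors the abstract's remark about finding high-order frequent itemsets directly.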
Keywords/Search Tags:Big Data, Association Rule Mining, Matrix Compression, Parallelization, Maximum Frequent Itemsets