Research And Application Of Multidimensional Association Rules Mining Algorithm Based On Hadoop

Posted on: 2020-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: P L Yuan
GTID: 2428330578951971
Subject: Computer application technology
Abstract/Summary:
Association rule mining is an important technique for uncovering hidden patterns and knowledge in massive data sets, and it is one of the active research areas of data mining. Research on association rules has since extended to multi-dimensional association rules, which have developed rapidly and are widely used in the commercial field. This thesis designs a multi-dimensional association rule mining algorithm based on the Hadoop distributed model. Based on an analysis of various association rule mining algorithms and on the properties of the traditional Apriori algorithm, an IApriori algorithm suitable for multi-dimensional data is designed using a pruning strategy, which improves time performance. On the Hadoop distributed platform, a parallel version of the multi-dimensional association rule mining algorithm is designed and implemented; it is called the Improved Parallel Apriori (IPApriori) algorithm. Parallelizing the algorithm flow and organizational structure improves execution efficiency and reduces the I/O load of the system.

The improved multi-dimensional association rule algorithm is applied to correlation analysis for mobile phone user behavior prediction, and some of the main factors affecting the behavior of mobile phone users are analyzed. First, the data is cleaned, a multi-dimensional data model is built, and the dimensions of the experimental data are divided and preprocessed. The feature attributes span multiple dimensions: gender, age, province, city, area, time, mobile brand, and APP type. The parallel multi-dimensional association rule mining algorithm is then applied to mobile phone user behavior analysis, and the results are analyzed to discover possible associations between mobile phone user behavior and the attributes of the age, gender, time, location, and mobile phone brand dimensions.

Finally, the IPApriori algorithm, the IApriori algorithm, and the Hadoop-based DG-Apriori algorithm are compared for time efficiency under different transaction counts and minimum support values. The shorter the execution time, the higher the efficiency of the algorithm. The experimental results show that the IPApriori algorithm has the shortest execution time of the three algorithms across the tested transaction counts and minimum support values.
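
The general idea of level-wise mining with pruning and partitioned support counting can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's IApriori or IPApriori implementation: the function names, the sample records, and the rule that a candidate may hold at most one value per dimension are all hypothetical, and a plain loop over in-memory partitions stands in for Hadoop map and reduce tasks.

```python
from collections import Counter
from itertools import combinations


def count_partition(partition, candidates):
    """Map-like step: count candidate itemsets within one data partition."""
    counts = Counter()
    for record in partition:
        for cand in candidates:
            if cand <= record:  # every (dimension, value) pair is in the record
                counts[cand] += 1
    return counts


def count_support(partitions, candidates):
    """Reduce-like step: sum the partial counts from all partitions."""
    total = Counter()
    for part in partitions:
        total.update(count_partition(part, candidates))
    return total


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune."""
    candidates = set()
    for a, b in combinations(frequent, 2):
        union = a | b
        if len(union) != k:
            continue
        # Assumed multi-dimensional prune: one value per dimension.
        if len({dim for dim, _ in union}) != k:
            continue
        # Classic Apriori prune: every (k-1)-subset must be frequent.
        if all(frozenset(s) in frequent for s in combinations(union, k - 1)):
            candidates.add(union)
    return candidates


def mine_frequent_itemsets(records, min_support, n_partitions=2):
    """Return {itemset: support} for all itemsets meeting min_support."""
    data = [frozenset(r.items()) for r in records]
    partitions = [data[i::n_partitions] for i in range(n_partitions)]
    min_count = min_support * len(data)
    candidates = {frozenset([item]) for rec in data for item in rec}
    result = {}
    k = 1
    while candidates:
        counts = count_support(partitions, candidates)
        frequent = {c for c in candidates if counts[c] >= min_count}
        result.update({c: counts[c] / len(data) for c in frequent})
        k += 1
        candidates = generate_candidates(frequent, k)
    return result


# Hypothetical records; dimension names loosely follow the abstract.
records = [
    {"gender": "F", "age": "18-25", "app": "video"},
    {"gender": "F", "age": "18-25", "app": "video"},
    {"gender": "M", "age": "18-25", "app": "game"},
    {"gender": "F", "age": "26-35", "app": "shopping"},
]
result = mine_frequent_itemsets(records, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), support)
```

Each record is stored as a set of (dimension, value) pairs, so a candidate that repeats a dimension can never match a record and is discarded before any counting takes place. Counting per partition and then summing mirrors the shape of a distributed job, where each pass over the data costs one round of map and reduce work.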
Keywords/Search Tags: multi-dimensional association rule, concurrent processing, Apriori, mobile user behavior analysis