
Research And Application Of Improved MGFP-growth Algorithm Based On FP-growth Association Rules

Posted on: 2021-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: K Wei
GTID: 2428330611496828
Subject: Computer technology
Abstract/Summary:
With the continuous progress of information technology and the explosive growth of data, people have become increasingly aware of the value of data, and finding useful information in large volumes of data has become particularly important. Association rules are an important branch of data mining. They can be used to discover relationships between different items in a database and to help companies make business decisions. This thesis focuses on association rules. First, it introduces the related concepts of data mining and association rules. It then analyzes two classic algorithms, Apriori and FP-growth, and compares their advantages and disadvantages in mining frequent patterns. Second, it proposes a new frequent pattern mining algorithm based on FP-growth, the MGFP-growth (Matrix and Group FP-growth) algorithm, which addresses two shortcomings of FP-growth. On the one hand, FP-growth needs to scan the database twice while constructing the FP-tree and iterates over the result set L multiple times, which reduces time efficiency. This thesis therefore proposes storing each transaction by column in a two-dimensional matrix. Each transaction is segmented and grouped, and the parenttrace relationship is established, so that a new tree structure, the MGFP-tree (Matrix and Group FP-tree), can be built quickly. On the other hand, when mining frequent patterns, FP-growth needs to recursively generate a large number of conditional pattern bases and conditional FP-trees, resulting in large memory overhead. The MGFP-tree builds its nodes from grouped items, which reduces tree branching. Mining the MGFP-tree is divided into two parts: one is frequent pattern mining of non-empty parent nodes, and the other adds the non-repeating items from non-empty right child nodes to the nodesplit of their parent nodes for frequent pattern mining. Finally, experiments show that the MGFP-growth algorithm is more efficient than the FP-growth algorithm. These results are then applied to real data: the MGFP-growth algorithm is used to mine hidden information from Java development job postings on the Lagou recruitment website, the internal connections in enterprise recruitment information are analyzed from a multi-dimensional perspective, and the findings offer a reference and decision support for job seekers.
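As an illustration of the baseline cost the abstract refers to, the following is a minimal Python sketch of standard FP-tree construction, showing the two database scans that FP-growth performs. It is not the MGFP-tree from the thesis, whose details are not given here. The names `FPNode`, `build_fp_tree`, and `count_nodes` are hypothetical helpers chosen for this sketch, and ties in support are broken alphabetically as an arbitrary assumption.

```python
from collections import Counter


class FPNode:
    """One node of a plain FP-tree (hypothetical helper for this sketch)."""

    def __init__(self, item, parent=None):
        self.item = item          # item label; None for the root
        self.count = 0            # number of transactions passing through this node
        self.parent = parent
        self.children = {}        # item label -> FPNode


def build_fp_tree(transactions, min_support):
    """Build a standard FP-tree using two passes over the transactions."""
    # Scan 1: count the support of every item (each item once per transaction).
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Scan 2: insert the frequent items of each transaction,
    # ordered by descending support (ties broken alphabetically, an assumption).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Count all nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
    tree, freq = build_fp_tree(data, min_support=2)
    print(freq, count_nodes(tree))
```

Transactions that share a prefix of frequent items share a path in the tree, which is why the second scan depends on the item ordering produced by the first. The matrix-based storage described in the abstract is aimed at avoiding this repeated pass over the database.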
Keywords/Search Tags: Association rules, FP-growth algorithm, MGFP-growth algorithm, MGFP-tree, Grouping