An Efficient Distributed-computing Framework For Association-rule-based Recommendation

Posted on:2019-02-13

Degree:Master

Type:Thesis

Country:China

Candidate:C S Li

Full Text:PDF

GTID:2428330572455301

Subject:Computer software and theory

Abstract/Summary:


Since the 1990s, recommender systems have attracted much attention from both the academic and industrial communities. After decades of development, they have been widely applied in diverse scenarios, from e-commerce sites to social platforms, video/music websites, and so on. Based on how recommendations are made, recommender systems can be classified into a number of categories, including content-based methods, collaborative filtering, hybrid methods, and association-rule-based methods. Among these, association-rule-based recommendation has been considered the most common approach, and many e-commerce firms have embedded rule-based approaches into their commercial recommendation systems because of appealing merits such as being easy to understand and supporting dynamic recommendation.

Existing studies mostly focus on how to select eligible rules to enhance recommendation performance, but the efficiency of association-rule-based recommendation has received little attention. How to efficiently match browsing histories against a set of rules, so as to offer nearly real-time recommendations to massive numbers of online users, is a vital concern for real e-commerce websites. To address this, we present a distributed-computing framework for improving the computational efficiency of rule-based recommendation. The main contributions of this dissertation are as follows:

1. We propose a distributed-computing framework for improving the computational efficiency of rule-based recommendation. The framework consists of two modules: frequent pattern mining and association-rule-based recommendation. We adopt dedicated strategies to ensure that the two modules not only cooperate with each other but can also be applied easily to existing recommendation methods.

2. We design a tree-type structure called the Ordered-Patterns Forest (OPF) to compress and store frequent patterns. We then transform candidate rule mining into a path-searching problem on the OPF, and present a single-machine path-searching algorithm for improving the computational efficiency of rule-based recommendation. In addition, the algorithm is designed to be compatible with existing methods for computing recommendation scores, so as to preserve recommendation accuracy and the applicability of the framework.

3. We analyze in detail the factors that affect the execution time of the two modules, and present a load-balanced strategy for data partitioning. On this basis, we put forward a load-balancing segmentation method that aims to reduce the running time of the last task to finish, thus further improving overall performance.

4. We conduct extensive experiments to validate the performance of the proposed distributed-computing framework. Experimental results on three real-world datasets demonstrate that the proposed OPF with the path-searching algorithm improves efficiency by more than 6 times compared with the traditional Brute-Force method. Meanwhile, the proposed distributed-computing framework achieves nearly linear scalability as the number of computational nodes increases.
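The abstract does not specify the OPF's internals, so the following is only a minimal sketch of the general idea of turning candidate-rule matching into a path search over a tree of frequent patterns. It assumes a plain prefix tree with items stored in sorted order; the names `Node`, `build_trie`, `lookup`, `candidate_rules`, and `brute_force_rules` are hypothetical and are not taken from the thesis. A rule X -> y is a candidate when X is contained in the browsing history, y is not, and X together with y is a frequent pattern.

```python
class Node:
    def __init__(self):
        self.children = {}   # item -> Node
        self.support = None  # set when the path ending here is a frequent pattern


def build_trie(patterns):
    """patterns: dict mapping frozenset of items -> support count (assumed input format)."""
    root = Node()
    for items, support in patterns.items():
        node = root
        for item in sorted(items):          # fixed global item order
            node = node.children.setdefault(item, Node())
        node.support = support
    return root


def lookup(root, items):
    """Return the support of a pattern, or None if it is not stored."""
    node = root
    for item in sorted(items):
        node = node.children.get(item)
        if node is None:
            return None
    return node.support


def candidate_rules(root, history):
    """Path search: follow items from the history, allowing one unseen item per path."""
    history = set(history)
    rules = []

    def walk(node, matched, consequent):
        for item, child in node.children.items():
            if item in history:
                next_matched, next_cons = matched + [item], consequent
            elif consequent is None:
                next_matched, next_cons = matched, item
            else:
                continue                     # second unseen item: prune this subtree
            if next_cons is not None and next_matched and child.support is not None:
                base = lookup(root, next_matched)   # support of the antecedent
                if base:
                    confidence = child.support / base
                    rules.append((frozenset(next_matched), next_cons, confidence))
            walk(child, next_matched, next_cons)

    walk(root, [], None)
    return rules


def brute_force_rules(patterns, history):
    """Baseline for comparison: scan every frequent pattern against the history."""
    history = set(history)
    rules = []
    for items, support in patterns.items():
        unseen = items - history
        if len(unseen) != 1:
            continue
        antecedent = items - unseen
        if antecedent and antecedent in patterns:
            (consequent,) = unseen
            rules.append((antecedent, consequent, support / patterns[antecedent]))
    return rules
```

The tree walk visits only paths that are consistent with the history and prunes a subtree as soon as a second unseen item appears, whereas the baseline inspects every stored pattern. Both return the same rules on the same input; how the actual OPF orders and compresses patterns may differ from this sketch.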
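The load-balancing goal in contribution 3, reducing the running time of the last task to finish, corresponds to minimizing the makespan of a schedule. The abstract does not describe the thesis's own partitioning strategy, so the sketch below swaps in a classic greedy heuristic (longest processing time first) purely for illustration. The function name `lpt_assign` and the per-partition cost estimates are assumptions.

```python
import heapq


def lpt_assign(costs, n_workers):
    """Greedy longest-processing-time-first assignment.

    costs: dict mapping a partition name to its estimated cost (hypothetical estimates).
    Returns (assignment, makespan), where makespan is the load of the busiest worker.
    """
    heap = [(0.0, w) for w in range(n_workers)]   # (current load, worker id)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    # Place the most expensive partitions first, each on the least-loaded worker.
    for task, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(task)
        heapq.heappush(heap, (load + cost, worker))
    makespan = max(load for load, _ in heap)
    return assignment, makespan
```

For example, `lpt_assign({"p1": 7, "p2": 5, "p3": 4, "p4": 3, "p5": 1}, 2)` places `p1` and `p4` on one worker and `p2`, `p3`, and `p5` on the other, so both workers carry a load of 10.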

Keywords/Search Tags:

Recommender Systems, Association Rules, Frequent Patterns, FP-growth, Spark, Load Balancing


Related items

1	Research On Data Mining Technology For Very Large Databases
2	Research On Correlative Algorithms Of Association Rule Mining
3	The Application Research Of Association Rules Parallel Algorithm Analysis Based On FP-Growth
4	Research And Application Of Parallel FP-Growth Algorithm Based On Spark
5	Research On Distributed Frequent Itemset Mining Algorithm Based On Spark
6	Research On Association Rules Algorithm Based On Frequent Pattern Tree
7	Parallelizable Algorithms Research Of Association Rules Mining
8	Research Of Parallel Frequent Itemset Mining Algorithm Based On Spark
9	Design, Implementation And Core Technologies Study Of TCMiner
10	Association Rules Detecting Based On Attribute Topology