
Research On Parallel Hybrid Recommendation Algorithm And Its Tools Based On Hadoop

Posted on: 2015-09-05
Degree: Master
Type: Thesis
Country: China
Candidate: R Wang
Full Text: PDF
GTID: 2308330485490400
Subject: Computer application technology
Abstract/Summary:
With the development of computer technology, the Internet has become part of every aspect of people's lives, and users can easily obtain large amounts of information through it. Alongside this convenience, users have begun to encounter the problem of "information overload": it is often hard to extract the content they actually need from a massive volume of information. Search engine technology alleviates the problem to some extent through keyword retrieval. However, retrieval based on search engines often presents users with a large amount of irrelevant content. How to provide users with information that better matches their personal requirements under information overload has become a hot issue in the development of the Internet. Recommender systems are considered one of the most effective tools for solving this problem. The recommendation problem starts from the perspective of the user: its goal is to help users estimate their preference for objects they have never seen. The user is no longer just a passive web browser but becomes an active participant. An accurate and efficient recommender system can capture users' preferences and needs, discover their potential consumption tendencies, and provide personalized service.

Collaborative filtering is a statistics-based approach. Because the model is simple, cheap to train, and effective in practice, it has been widely used in many kinds of recommender systems. By collecting users' historical behavior and calculating similarity, the algorithm finds neighbors whose interests are the same as or similar to those of the current user, and then predicts the current user's preference for an item from the neighbors' behavior. However, traditional collaborative filtering algorithms also face challenges such as data sparsity, accuracy, real-time performance, and scalability. How to deal with these challenges is one of the important problems that recommender systems need to solve.

The main work of this thesis is as follows:

1. Unlike traditional similarity measures, which become inaccurate on sparse data, a similarity measure based on coupled objects is proposed. The method uses the attribute information of objects, both within a single attribute (intra-attribute) and across attributes (inter-attribute), to calculate similarity. Intra-similarity and inter-similarity are combined into the coupled object similarity. A hybrid recommendation algorithm is constructed by combining coupled object similarity with either a memory-based or a model-based collaborative filtering algorithm. The experimental results show that building the hybrid recommendation algorithm on this similarity measure effectively improves recommendation accuracy.
2. Because of the massive data environment and the computational complexity of coupled object similarity, the algorithm has a scalability problem. We use MapReduce to speed up the computation. The experimental results show that the parallel method improves the scalability of the algorithm.
3. So that users can apply the parallelized recommendation method and other data mining algorithms to massive data more conveniently, this thesis introduces the design and development process of a toolbox focused on big data processing. We describe the four functional modules of the tool: the platform module, the datasets module, the algorithm module, and the task module.
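
As an illustration of the neighbor-based prediction step that the abstract attributes to traditional collaborative filtering, the following is a minimal sketch only. It assumes cosine similarity over user rating vectors and a similarity-weighted average of the neighbors' ratings; it does not implement the coupled object similarity proposed in the thesis, whose formula is not given here. The function names `cosine_similarity` and `predict`, the parameter `k`, and the toy ratings are all hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts (missing items count as 0)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, item, k=2):
    """Predict user's rating of item from the k most similar users who rated it.

    Returns None when no positively similar neighbor has rated the item.
    """
    scored = []
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim > 0:
            scored.append((sim, other_ratings[item]))
    if not scored:
        return None
    # Keep the k neighbors with the highest similarity.
    neighbors = sorted(scored, reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    # Similarity-weighted average of the neighbors' ratings.
    return sum(sim * rating for sim, rating in neighbors) / total


# Toy data (assumed for illustration): user -> {item: rating}
ratings = {
    "alice": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 2},
}

print(predict(ratings, "alice", "c"))
```

In this toy data, alice's ratings match bob's more closely than carol's, so the prediction for item "c" is pulled toward bob's rating. With very sparse data, few users share rated items and most similarities drop to zero, which is the kind of weakness the abstract says attribute-based similarity is meant to address.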
Keywords/Search Tags: Recommendation Algorithm, Parallelization, Coupled Object Similarity