
Design And Implementation Of Recommendation System Based On Hadoop

Posted on: 2017-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Fang
Full Text: PDF
GTID: 2348330491951723
Subject: Computer technology
Abstract/Summary:
Today's smartphones run all kinds of software, which pushes vast amounts of information to us every day. Not all of this information is of high quality. How users can reach the information they need without being troubled by the rest is the problem that a recommendation system should solve. With the explosion of data volumes, traditional recommendation systems can no longer meet these requirements. A recommendation system built on a cloud computing platform can handle larger amounts of data, run more complex recommendation algorithms, and provide a more efficient recommendation service.
This paper presents a new parallel recommendation algorithm based on Hadoop, called H-ICSR (Item Clustering Based Social Recommendation Algorithm on Hadoop). It adopts the idea of social recommendation, using historical ratings and social data to produce its results.
H-ICSR first clusters the items by their attribute data, dividing them into several categories. Second, it calculates each user's preference for the different categories from the rating data. Next, it builds a model based on interaction data and ratings to calculate a recommendation score for each item. Finally, it predicts a user's ratings for unknown items by combining the user's preference for a category with the recommendation scores of the items in that category. A user's recommendation list is generated by sorting the candidate items by predicted score in descending order.
A distributed implementation of H-ICSR is built on Hadoop, so H-ICSR can run on Hadoop clusters and make full use of their computing resources. The application is written in the MapReduce framework, and the results are stored in HDFS. H-ICSR uses 4 jobs, which are controlled by JobControl.
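The final prediction and ranking step can be illustrated with a small single-machine sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: a user's category preference is taken to be the mean rating in that category, the per-item recommendation score is assumed to be already computed by the social model, and the two are combined by simple multiplication. The function names `category_preferences` and `recommend` are hypothetical, and this is not the MapReduce implementation described in the thesis.

```python
from collections import defaultdict


def category_preferences(user, ratings, item_category):
    """Return the user's mean rating per item category (an assumed preference measure)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (rater, item), rating in ratings.items():
        if rater == user:
            category = item_category[item]
            totals[category] += rating
            counts[category] += 1
    return {category: totals[category] / counts[category] for category in totals}


def recommend(user, ratings, item_category, item_score, top_n=10):
    """Rank the items the user has not rated by predicted score, highest first."""
    prefs = category_preferences(user, ratings, item_category)
    rated = {item for (rater, item) in ratings if rater == user}
    predicted = []
    for item, category in item_category.items():
        if item in rated:
            continue
        # Assumed combination rule: category preference times item score.
        score = prefs.get(category, 0.0) * item_score.get(item, 0.0)
        predicted.append((item, score))
    # Sort by descending score; break ties by item id for a stable order.
    predicted.sort(key=lambda pair: (-pair[1], pair[0]))
    return predicted[:top_n]


# Toy data, invented for this example only.
ratings = {("u1", "a"): 5.0, ("u1", "b"): 3.0, ("u1", "c"): 1.0}
item_category = {"a": "x", "b": "x", "c": "y", "d": "x", "e": "y", "f": "y"}
item_score = {"d": 0.5, "e": 0.9, "f": 0.2}

top_two = recommend("u1", ratings, item_category, item_score, top_n=2)
print(top_two)
```

In this toy data the user prefers category `x`, so the unrated item `d` outranks `e` even though `e` has the higher item score. In a Hadoop setting each of these stages would typically be expressed as one or more MapReduce jobs rather than in-memory loops.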
Experiments show that H-ICSR performs better than other algorithms when facing the cold-start problem or the data sparsity problem.
H-ICSR constitutes the core module of the recommendation system, which is divided into 4 layers: the Source Data Acquisition Layer (SDAL), the Data Pretreatment Layer (DPL), the Recommendations Generator Layer (RGL) and the User Access Layer (UAL). SDAL handles the transfer of data between the relational database and HDFS, and DPL carries out most of the computation and modeling. RGL is responsible for generating the personalized recommendation list. In UAL, recommendations are shown and users' feedback is recorded.
We integrate the recommender module into existing projects and implement a prototype of the recommendation system on Hadoop. The system is composed of three parts: the Android client, the web server and the Hadoop cluster. The Android client interacts with users, the web server responds to client requests in real time, and the Hadoop cluster runs H-ICSR offline. This system can provide users with personalized recommendations through Android devices.
At the end of this paper, a functional test of the prototype system is described in detail. The results show that the system is robust and can provide users with personalized recommendations.
Keywords/Search Tags: social recommendation, recommendation system, clustering, parallel computing