The Design And Implementation Of Log Analysis System For The User-Oriented Personalization Recommendation

Posted on:2014-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:Y X Zhang

Full Text:PDF

GTID:2308330482483359

Subject:Computer Science and Technology

Abstract/Summary:


With the rapid growth in the number of Internet users and the volume of information, obtaining relevant information quickly from vast amounts of data has become one of the most important problems for users, and solving it is key for websites that want to attract them. Online video services are now the hottest Internet application, and the number of videos and video websites is growing rapidly with them. Under these conditions, keyword-based video search can no longer meet users' needs. Recommendation engines, which actively push information to users based on their historical behavior, have emerged in response.

The sharp increase in the number of users and videos brings new challenges for recommendation systems. First, storing massive user logs requires the storage module of the recommendation system to provide a scalable and reliable service. Second, processing massive data requires high-performance computation. Finally, information that better meets users' needs attracts more users; that is, the recommendation system must be designed to achieve high accuracy and validity in its results.

To address these challenges in processing massive data, this paper proposes a solution based on Hadoop and its subprojects: a log analysis system for user-oriented personalized recommendation. The system takes advantage of the scalability and reliability of the Hadoop Distributed File System (HDFS) by using Hive as the storage platform for massive data, which ensures reliable and scalable storage of user log information. The system also exploits the high performance of Hadoop's parallel computing model (Map/Reduce). Hive, which converts SQL statements into Map/Reduce tasks, serves as the analysis platform for the massive log data, and Mahout, a scalable machine learning library that provides a distributed collaborative filtering (CF) algorithm based on Map/Reduce, serves as the recommendation tool. Together they ensure data processing performance. In addition, an optimized modification of the Mahout source code is used to improve the accuracy and validity of the recommendation results.

To validate the system, we design a detailed testing scheme. First, we demonstrate the availability of the system and the reliability and scalability of the storage model. Then, we verify the improvement in data processing performance and in the accuracy and validity of the recommendation results. Finally, we demonstrate the system's effectiveness in practice by building a real experimental environment.
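
The pipeline outlined in the abstract (aggregate raw user logs into preference scores, then run collaborative filtering over them) can be illustrated with a minimal single-machine sketch in Python. This is not the thesis's code and does not use the Hive or Mahout APIs. The log layout `(user_id, video_id, seconds_watched)`, the use of watch time as a preference score, the cosine similarity measure, and all function names are assumptions made purely for illustration. The aggregation step plays the role that a SQL `GROUP BY` query would play in Hive, and the scoring step is a plain item-based CF variant.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical log records: (user_id, video_id, seconds_watched).
LOGS = [
    ("u1", "v1", 120), ("u1", "v2", 300), ("u1", "v1", 60),
    ("u2", "v1", 200), ("u2", "v3", 400),
    ("u3", "v2", 250), ("u3", "v3", 100),
]


def aggregate(logs):
    """Sum watch time per (user, video), like a GROUP BY user, video query."""
    prefs = defaultdict(lambda: defaultdict(float))
    for user, video, seconds in logs:
        prefs[user][video] += seconds
    return prefs


def item_vectors(prefs):
    """Invert user -> {video: score} into video -> {user: score}."""
    items = defaultdict(dict)
    for user, videos in prefs.items():
        for video, score in videos.items():
            items[video][user] = score
    return items


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b)


def recommend(prefs, user, top_n=3):
    """Score unseen videos by a similarity-weighted average of the user's scores."""
    items = item_vectors(prefs)
    seen = prefs.get(user, {})
    scores = {}
    for candidate, candidate_vec in items.items():
        if candidate in seen:
            continue  # only recommend videos the user has not watched
        weighted_sum = 0.0
        similarity_sum = 0.0
        for video, score in seen.items():
            sim = cosine(candidate_vec, items[video])
            weighted_sum += sim * score
            similarity_sum += sim
        if similarity_sum > 0:
            scores[candidate] = weighted_sum / similarity_sum
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


if __name__ == "__main__":
    preferences = aggregate(LOGS)
    print(recommend(preferences, "u1"))
```

In a distributed setting of the kind the abstract describes, the aggregation and the pairwise similarity computation are the steps that would be expressed as Map/Reduce jobs, since both reduce to grouping records by a key and combining the grouped values.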

Keywords/Search Tags:

Internet, Recommendation engine, Log analysis, Hive, Mahout


Related items

1	Research And Implementation Of Recommendation Engine Based On Hadoop Platform And Mahout Framework
2	Research On The Key Technology Of Mahout Music Recommendation Engine
3	Research And Implementation Of Recommendation Engine Based On The Hadoop And Mahout For Intelligent Terminals Cloud Applications
4	Design And Implementation Of Movie Recommendation Engine Based On Mahout
5	A Mahout-based Collaborative Filtering Recommendation Engine: Research And Implementation
6	Research And Implementation Of Shopping System Based On Improved Mahout Recommendation Engine
7	Research And Implementation Of Video Recommendation System Based On Mahout
8	Research And Design On Video Recommendation Technology Based On Hadoop And Mahout
9	Implementation And Evaluation Of Recommendation Algorithms Based On MAHOUT Technology
10	Research And Implementation Of The Distributed Video Recommendation System Based On Mahout