The Research Of Algorithm About Social Network Recommendation Service Based On Hadoop

Posted on:2014-01-19

Degree:Master

Type:Thesis

Country:China

Candidate:Q Ren

Full Text:PDF

GTID:2248330395498022

Subject:Communication and Information System

Abstract/Summary:

With the rapid development of information and network technology, the Internet has entered the Web 2.0 era. The Web 2.0 Internet is more intelligent, personalized, and social, and it is influencing and changing the way people live. One of the most typical examples is SNS (Social Networking Services).

Because social networks have huge user bases and users frequently post Weibo updates, they produce large amounts of user data every day. How to find useful information in this user data, and how to provide users with personalized recommendation services, has become a focus of social network research. However, the data generated by social networks always forms large-scale data sets, and handling this mass of data is one of the more severe challenges.

Hadoop is an open-source implementation modeled on Google’s cloud computing platform and is a software framework widely applied in industry and academia. It is used for distributed processing of large amounts of data and offers high efficiency, high reliability, high scalability, low cost, and many other advantages.

To handle huge amounts of data in a scalable way, implementing the social networking service recommendation algorithm on a distributed platform is a good choice. Given Hadoop’s inherent capacity for mass data storage and processing, it can effectively address the difficulties of safe storage and efficient processing while guaranteeing the reliability, validity, and security of the data. In this paper, we propose building a social networking service recommendation system on the Hadoop cloud platform.

The system is divided into four parts: a data acquisition module, a data preprocessing module, a data storage module, and a service recommendation module. In the data acquisition module, we use the Sina Weibo API to obtain user data. In the data preprocessing module, FudanNLP is used to perform Chinese word segmentation. In the data storage module, we build HBase tables to store Sina Weibo data and use the HBase API to operate on the tables. In the service recommendation module, we implement a distributed TF-IDF algorithm in the MapReduce model. This algorithm calculates the importance of each word in a Weibo post and extracts keywords from a user’s posts. From the extracted keywords, the system can infer the user’s interests and recommend relevant content to the user.

To verify the accuracy and validity of the distributed TF-IDF algorithm, we repeatedly compare the keywords it extracts with the keywords extracted by the TextRank algorithm. The results show that the keywords extracted by the two algorithms are very close, and that they become closer as the number of keywords increases. This shows that the distributed TF-IDF algorithm implemented on MapReduce is accurate and effective. At the same time, because the distributed TF-IDF algorithm takes the distinguishing power of keywords into account, it performs better than the TextRank algorithm. In addition, a comparison of response times with the TextRank algorithm shows that the distributed TF-IDF algorithm scales well.

The recommendation system proposed in this paper, based on the Hadoop cloud platform, offers a useful reference for data mining applications on cloud platforms and has exploratory value for implementing recommendation systems on cloud platforms.
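
The abstract does not give the details of the distributed TF-IDF algorithm, so the following is only a minimal sketch of how TF-IDF keyword extraction can be decomposed into map and reduce phases. It makes several assumptions that are not taken from the thesis: term frequency is the raw count divided by the post length, inverse document frequency is the natural logarithm of (number of posts / number of posts containing the term), and the jobs run through a small in-memory stand-in (`run_job`) rather than the Hadoop API. All function names are hypothetical.

```python
import math
from collections import defaultdict


def run_job(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, shuffle by key, reduce."""
    shuffled = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            shuffled[out_key].append(out_value)
    output = []
    for out_key in sorted(shuffled):
        output.extend(reducer(out_key, shuffled[out_key]))
    return output


# Job 1: count how often each term occurs in each post.
def map_term_counts(doc_id, tokens):
    for term in tokens:
        yield (term, doc_id), 1


def reduce_sum(key, values):
    yield key, sum(values)


# Job 2: regroup by post and normalise counts into term frequencies.
def map_by_doc(key, count):
    term, doc_id = key
    yield doc_id, (term, count)


def reduce_tf(doc_id, term_counts):
    total = sum(count for _, count in term_counts)
    for term, count in term_counts:
        yield (term, doc_id), count / total


# Job 3: regroup by term, derive document frequency, emit tf * idf.
def map_by_term(key, tf):
    term, doc_id = key
    yield term, (doc_id, tf)


def tf_idf(docs):
    """docs maps a post id to its token list; returns {post id: {term: weight}}."""
    n_docs = len(docs)

    def reduce_idf(term, postings):
        idf = math.log(n_docs / len(postings))  # assumed IDF variant
        for doc_id, tf in postings:
            yield (term, doc_id), tf * idf

    counts = run_job(docs.items(), map_term_counts, reduce_sum)
    tfs = run_job(counts, map_by_doc, reduce_tf)
    weights = run_job(tfs, map_by_term, reduce_idf)

    result = defaultdict(dict)
    for (term, doc_id), weight in weights:
        result[doc_id][term] = weight
    return dict(result)


def top_keywords(weights, doc_id, k):
    """Return the k highest-weighted terms of one post (ties broken by term)."""
    ranked = sorted(weights[doc_id].items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]
```

In this sketch the input would be the segmented posts, for example `tf_idf({"u1": ["hadoop", "hbase", "hadoop"], "u2": ["music", "hadoop"]})`, and `top_keywords` would then pick the terms used as a user's interest keywords. On a real cluster each `run_job` call would correspond to a separate MapReduce job reading the previous job's output.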

Keywords/Search Tags:

Cloud Computing, Social Networking Services, Service Recommendation, Hadoop, HDFS, MapReduce, HBase, TF-IDF

Related items

1	Optimization And Application Research Of MapReduce Computing Model Based On Hadoop
2	The Research Of Social Network Friend Recommendation System In Cloud Computing Environment
3	The Design Of The Cloud Computing System Based On Hadoop
4	The Cloud Computing Based On Hadoop Platform And Log Analysis
5	Research And Establishment Of Autonomous Learning Platform Based On Hadoop
6	Research On The Application Of Cloud Computing Based On Hadoop
7	Working Principle And Applied Research Of MapReduce
8	Research And Implementation Of Recommendation System Based On Mapreduce
9	MapReduce Performance Research And Optimization Based On Block Aggregation
10	Research And Application Of The Characteristics Of Distributed Computing Of OSS/BSS In The Cloud Deployment