Design and Implementation of a Real-Time Microblog Recommendation System Based on Storm

Posted on: 2017-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: B K Fu
GTID: 2348330518496230
Subject: Computer technology
Abstract/Summary:
With the advent of the mobile Internet era, it has become far more convenient for Internet users to obtain information and share experiences. Social networking services have grown rapidly and have become an integral part of people's daily lives. Microblogging is a social networking platform for sharing, disseminating, and obtaining information based on relationships between users. Sina Weibo, for example, currently has tens of millions of active users who generate hundreds of millions of microblog posts every day. Faced with such a flood of information, delivering interesting hot topics or microblogs to users in a timely manner has become an urgent problem.

This thesis studies the design and implementation of a real-time microblog recommendation system based on Storm. The main work covers the following three aspects.

First, it surveys the research progress on real-time recommendation systems and the technologies the system relies on, including the Hadoop distributed computing framework, the Storm real-time distributed computing framework, the Kafka distributed publish-subscribe messaging system, and the sliding window model.

Second, it designs the overall architecture of the real-time microblog recommendation system, which consists of five subsystems: data collection, offline data processing, real-time data processing, data storage, and data presentation. The data collection subsystem includes two modules, one using the microblogging API and one using a web crawler. The offline data processing subsystem is based mainly on Hadoop. It uses an improved vector space model that adds a time factor based on Newton's law of cooling, models users' interests from historical microblog data, and computes each user's interest vector. The real-time data processing subsystem analyzes users' real-time behavior on displayed microblogs to update their interest models, computes popular keywords with an improved model based on a sliding window, and produces real-time microblog recommendations for users. The data storage subsystem consists mainly of a variety of data storage systems. The data presentation subsystem displays the content users subscribe to and recommends the most popular keywords and microblogs to them. These subsystems are designed and implemented on the Storm, Hadoop, and Kafka platforms.

Finally, a test platform is set up for performance and functional testing, and the functions of the system's main modules are analyzed and validated. The results show that the system meets the design requirements. The system uses a distributed architecture with high availability, high scalability, and strong computing power. It can help users make efficient use of microblogging and provides a better personalized recommendation service.
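The two models named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the formulas, so the exponential decay form, the `cooling_rate` parameter, the function `decayed_interest_vector`, and the class `SlidingWindowKeywordCounter` are all assumptions introduced here for illustration. The sketch assumes that a term's contribution to a user's interest vector cools exponentially with the age of the post, and that keyword events arrive in non-decreasing timestamp order.

```python
import math
from collections import Counter, deque


def decayed_interest_vector(posts, now, cooling_rate=0.1):
    """Build a term-weight vector in which older posts count less.

    posts: iterable of (timestamp, list_of_terms); timestamps and `now`
    share one unit (hours here). The weight exp(-cooling_rate * age) is an
    assumed form of the cooling-based time factor.
    """
    vector = Counter()
    for timestamp, terms in posts:
        age = max(now - timestamp, 0.0)
        weight = math.exp(-cooling_rate * age)
        for term in terms:
            vector[term] += weight
    return dict(vector)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SlidingWindowKeywordCounter:
    """Count keywords seen within the last `window_size` time units."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.events = deque()  # (timestamp, keyword), oldest first
        self.counts = Counter()

    def add(self, timestamp, keyword):
        # Assumes timestamps are added in non-decreasing order.
        self.events.append((timestamp, keyword))
        self.counts[keyword] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have slid out of the window ending at `now`.
        while self.events and self.events[0][0] <= now - self.window_size:
            _, old_keyword = self.events.popleft()
            self.counts[old_keyword] -= 1
            if self.counts[old_keyword] == 0:
                del self.counts[old_keyword]

    def top(self, k, now):
        """Return the k most frequent keywords in the current window."""
        self._expire(now)
        return self.counts.most_common(k)
```

In a design along these lines, a candidate microblog could be scored against a user by taking `cosine_similarity` between the user's decayed interest vector and the post's term vector, while the counter supplies the current popular keywords. How the thesis actually combines the two signals is not stated in the abstract.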
Keywords/Search Tags: microblog recommendation, Storm, Hadoop, sliding window model, data mining