
Design and Implementation of a Weibo Data Analysis System Based on Spark

Posted on: 2019-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: P F Shen
GTID: 2428330542998386
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, social networking has flourished, and Weibo, as a mainstream platform, has seen growth in both its user base and the range of services it offers. The volume of data generated by Weibo has grown explosively. This data holds a wealth of information: big data techniques can be used to detect public opinion and trending social topics, to identify users' interests and emotional needs, and to understand the mechanisms and patterns by which Weibo information spreads. Spark, one of the most popular distributed computing frameworks for big data analysis, can greatly increase the speed at which Weibo data is processed.

This thesis first introduces the key technologies of big data analysis. It then analyzes the main features of Weibo data in terms of Weibo's core business, summarizes the state of research on Weibo data analysis from several angles, and describes current applications of Weibo data analysis in order to show its potential value.

The main contribution of this thesis is the design and implementation of a Weibo data analysis system based on Spark. The system architecture is designed around four requirements: fast collection of Weibo data, efficient storage of system data, support for varied analysis tasks, and graphical display of analysis results. The design also takes into account the nature of big data analysis and the characteristics of Weibo data. Each functional module of the system is then designed in detail, namely the data acquisition module, the data storage module, the data analysis module, and the data visualization module, with the design rationale and specific workflow described for each.

The system is implemented according to this overall architecture and the detailed design of each module. The data acquisition module is built with a distributed crawler framework and Flume, the data analysis module is implemented as Spark applications, the data storage module uses HDFS and HBase, and the data visualization module is built on a Java Web framework. The resulting system meets the requirements of big data processing and can carry out a variety of analysis tasks on Weibo data.
Keywords/Search Tags:big data, Spark, Weibo, data analysis
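
The abstract does not include code, so the following is only a hypothetical sketch of the kind of trending-topic count the data analysis module might perform. It uses plain Python from the standard library rather than Spark, and the function names, the sample posts, and the choice of task are all assumptions made for illustration. It relies on the Weibo convention of writing topics between two hash marks, as in #topic#. In a Spark application the same logic would typically be expressed as a flat-map step that extracts topics followed by a per-key count.

```python
import re
from collections import Counter
from itertools import chain

# Weibo topics are written between two hash marks, e.g. "#Spark#".
TOPIC_PATTERN = re.compile(r"#([^#]+)#")


def extract_topics(post):
    """Return the topics found in one post (the 'map' side of the job)."""
    return [topic.strip() for topic in TOPIC_PATTERN.findall(post)]


def hot_topics(posts, top_n=3):
    """Count topics across all posts and return the top_n most frequent
    (the 'reduce' side of the job)."""
    all_topics = chain.from_iterable(extract_topics(p) for p in posts)
    return Counter(all_topics).most_common(top_n)


# Hypothetical sample data, not taken from the thesis.
sample_posts = [
    "#Spark# makes batch jobs faster",
    "Learning #Spark# and #HBase# today",
    "No topic here",
    "#HBase# stores the results",
    "#Spark# again",
]

result = hot_topics(sample_posts, top_n=2)
print(result)
```

Keeping extraction and counting in separate functions mirrors the split between a per-record transformation and an aggregation, which is what allows this kind of job to be distributed across a cluster.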