
Data Analysis Platform Of Social Network Based On Spark

Posted on: 2019-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: H L Wang
GTID: 2428330542996828
Subject: Software engineering
Abstract/Summary:
With the rapid development and application of Internet and big data technology, people increasingly conduct their daily lives online, and communication among people relies more and more on the Internet. Facebook, Sina Weibo, WeChat and other social applications have risen to become an indispensable part of people's lives. People communicate with each other through these applications, which act as a medium, and a relationship network forms as a result. This relationship network among people is called a social network, and to some extent online social networks extend the application field of the mobile Internet. Social networks produce massive amounts of user data all the time: Sina Weibo alone produces more than one hundred million pieces of user data every day. Big data will have a profound impact on every aspect of our lives. The development of Internet technology has laid the foundation for the development of social networks and big data, and huge commercial value is hidden behind this massive data. Because social networks are closely related to people's social lives and objectively reflect the state and characteristics of people's social circles, analyzing and mining them has practical significance.

This paper designs and implements a data analysis platform based on a popular social network, which provides data acquisition, data storage, data calculation and analysis, and data visualization. The platform can be divided into four parts according to function. Data crawling uses a distributed architecture to crawl social network data efficiently, and provides the data on which the calculation and analysis module at the upper level of the platform operates. Dataset management lets users manage their data sets, which may include local files and HDFS files; by building an HDFS cluster, the platform provides fault-tolerant storage services for users. Data calculation and analysis uses the GraphX component of Spark to compute on graph data in parallel, and includes a social network location algorithm that we designed to locate specific users and obtain their geographical location more accurately. Data visualization uses the NetworkX package and the Amap SDK to visualize graph structure and trajectory data. The platform encapsulates these modules in a friendly web application and uses front-end technology to present the function modules and the visualization results.
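The abstract does not say which graph computations the analysis module runs. As a rough illustration only, the sketch below shows the kind of follow-graph statistics such a module might compute from crawled data. It uses plain single-machine Python as a stand-in for GraphX, and the function names, the (follower, followee) edge format and the sample users are all hypothetical.

```python
from collections import defaultdict


def build_follow_graph(edges):
    """Build adjacency sets from (follower, followee) pairs.

    The edge format is an assumption; the thesis abstract does not
    describe how crawled relationships are stored.
    """
    following = defaultdict(set)  # user -> users they follow
    followers = defaultdict(set)  # user -> users who follow them
    for follower, followee in edges:
        if follower == followee:
            continue  # ignore self-follows
        following[follower].add(followee)
        followers[followee].add(follower)
    return following, followers


def mutual_follows(following, user):
    """Return the users that `user` follows and that follow `user` back."""
    return {
        other
        for other in following.get(user, set())
        if user in following.get(other, set())
    }


def top_by_followers(followers, k):
    """Return the k users with the most followers as (user, count) pairs."""
    ranked = sorted(followers.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(user, len(fans)) for user, fans in ranked[:k]]


# Hypothetical sample data standing in for crawled relationships.
edges = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
    ("carol", "bob"),
    ("dave", "alice"),
    ("dave", "dave"),
]
following, followers = build_follow_graph(edges)
print(top_by_followers(followers, 2))
print(mutual_follows(following, "alice"))
```

On a cluster, the same per-vertex aggregation would be expressed over a distributed graph rather than in-memory dictionaries; this sketch only shows the logic.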
Keywords/Search Tags: Crawler, Social Network, Data analysis platform, Spark