
Design And Implementation Of Social Network Analysis System Based On Big Data

Posted on: 2021-09-21   Degree: Master   Type: Thesis
Country: China   Candidate: Y H Xu   Full Text: PDF
GTID: 2518306050970759   Subject: Master of Engineering
Abstract/Summary:
With the development of the Internet and the continuous improvement of social software, people increasingly move their daily lives online, and communication between people depends more and more on social media. Platforms such as Facebook, Twitter, and LinkedIn continue to expand their user bases and have become an integral part of life. Social networks generate massive amounts of data at all times, and the traditional stand-alone processing model can no longer meet today's data processing requirements. Mining and analyzing large-scale social network data is of great significance, so social network analysis has attracted attention from many fields. This thesis addresses three problems in the field of cyberspace security: the difficulty of acquiring social network data, the difficulty of managing big data components, and the lack of targeted data analysis and mining algorithms. Based on big data technology, a social network analysis system covering data collection, data storage, data computation, data mining and analysis, and data visualization is designed and implemented; the system has good scalability and practicality. The research contents of this thesis are as follows:

System development: The analysis system uses Facebook, Twitter, and LinkedIn as data sources, and designs and implements measures for dealing with their anti-crawling mechanisms. Distributed intelligent crawlers continuously collect data as the foundation for upper-level applications. Big data technology underpins data storage, computation, and analysis and mining: a big data platform is built on the HDP architecture, comprising a distributed file system, the Spark distributed computing framework, multiple distributed databases, and other big data components, to provide the tools for data storage, computation, and analysis and mining in the social network analysis system. Using Spark's GraphX and ML libraries, basic feature analyses of social networks are implemented, including degree distribution, association analysis, shortest path analysis, and user clustering.

Algorithm improvement: After a comprehensive analysis of the shortcomings of the traditional PageRank algorithm and of various community detection algorithms, and by combining the traditional algorithm with actual social network data, the key node mining algorithms twRank and fbRank are proposed for different social networks. Comparative experiments show that the new algorithms make better use of the collected data and mine key nodes more accurately than the traditional algorithm. To address the defects of the traditional Louvain community detection algorithm, an improved distributed Louvain community detection algorithm is proposed, and its advantages in processing large amounts of data are verified through experiments.

System visualization: The system uses the Django framework and visualization technology to display data management functions and the results of data analysis and mining.
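As a rough illustration of two analyses named in the abstract, the following Python sketch computes a degree distribution and a traditional PageRank ranking on a toy graph. It is only a single-machine sketch under stated assumptions: the thesis itself works with Spark's GraphX, and its twRank and fbRank algorithms are not described here, so the code shows the traditional PageRank baseline only. The function names and the `follows` edge list are hypothetical.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {degree: number of nodes with that degree}, treating edges as undirected."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(Counter(degree.values()))


def pagerank(edges, damping=0.85, iterations=50):
    """Traditional PageRank by power iteration on a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical toy "follow" graph: (follower, followed).
follows = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
]

distribution = degree_distribution(follows)
ranks = pagerank(follows)
key_node = max(ranks, key=ranks.get)
```

In this toy graph the most-followed account ends up as `key_node`. A key node mining algorithm tailored to a specific platform would presumably weight edges using platform-specific interaction data, which this baseline ignores.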
Keywords/Search Tags:Big data, Social Network Analysis System, Distributed Crawler, Key Node Mining, Community Detection