Research and Implementation of a Social Network Analysis System on MapReduce

Posted on: 2011-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: C Yang
Full Text: PDF
GTID: 2178360308462426
Subject: Computer Science and Technology
Abstract/Summary:
As the Internet enters the Web 2.0 era, more and more social networking sites are emerging. These sites imitate one another and copy each other's ideas, so the services they provide are highly similar and the sites are becoming increasingly homogeneous. Social networking sites therefore need to analyze the characteristics of their users and tailor their services to those characteristics, giving users a better service experience. Traditional social network analysis tools based on data warehouses, however, struggle with user data management and analysis: they have difficulty managing heterogeneous data, and the volume of data they can analyze is too small. This thesis presents the research and implementation of a MapReduce-based social network analysis system as a solution.

This thesis proposes the design of a social network analysis system based on MapReduce. The design covers the data acquisition required for social network analysis, data format transformation, graph processing, and the design of social network analysis algorithms, and thus realizes the complete social network analysis process. Data acquisition uses a web crawler to capture data from a social networking site. The crawler's configuration file is set according to an analysis of the site's URL characteristics, so that content is crawled precisely.

The design of the MapReduce-based social network analysis system includes a graph processing system. To perform social network analysis, the social network is abstracted as a graph. The graph processing system is designed to process the data in this graph and to provide processing capability for the social network analysis algorithms. It also transforms the crawled data format into the graph data format.

The MapReduce-based social network analysis system must also implement the social network analysis algorithms that are used to analyze the social network.
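
The transformation from crawled data to graph data mentioned above can be sketched as a map step and a reduce step. The abstract does not give the thesis's actual record or graph formats, so this is only a minimal illustration under assumptions: each crawled record is taken to be a tab-separated `user<TAB>friend` line, the graph format is taken to be an adjacency list, and the function names `map_edge`, `reduce_adjacency`, and `run_job` are hypothetical. The shuffle phase is simulated in memory rather than run on a cluster.

```python
from collections import defaultdict


def map_edge(line):
    """Map step: emit both directions of an undirected friendship edge.

    Assumes a hypothetical crawled record of the form "user<TAB>friend".
    """
    user, friend = line.strip().split("\t")
    yield user, friend
    yield friend, user


def reduce_adjacency(node, neighbours):
    """Reduce step: collect the de-duplicated, sorted neighbours of one node."""
    return node, sorted(set(neighbours))


def run_job(lines):
    """Simulate the shuffle phase in memory by grouping mapper output by key."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank records
        for key, value in map_edge(line):
            groups[key].append(value)
    return dict(reduce_adjacency(k, v) for k, v in sorted(groups.items()))


# Three crawled records, one of them a duplicate.
crawled = ["alice\tbob", "bob\tcarol", "alice\tbob"]
adjacency = run_job(crawled)
```

Here `adjacency` maps each user to a list of neighbours, which is one common input shape for graph algorithms; a real job would write these records to distributed storage instead of returning a dictionary.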
This thesis proposes the design principles and the data structure design for MapReduce-based social network analysis algorithms. The betweenness centrality algorithm is used as an example to illustrate the detailed design and implementation process of a MapReduce-based social network analysis algorithm.

Testing has verified that the MapReduce-based social network analysis system presented in this thesis runs well and is suitable for social network analysis of large-scale data from social networking sites.
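
To make the betweenness centrality example concrete, the sketch below uses Brandes' algorithm for unweighted graphs, split so that the per-source work looks like a map step and the summation looks like a reduce step. This is an assumption about how such a computation could be organised, not the design from the thesis, which the abstract does not describe. The names `map_source` and `betweenness` are hypothetical, the graph is assumed to be an undirected adjacency list in which every node appears as a key, and everything runs in a single process.

```python
from collections import defaultdict, deque


def map_source(graph, source):
    """Map step: one BFS from `source`, emitting (node, dependency) pairs."""
    sigma = defaultdict(int)      # number of shortest paths from source
    sigma[source] = 1
    dist = {source: 0}            # BFS distance from source
    preds = defaultdict(list)     # predecessors on shortest paths
    order = []                    # nodes in order of non-decreasing distance
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:     # first time w is reached
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = defaultdict(float)    # dependency of source on each node
    for w in reversed(order):     # farthest nodes first
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
        if w != source:
            yield w, delta[w]


def betweenness(graph):
    """Reduce step: sum the dependencies emitted for each node."""
    totals = defaultdict(float)
    for source in graph:
        for node, dependency in map_source(graph, source):
            totals[node] += dependency
    # In an undirected graph every pair of endpoints is counted twice.
    return {node: totals[node] / 2.0 for node in graph}


# A path a - b - c: only b lies between two other nodes.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = betweenness(path)
```

Because each call to `map_source` depends only on the graph and one source node, the calls are independent of one another, which is the property that makes this computation a natural fit for a map phase followed by a summing reduce phase.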
Keywords/Search Tags: MapReduce, social network analysis, web crawler, betweenness centrality