Research And Application Of Distributed Social Network Analysis Support System

Posted on: 2012-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y He
Full Text: PDF
GTID: 2178330335474242
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and social networking services (SNS), more and more people communicate online, and a huge volume of user data has been generated. How to extract useful information from this data at a deeper level, and then mine its latent content, such as the diffusion model of online public opinion, the attributes of online user groups, and commercial value, is currently an important research direction and challenge. Traditional social network analysis tools and algorithms usually run on a single machine, so they often lack sufficient storage and processing capacity when handling large data sets. Moreover, the original input data and the schema of a social network are unstructured or semi-structured. Traditional relational databases are poorly suited to this type of data, which makes it even harder to handle large data sets with traditional social network analysis tools and algorithms.

In this paper, we propose an HBase-based social network analysis support system that helps users perform social network analysis (SNA) on large-scale data sets. It builds the social network from crawled data automatically, so users only need to run their analysis and collect the results. The main functions of the system include obtaining the data required for SNA, extracting network relations, building the social network in a distributed manner, storing the network, and designing distributed social network analysis algorithms. The system thus covers the full social network analysis process.

This solution integrates the distributed architecture of HBase with the social network analysis process and establishes a module-based, multi-level architecture. We use a loosely coupled modular design in which each module has its own function, so internal changes to any module do not affect the other modules as long as the system's functions stay the same.
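The relation-extraction and network-building functions listed above can be pictured with a small single-process sketch of the map, shuffle, and reduce pattern. This is only an illustration, not the thesis's implementation: the record fields (`author`, `mentions`), the function names, and the choice of "number of interactions" as the edge weight are all assumptions, and a real deployment would run the same logic as a Hadoop MapReduce job over crawled data.

```python
from collections import defaultdict

# Hypothetical crawled records; the field names are assumptions for illustration.
RECORDS = [
    {"author": "alice", "mentions": ["bob", "carol"]},
    {"author": "bob", "mentions": ["alice"]},
    {"author": "alice", "mentions": ["bob"]},
    {"author": "dave"},  # semi-structured input: the "mentions" field may be missing
]


def map_extract(record):
    """Map step: emit one ((u, v), 1) pair per interaction found in a record."""
    author = record.get("author")
    for target in record.get("mentions", []):
        if author and target and author != target:
            # Sort the endpoints so u-v and v-u name the same undirected edge.
            yield tuple(sorted((author, target))), 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_merge(edge, counts):
    """Reduce step: merge duplicate edges; the count becomes the edge weight."""
    return edge, sum(counts)


def build_network(records):
    """Run map, shuffle, and reduce in memory and return {edge: weight}."""
    pairs = (pair for record in records for pair in map_extract(record))
    return dict(reduce_merge(edge, counts) for edge, counts in shuffle(pairs).items())


if __name__ == "__main__":
    for edge, weight in sorted(build_network(RECORDS).items()):
        print(edge, weight)
```

Because each stage only consumes the output of the previous one, the extraction rule in `map_extract` could be swapped for a different data source without touching the merge step, which mirrors the loosely coupled module design described above.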
To build a social network, the system uses an open-source crawler to obtain the relevant data from the Internet, particularly from social networking sites, extracts the social relations from the unstructured or semi-structured original input data, and then uses MapReduce to build the social network in a distributed manner, including network merging, edge property calculation, etc. In the storage layer of the support system, we design an HBase-based graph storage system for saving the social network. Its storage structure is designed around the characteristics of social networks, and it provides the necessary graph data to upper-layer applications. When social network analysis is performed, the graph expression system first abstracts the social network as a graph, and the nodes and edges of the graph are then processed. The graph expression system provides graph data interfaces and pre-processing functions for the distributed social network analysis algorithms. Finally, based on the support system, we present node degree, node strength, and clustering coefficient analysis as examples of designing and implementing MapReduce-based distributed social network analysis algorithms, and we run experiments and analyze the results.

The experiments show that this solution supports social network analysis on large-scale social network data sets well.
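The three example metrics named above can be sketched in the same map and reduce style. The code below is a minimal in-memory illustration under stated assumptions, not the algorithms from the thesis: it assumes an undirected, weighted graph given as a hypothetical `EDGES` dictionary whose keys are sorted node pairs, defines node strength as the sum of incident edge weights, and uses the standard local clustering coefficient, C(v) = 2T(v) / (k(k-1)), where T(v) is the number of edges among the neighbors of v and k is its degree.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical weighted, undirected edge list; keys are sorted (u, v) pairs.
EDGES = {("a", "b"): 2, ("a", "c"): 1, ("b", "c"): 4, ("c", "d"): 3}


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def map_edge(edge, weight):
    """Map: emit each edge once per endpoint as (node, (neighbor, weight))."""
    u, v = edge
    yield u, (v, weight)
    yield v, (u, weight)


def reduce_node(node, incident):
    """Reduce: degree is the neighbor count, strength is the sum of weights."""
    neighbors = {n for n, _ in incident}
    strength = sum(w for _, w in incident)
    return node, {"degree": len(neighbors), "strength": strength, "neighbors": neighbors}


def node_stats(edges):
    """First job: per-node degree, strength, and neighbor set."""
    pairs = (p for edge, w in edges.items() for p in map_edge(edge, w))
    return dict(reduce_node(node, inc) for node, inc in shuffle(pairs).items())


def clustering(stats, edges):
    """Second job: local clustering coefficient per node."""
    # Map: each node emits every pair of its neighbors as a candidate edge.
    candidates = shuffle(
        (pair, node)
        for node, s in stats.items()
        for pair in combinations(sorted(s["neighbors"]), 2)
    )
    # Reduce: a candidate that is a real edge closes a triangle at each emitter.
    triangles = defaultdict(int)
    for pair, centers in candidates.items():
        if pair in edges:
            for node in centers:
                triangles[node] += 1
    result = {}
    for node, s in stats.items():
        k = s["degree"]
        result[node] = 2 * triangles[node] / (k * (k - 1)) if k > 1 else 0.0
    return result


if __name__ == "__main__":
    stats = node_stats(EDGES)
    coefficients = clustering(stats, EDGES)
    for node in sorted(stats):
        print(node, stats[node]["degree"], stats[node]["strength"], coefficients[node])
```

Degree and strength need only one pass over the edges, while the clustering coefficient needs a second pass that joins neighbor pairs against the edge set. On a cluster that join is typically the expensive step, since a node with k neighbors emits k(k-1)/2 candidate pairs.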
Keywords/Search Tags: Social network analysis, HBase, MapReduce, Support system