Font Size: a A A

Research On Retrieval Of Target Groups In Social Networks Based On Subgraph Matching

Posted on:2015-02-20Degree:MasterType:Thesis
Country:ChinaCandidate:D ChenFull Text:PDF
GTID:2308330482479130Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
With the rise of web2.0 technology and the vigorous development of mobile internet, social networking platforms and instant messaging technologies develop rapidly, which constitutes social networks with huge numbers of users and complex structures. In vast social networks, people are often interested in specific structures and organizations, named as target groups(such as terrorist organizations, criminal gangs, etc.). How to search target groups in social networks efficiently becomes an urgent problem to be solved. In this paper, the retrieval technology of target groups in social networks are studied, including approximate subgraph matching, local community detection and group importance measure. Major contributions are listed as follows:(1) Traditional approximate subgraph matching algorithms such as SA-Index(Structure-aware and Attribute-aware Index) take steps as follows: matching edges/paths are got by search algorithms first and then connected to get matching graphs. But matching edges first means searching nodes in large graphs and searching many times, which leads to a high computation complexity. Therefore, an approximate subgraph matching algorithm based on neighborhood vectors is proposed. In this algorithm, the matching nodes are determined first by nodes matching, and then search in the extended graphs that contain matching nodes which have better matching effects to reduce the computation complexity. The algorithm is composed of three stages: nodes filtering, nodes matching and edges matching. Firstly, the node label and adjacent nodes labels are adopted to filter nodes in order to get the candidates of query nodes; then, the neighbors of nodes are represented as vectors and a heuristic iterative inference algorithm is adopted to match nodes to get the position of the matching nodes. Processing speed is improved by use of a neighborhood vectors index; finally, extended graphs of better matching nodes are extracted by the neighborhood vectors index and approximate matching graphs are got by breadth-first algorithm in extended graphs. Experiments on datasets such as wiki-Talk and YouTube networks show that the algorithm has high computing efficiency and satisfactory matching results.(2) Aiming at weakening the instability of local community detection algorithms caused by community hierarchies and community overlap, a multi-scale local community detection algorithm based on structural similarities is proposed. Firstly, a new definition of community is given via combining node structural similarities and traditional definitions of community; then, a new definition of local modularity and a multi-scale local community detection algorithm LSC(Local Structural Similarity Community Detection) are proposed based on this definition. Lastly, improvements are made to the LSC algorithm. Taking neighbors of the seed as new seeds to extend communities, local overlapping communities are detected by use of a node pruning strategy. Experiments on real networks such as Karate network, NCAA football network and synthetic networks such as LFR network show that LSC algorithm can detect communities effectively, and can get communities of different sizes. The improved algorithms are found to detect overlapping communities effectively.(3) The existed evaluation indexes of group importance measure only consider one of the factors that affect the performances of measure. To solve it, a comprehensive evaluation method based on MADM(Multi-Attribute Decision Making) is proposed to measure the importance of groups. Firstly, on account of the PageRank algorithm assigning the same transition probability to the neighbors, SNR(Similarity Node Rank) is proposed by introducing node structural similarity to measure the importance of nodes and then extended to SGR(Similarity Group Rank) to measure the importance of groups. Secondly, TOPSIS(Technique for Order Preference by Similarity Ideal Solution) is adopted to measure the importance of groups comprehensively. Experiments on real networks such as Terrorists network, Karate network, and dolphin network show that the comprehensive evaluation method based on MADM considers several factors and implement a comprehensive measure of the importance of groups.
Keywords/Search Tags:Social Networks, Target Groups, Appropriate Subgraph Matching, Neighborhood Vectors Index, Structural Similarity Community, Local Community Detection, Local Overlapping Community Detection, Group Importance Measure
PDF Full Text Request
Related items