
Community Detection By Using Link And Content And Its Application In Sina Microblog

Posted on: 2016-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Qiao
GTID: 2298330467472520
Subject: Computer Science and Technology
Abstract/Summary:
Many real-world systems can be abstracted as networks, such as personal relationship networks, paper citation networks, scientific collaboration networks, microblog user networks, and the Internet. These networks share a common characteristic: a complex internal structure. They are therefore called complex networks. Studies have shown that these networks contain latent community structure, meaning that links between nodes within a community are dense while links between nodes in different communities are sparse. Identifying the community structure of a complex network is useful because it contributes to a deeper understanding of the network, the relationships between its nodes, and its function.

However, previous studies on community detection mainly focused either on identifying communities using only links or on clustering nodes by their content (features), and lacked a comprehensive treatment of link and content together. Most existing methods that use both link and content are probabilistic models. These models have a sound mathematical foundation, but they have high time complexity and are not easy to understand for users without a strong theoretical background.

This thesis presents the fast K-means type algorithms KRLC (and its improved version, 2KRLC) and CKRLC, which combine link and content to detect communities in networks. KRLC and 2KRLC are suitable for networks with a specified number K of communities, while CKRLC fits networks where K is not known. All three algorithms are based on K-means: they fuse link similarity and content similarity and integrate several methods for selecting the initial nodes. The proposed algorithms were shown to be efficient K-means type algorithms for identifying communities in networks with node content.

Additionally, we analyzed Sina Microblog data in four steps: data collection, text preprocessing, network modeling, and community detection. Through the first three steps, we established a real network of Sina Microblog user relationships named SURNet. Finally, we identified the community structure of SURNet with the algorithms KRLC, 2KRLC, and CKRLC. The experimental results showed that the proposed algorithms achieved better performance.
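
The abstract does not give the details of KRLC, so the following is only a minimal sketch of the general idea of a K-means type procedure over a fused link and content similarity. It is not the thesis's algorithm. Everything specific here is an assumption made for illustration: cosine similarity for both links and content, the mixing weight `alpha`, the medoid-style center update, the hand-picked seed nodes, and the function names `fused_similarity` and `cluster`.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def fused_similarity(adj, content, alpha=0.5):
    """Assumed fusion: alpha * link similarity + (1 - alpha) * content similarity."""
    n = len(adj)
    # Add self-loops so that two directly linked nodes share at least each other.
    rows = [[1 if i == j else adj[i][j] for j in range(n)] for i in range(n)]
    return [
        [
            alpha * cosine(rows[i], rows[j])
            + (1 - alpha) * cosine(content[i], content[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def cluster(sim, seeds, max_iter=100):
    """K-means type loop: assign nodes to the most similar center, then re-pick centers."""
    n = len(sim)
    centers = list(seeds)
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each node joins the center it is most similar to.
        labels = [
            max(range(len(centers)), key=lambda c: sim[i][centers[c]])
            for i in range(n)
        ]
        # Update step: the new center is the member most similar to its own cluster.
        new_centers = []
        for c in range(len(centers)):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_centers.append(centers[c])  # keep the old center if the cluster is empty
                continue
            new_centers.append(
                max(members, key=lambda m: sum(sim[m][i] for i in members))
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy network (made-up data): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
# Toy content vectors: each triangle uses a different "topic".
content = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]

sim = fused_similarity(adj, content, alpha=0.5)
labels, centers = cluster(sim, seeds=[0, 5])
print(labels)
```

In this sketch the number of communities is fixed by the number of seeds, which corresponds to the setting where K is specified in advance. How the initial nodes are chosen and how K is handled when it is unknown are left open here, since the abstract does not describe them.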
Keywords/Search Tags: Complex network, Community detection, Link similarity, Content similarity, K-Means, Sina Microblog