Research On Community Mining Algorithms

Posted on: 2009-12-23
Degree: Master
Type: Thesis
Country: China
Candidate: H Cai
Full Text: PDF
GTID: 2178360242480268
Subject: Computer application technology
Abstract/Summary:
With the arrival of the information age, commerce, telecommunications, medicine and other industries face ever larger volumes of data. Data mining and knowledge discovery technologies have emerged and developed considerably in response. They extract credible, novel, effective and easily understood models, and uncover latent information in massive data that is significant for decision-making, giving people higher-level data analysis and automated decision support. Applying data mining to social network analysis is a new direction in knowledge discovery and data mining research and has received much attention.

The theoretical foundation of social networks derives from Six Degrees of Separation and the Rule of 150. Unlike traditional data mining, social network analysis focuses on mining relationships: individuals are connected by mutual ties, called relationships, so they do not satisfy the assumption of independence. From a data mining perspective, a social network is a heterogeneous, multi-relational dataset modeled as a graph, in which nodes represent individuals and an edge between two nodes indicates a direct relationship between the corresponding individuals. Community structure is a common feature of social networks: ties among individuals within a community are denser than ties to individuals outside it.
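The graph model and the notion of community density described above can be illustrated with a minimal sketch. It assumes a simple undirected graph with no self-loops or duplicate edges; the function name `tie_densities` and the toy edge list are hypothetical and do not come from the thesis.

```python
def tie_densities(edges, community):
    """Return (internal_density, external_density) for a set of nodes.

    internal: fraction of possible pairs inside the community that are linked.
    external: fraction of possible (member, non-member) pairs that are linked.
    Assumes a simple undirected graph without self-loops.
    """
    community = set(community)
    nodes = {u for e in edges for u in e}
    outside = nodes - community
    edge_set = {frozenset(e) for e in edges}

    internal_pairs = len(community) * (len(community) - 1) // 2
    external_pairs = len(community) * len(outside)
    # An edge is internal if both endpoints are members,
    # external if exactly one endpoint is a member.
    internal = sum(1 for e in edge_set if e <= community)
    external = sum(1 for e in edge_set if len(e & community) == 1)
    return (internal / internal_pairs if internal_pairs else 0.0,
            external / external_pairs if external_pairs else 0.0)


# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
inside, outside = tie_densities(edges, {1, 2, 3})
```

For the node set {1, 2, 3} the internal density is far higher than the external density, which is the property the abstract uses to characterize a community.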
Because the whole social network is made up of such communities, mining community structure is important for understanding and analyzing the features of a social network, and community structure mining has become a hot research topic.

In this thesis, we study social network analysis and community mining algorithms in depth. Based on three characteristics of social networks (static, dynamic and heterogeneous), we propose three novel community mining algorithms, each addressing community structure mining in networks with the corresponding property. The work includes:

1. By surveying the related literature and implementing classic algorithms such as the Girvan-Newman algorithm, the Newman algorithm, the Normalized Cut algorithm, the Kernighan-Lin algorithm and the CONGA algorithm, we review the concepts, principles, techniques and methods of social network analysis, as well as the significance and the state of research, in China and abroad, of community structure mining.

2. We analyze a single, static social network and propose an overlapping community structure mining algorithm. Building on the agglomerative method, greedy optimization and modularity, we extend the Newman algorithm to mine overlapping community structure. Experiments on simulated data and standard test data show that the algorithm obtains high-quality communities with low time complexity, and that the resulting community structure better reflects the complex relationships in real networks.
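For orientation, the baseline that item 2 builds on, Newman's greedy agglomerative modularity optimization, can be sketched as follows. This is only the standard non-overlapping procedure, not the overlapping algorithm proposed in the thesis: the node-copy step is not reproduced because the abstract does not specify it. The function name `newman_fast` and the toy data are assumptions, and the sketch assumes a simple undirected graph with comparable node labels.

```python
def newman_fast(edges):
    """Greedy agglomerative modularity maximisation (non-overlapping baseline).

    Returns (best_partition, best_modularity). Assumes a simple undirected
    graph (no self-loops, no duplicate edges) with comparable node labels.
    """
    nodes = sorted({u for e in edges for u in e})
    m = len(edges)
    # e[i][j] (i != j): half the fraction of edges joining communities i and j.
    e = {i: {} for i in nodes}
    for u, v in edges:
        e[u][v] = e[u].get(v, 0.0) + 1.0 / (2 * m)
        e[v][u] = e[v].get(u, 0.0) + 1.0 / (2 * m)
    # a[i]: fraction of edge ends attached to community i.
    a = {i: sum(e[i].values()) for i in nodes}
    members = {i: {i} for i in nodes}
    q = -sum(x * x for x in a.values())  # every node alone, so e_ii = 0
    best_q, best = q, [set(s) for s in members.values()]

    while len(members) > 1:
        # Only connected pairs can increase modularity: dQ = 2 (e_ij - a_i a_j).
        candidates = [(2 * (e[i][j] - a[i] * a[j]), i, j)
                      for i in e for j in e[i] if i < j]
        if not candidates:  # remaining communities are disconnected
            break
        dq, i, j = max(candidates)
        # Merge community j into community i and update the matrix rows.
        for k, w in e.pop(j).items():
            del e[k][j]
            if k != i:
                e[i][k] = e[i].get(k, 0.0) + w
                e[k][i] = e[i][k]
        a[i] += a.pop(j)
        members[i] |= members.pop(j)
        q += dq
        if q > best_q:
            best_q, best = q, [set(s) for s in members.values()]
    return best, best_q


# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
partition, modularity = newman_fast(edges)
```

The loop builds the full merge sequence and keeps the partition with the highest modularity seen along the way, which is how the baseline selects its cut of the dendrogram.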
The innovations of this algorithm are: 1) unlike previous methods, it targets overlapping community structure, so that the resulting partition better matches the complex relationships in real networks; 2) it introduces a node-copy operation, which uses node betweenness and a threshold to decide whether a node about to be merged should be replicated; 3) compared with the Newman fast algorithm, the proposed algorithm (OCSMA) uses different rules to update the symmetric matrix e, so that individuals in the network can be divided among multiple communities according to the actual situation.

3. We analyze dynamic social networks and propose a dynamic community structure mining algorithm. Based on connectivity and frequency, we introduce a new definition of community to mine sets of individuals that remain connected at every successive moment. In addition, we adopt the sandwich model and distinguish individuals by importance so that the structural framework of each community is clearer. Experiments on simulated data show that the algorithm obtains high-quality communities with clear structure. The innovations are: 1) unlike previous methods, it takes into account that social networks are dynamic and rarely static, and analyzes a series of dynamic social networks as a whole; 2) it proposes a new definition of community structure: a set of individuals that remain connected at every moment in the dynamic network graphs; 3) it distinguishes the importance of individuals within a community and presents them with the sandwich model, making the mined communities clearer.

4. We analyze heterogeneous social networks and propose a multi-relationship community structure mining algorithm.
Based on correlation analysis, we remove redundant relations to obtain the most effective subset of relations. We then use the user's query information as prior knowledge to weigh the importance of each relation and extract a composite relation, and we mine community structure on this composite relation so that the results better match the user's expectations. Experiments on simulated data and standard test data show that the algorithm obtains high-quality communities with low time complexity, and that the resulting community structure better reflects the complex relationships in real networks. The innovations are: 1) unlike previous methods, it takes into account that individuals in a network are linked by several kinds of relations, and analyzes multiple relation graphs together; 2) it takes into account the correlation between relations in the original relation set and removes redundant relations, which improves the analytical capability of the algorithm; 3) it takes into account that different users are interested in different relations, and uses user queries as prior knowledge to extract a new combination of relations, so that the mined communities better match user needs.

Social network analysis and community structure mining are meaningful and challenging topics in data mining with broad applications. They can reveal the relationship between network topology and trends in network behavior, helping people use information and other resources more effectively for judgment, management and decision support in daily life, commercial production, the natural sciences and other areas. The three algorithms proposed in this thesis improve on traditional algorithms in terms of result quality and time complexity, and we expect further studies to build on them.
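The multi-relationship idea in item 4 (drop redundant relations, then weight the remaining ones using a user query) can be sketched as below. The abstract does not give the exact correlation measure or weighting rule, so everything here is an assumption made for illustration: Pearson correlation over edge-indicator vectors, a fixed redundancy threshold, and weights proportional to how many query pairs each relation links. All names and data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sqrt(sum((p - mx) ** 2 for p in x))
    sy = sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx and sy else 0.0


def composite_relation(relations, query_pairs, threshold=0.9):
    """Build a weighted composite relation from several relation graphs.

    relations:   {name: set of frozenset({u, v})}, one edge set per relation
    query_pairs: non-empty list of node pairs the user considers related
    Returns (kept_relation_names, {pair: weight}).
    """
    nodes = sorted({u for es in relations.values() for e in es for u in e})
    pairs = [frozenset(p) for p in combinations(nodes, 2)]
    vec = {r: [1.0 if p in es else 0.0 for p in pairs]
           for r, es in relations.items()}

    # Assumed redundancy rule: skip a relation that is highly correlated
    # with one already kept (insertion order decides which one survives).
    kept = []
    for r in relations:
        if all(abs(pearson(vec[r], vec[k])) < threshold for k in kept):
            kept.append(r)

    # Assumed weighting rule: share of query pairs linked under each relation.
    query = {frozenset(p) for p in query_pairs}
    raw = {r: len(query & relations[r]) / len(query) for r in kept}
    total = sum(raw.values())
    weights = {r: (w / total if total else 1.0 / len(kept))
               for r, w in raw.items()}

    composite = {}
    for r in kept:
        for p in relations[r]:
            composite[p] = composite.get(p, 0.0) + weights[r]
    return kept, composite


def fs(u, v):
    return frozenset((u, v))


# Hypothetical data: "same_lab" duplicates "coauthor" and should be dropped.
relations = {
    "coauthor": {fs(1, 2), fs(2, 3), fs(4, 5)},
    "same_lab": {fs(1, 2), fs(2, 3), fs(4, 5)},
    "email": {fs(1, 2), fs(3, 4)},
}
kept, composite = composite_relation(relations, [(1, 2), (2, 3)])
```

The resulting weighted edge set could then be passed to any community mining procedure that accepts edge weights; which procedure the thesis uses at that stage is not stated in the abstract.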
Keywords/Search Tags:Algorithms