
Research On Community Detection Algorithm And The Application In Code Hosting Platform

Posted on: 2018-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: S T Guan
GTID: 2348330512979737
Subject: Software engineering
Abstract:
With the development of Internet technology, online social media such as Micro-blog, Zhihu, Facebook, and Twitter have grown rapidly, forming a huge social network. This kind of network is an extension of the real world and shares certain characteristics of real society. It reflects people's social attributes and preferences, and finding valuable latent communities in such networks has become an active research topic in recent years.

At the same time, code hosting platforms based on Git and open source projects are also developing rapidly, and as more and more developers participate, a huge developer community has formed. Software developers are the backbone of the rapid development of Internet technology, so it is of great significance to study how to help them communicate and collaborate better.

This thesis takes GitHub, the most popular code hosting platform, as its research target and proposes a community detection method for code hosting platforms. Based on data retrieved from the GitHub website, it presents a method of building a user model from the programming languages of a user's source code repositories and a method of constructing the network topology. It then optimizes the traditional FastUnfolding algorithm and applies it to detect communities in that topology.

The main work of this thesis includes the following aspects:

1. Design a focused web crawler, use it to collect web data, and preprocess that data to obtain the experimental data set.
2. Propose a method of building a user model based on the programming languages of the user's source code repositories, define the relationship between user models and the method of calculating edge weights, and then construct the topology of a weighted network.
3. Study the traditional community detection algorithm. To address the problem that it ignores some nodes in each community in each iteration, this thesis proposes a new method of calculating the weight between two newly formed nodes based on the reconstructed user model. The experimental results show that the optimized algorithm achieves some improvement in modularity Q. Based on a statistical analysis of the experiments, this thesis also proposes a simplified user model, and the experimental results show that the simplified model obtains a higher modularity Q.
4. Design and implement a recommendation system based on the community detection results. The system visualizes the detected communities and the user models, and recommends to each user the other users in the same community together with their source code repositories.
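As a rough illustration of items 2 and 3, the sketch below builds a weighted graph from per-user programming language profiles and evaluates modularity Q for a given partition. It is a minimal sketch under stated assumptions, not the thesis's method: the abstract does not give the edge weight formula, so cosine similarity between language vectors is used here as a stand-in, and the user names, byte counts, and function names are all hypothetical. The modularity function follows the standard weighted definition, Q = sum over communities c of [ w_in(c) / m - (w_tot(c) / 2m)^2 ], where m is the total edge weight, w_in(c) is the weight of the edges inside c, and w_tot(c) is the sum of the weighted degrees of the nodes in c.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical user models: amount of code per programming language,
# aggregated over each user's repositories (values are made up).
users = {
    "alice": {"Python": 800, "Go": 200},
    "bob": {"Python": 600, "Go": 400},
    "carol": {"Java": 900, "Kotlin": 100},
    "dave": {"Java": 700, "Kotlin": 300},
}


def cosine_similarity(a, b):
    """Cosine similarity of two sparse language vectors (assumed weight)."""
    dot = sum(value * b.get(lang, 0) for lang, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_weighted_graph(models, threshold=0.0):
    """Return {(u, v): weight} for every user pair whose similarity
    exceeds the threshold; each undirected edge is stored once."""
    edges = {}
    for u, v in combinations(sorted(models), 2):
        weight = cosine_similarity(models[u], models[v])
        if weight > threshold:
            edges[(u, v)] = weight
    return edges


def modularity(edges, community):
    """Weighted modularity Q of a partition.

    edges: {(u, v): weight}, one entry per undirected edge.
    community: {node: community label}.
    """
    m = sum(edges.values())  # total edge weight
    if m == 0:
        return 0.0
    internal = defaultdict(float)  # edge weight inside each community
    total = defaultdict(float)     # summed weighted degree per community
    for (u, v), w in edges.items():
        total[community[u]] += w
        total[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(internal[c] / m - (total[c] / (2 * m)) ** 2 for c in total)


edges = build_weighted_graph(users)
by_language = {"alice": 0, "bob": 0, "carol": 1, "dave": 1}
one_group = {name: 0 for name in users}
print("Q (grouped by language):", modularity(edges, by_language))
print("Q (single community):   ", modularity(edges, one_group))
```

In this toy data the two language groups share no languages, so the partition that separates them scores a higher Q than placing everyone in one community. A FastUnfolding-style algorithm searches for such a partition by repeatedly moving nodes between communities and merging communities into new nodes whenever doing so increases Q.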
Keywords: Social Network, Complex Network, Community Detection, Web Crawler