Font Size: a A A

Algorithms Of Community Detection And Link Prediction In Social Networks

Posted on:2013-05-20Degree:DoctorType:Dissertation
Country:ChinaCandidate:L HuangFull Text:PDF
GTID:1220330392955568Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Social network analysis is the cross-research area in both social sciences and computersciences. With the prevalence of Web2.0, online social network service (SNS) websitessuch as facebook, flicker and livejournal are getting more and more popular on the inter-net. Similar to the traditional social network that consist of dozens of nodes, online socialnetwork can be represented as a social graph G=<V, E>, where V represents the entitiesand E represents the relationship between them. However, compared to the traditional so-cial networks, online social networks have some new features. Firstly, the scale of onlinesocial networks are becoming more and more large due to the scale of the internet. Thesocial networks adopted in early community detection algorithms, e.g. the famous bench-mark social network Zachary karate club (consists of34people in the club), are small socialnetworks. Yet the online social networks such as the LiveJournal consist of tens of million-s members and uncountable relationships. Secondly, the relationships between people arebecoming more complicated. The instantaneous and random interaction between million-s of people have created the uncountable relationship in online social networks. We needsome new technologies and algorithms to handle these situations. In this paper, we focuson the community detection and link prediction in online social networks with some newalgorithms.There are several traditional algorithms for community detection: Graph cut can clus-ter the nodes of the social network into several communities, but need to specify the thenumbers and the scales of the clusters. The hierarchical clustering can handle most commu-nity detection job, however, it can often get diferent output. The modularity methods canfind approximate optimized communities, but the computational cost is very high. Thus,it unsuitable for the online social network. Our work focus on the community detectionin online social network through the introduced regularized spectral clustering algorithm.Similar to the spectral clustering algorithm, the regularized spectral clustering algorithmchooses relatively small sample sets of the given social network. This approach can reduce the computational cost and make it suitable for the online social network.The graph construction is the key step of the regularized spectral clustering algorithm.The traditional graph construction based on the pairwise Euclidean distance needs hugecomputational capability. It doesn’t integrate the overall contextual information of the socialnetworks too. We introduce the sparse1-graph to handle this situation. In addition, arandom walk on the given directed graph for graph laplacian is introduced in our work forthe directed social network.In our work, we also work on the link prediction problem in online social network.The link prediction problem can be formulated as a binary classification problem, and canbe handled using the machine learning algorithm. A sparse classification algorithm calledT runcated Kernel Pro jection Machine, based on empirical feature selection, is proposedbased on empirical risk minimization principal. It is a novel way to realize sparse empiricalfeature-based learning diferent from the regularized kernel projection machines. Some the-oretical theorems are also proposed and analyzed in our work. Experimental results showthat the algorithm outperformed the previous algorithm in several key indices with smallertest errors and more stability.There are many unlabeled samples in online social network. A sparse semi-supervisedclassification algorithm called S el f T raining S emi supervised T runcated Kernel Pro jection Machine, is proposed by using these unlabeled samples for link prediction too. Inthe experiments, more accurate and stable result can be achieved using the proposedsemi-supervised algorithm, and the algorithm convergence in several iterations, thus theunlabeled samples can be used efectively by using this algorithm.
Keywords/Search Tags:Social networks, Regular spectral clustering algorithm, e~1-graph, Truncated K-ernel Projection Machine, Self training semi-supervised Kernel Projection Ma-chine
PDF Full Text Request
Related items