| There are different types of graph data in real-life applications,such as proteinprotein interaction networks,communication networks and collaboration networks.In these graphs,communities typically exist as groups of vertices that are densely connected.Mining cohesive subgraphs from a network is a fundamental problem in network analysis,and it is very useful in numerous applications including community discovery,protein complex mining,spam detection and so on.In various graph data applications,there is a kind of graph whose edges contain positive and negative information,in which the posivtive edge represents the “friend”,the negative edge represents the “enemy”.In recent year,the analysis and processing of a signed graph has been attracted much by academics.Unfortunately,most existing cohesive subgraph models are mainly tailored to unsigned networks.Therefore,it is improtant to define a new dense subgraph model for signed networks.Firstly,we propose a novel dense subgraph model in signed networks in this thesis,which is called maximal(α,k)-clique.Specifically,a maximal(α,k)-clique is a clique in which every node has at most k negative neighbors and at least ?αk ?? ? positive neighbors(α ≥1).We focus on two fundenmental problems to find dense subgraphs in signed networks:(i)enumerating all maximal(α,k)-cliques,(ii)finding the top-r maximal(α,k)-cliques.To enumerate all maximal(α,k)-cliques efficiently,we first develop an elegant signed network reduction technique to significantly prune the signed network,including a basic pruning technique MCCore(maximal constrained ?αk ?? ?-core)and a new pruning technique MCNew(new maximal constrained ?αk ?? ?-core).The striking feature of MCNew is that its worst case time complexity is O(δm),where δ is the arboricity of the signed graph,m is the number of edges.Note that the arboricityδ is bounded byO(m),and it is often much smaller than such a worst case bound in real-life graphs.Then,we present an efficient branch and bound enumeration algorithm with several carefully-designed pruning rules to enumerate all maximal(α,k)-cliques in the reduced signed network.To facilitate the enumeration of top-r maximal(α,k)-cliques,we propose several new pruning technologies in the enumeration process,which greatly improve the performance of the top-r maximal(α,k)-cliques enumeration algorithm.The results of extensive experiments on five large real-life datasets demonstrate the efficiency,scalability and effectiveness of our algorithms.The most impressive result is that our algorithm takes less than 1000 seconds to enumerate all maximal(α,k)-cliques under most parameter settings in a signed network with more than 1.6 million nodes and 30.6 million edges.To measure the quality of a cohesive subgraph,we propose a signed conductance metric in signed graph.We show that our model consistently outperforms the other models in terms of the signed conductance metric.And we also use a specific case study to evaluate the effectiveness of our model and prove the superiority of the model proposed in this thesis. |