Generalization of the Stochastic Block Model and Its Application in Complex Network Analysis

Posted on: 2022-06-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D X Mo
GTID: 1480306347951689
Subject: Mathematics
Abstract/Summary:
Complex network analysis, as an emerging discipline, has attracted considerable attention from researchers in many fields. Qualitative and quantitative analysis of complex networks helps reveal their latent characteristics and the common laws governing complex systems, and it is of great significance in disciplines such as biomedicine, the social sciences, and financial engineering. Community detection is one of the effective ways to reduce the complexity of complex systems: it helps people analyze and understand how individuals are organized within a system, gain deeper insight into the system's evolution mechanism, and thereby improve and make use of complex systems. As one of the most influential statistical network models, the stochastic block model is widely used for community detection on complex networks. In recent years, scholars have proposed a variety of generalized stochastic block models tailored to the community detection problems of different complex networks, such as the mixed-membership stochastic block model, the overlapping stochastic block model, and the degree-corrected stochastic block model. However, among the existing generalized stochastic block models, few address sparse, weighted, and special networks. For some complex network structures, such as multi-layer networks, multi-subject networks, and dynamic networks, effective research tools are still lacking. In view of these shortcomings, this work reviews the latest literature, discusses the generalization and theoretical properties of the stochastic block model for multi-layer, multi-subject, and dynamic networks, and applies the proposed models to detect the underlying structure of complex networks. The main contributions are as follows:

1. We study the generalization of the stochastic block model to multi-layer networks and the associated community detection problem. We propose the multi-layer weighted stochastic block model and the restricted multi-layer weighted stochastic block model, and establish their connection with the binary multi-layer stochastic block model. When the number of communities in the network is large, the restricted multi-layer weighted stochastic block model, which has a more parsimonious parameterization, is more advantageous than the multi-layer weighted stochastic block model. We present a method based on likelihood ratio testing for choosing between multi-layer weighted stochastic block models, which can be used to select an appropriate model in empirical research. In addition, we derive a variational expectation-maximization algorithm to estimate the parameters of interest. On the theoretical side, we derive an exact recovery bound on the probability of misclassification in the framework of multi-layer weighted stochastic block models. We prove that this bound is controlled by the sum, across all network layers, of the Rényi divergence of order 1/2 between the within-community and between-community edge distributions. The summed Rényi divergence thus arises as a fundamental quantity measuring the difficulty of the community detection problem. In the simulation studies, we demonstrate the advantages of the proposed model over four competing methods (the weighted stochastic block model based on the average adjacency matrix, the weighted stochastic block model based on a voting algorithm, spectral clustering based on the average adjacency matrix, and spectral clustering based on a voting algorithm). The simulations show that our model is more robust and effective than the other methods when some layers carry only weak information about the community structure or when the signal strength changes across layers. Finally, we examine the model on a financial network, a computer department network, and a bicycle-sharing network.

2. We study the generalization of the stochastic block model to multi-subject networks and the associated community detection problem. Motivated by the generalized linear model, we relate the intensity matrix to individual characteristics and propose a general modeling framework for the multi-subject stochastic block model. The proposed model is suitable for analyzing binary and weighted, sparse and dense multi-subject networks, and can describe the influence of individual characteristic variables on the network community structure. To investigate this influence, we divide the multi-subject stochastic block model into a homogeneous and a heterogeneous variant. In the homogeneous model, the individual characteristic variables have the same influence on every community, whereas in the heterogeneous model their influence differs across communities. To estimate the model parameters and determine the number of communities in the network, we derive a variational expectation-maximization algorithm and an ICL criterion for selecting the number of communities. In terms of theory, we explore the consistency of the multi-subject stochastic block model: when the number of communities grows with the number of nodes, increasing the number of network nodes and the number of subjects makes the log-likelihood of the model converge to its expectation. We compare the proposed model with five competing methods (the binomial weighted stochastic block model, the fast Louvain algorithm, the Newman spectral algorithm, the fast Louvain consensus algorithm, and the Newman spectral consensus algorithm) on multi-subject networks with homogeneous, heterogeneous, and core community structures. The simulation results show that the proposed model performs better than the single-layer stochastic block model based on accumulated information and than the methods based on modularity optimization; that edge weights provide additional information for community detection; and that thresholding a weighted network into a binary network is suitable only for networks with homogeneous and heterogeneous community structures, not for networks with core community structures. Finally, we apply the binary and weighted multi-subject stochastic block models to the functional brain networks of ADHD patients and partition the regions of interest into communities. Studying the influence of individual characteristic variables on the community structure, we find that age has a significant influence on the connection probability and intensity of the community structure. The proposed model can therefore uncover the influence of individual characteristic variables on the community structure while detecting the community structure itself in multi-subject networks. These results enrich the statistical tools for studying multi-subject networks, especially in neuroscience.

3. We study the generalization of the stochastic block model to dynamic networks and the associated community detection problem. We propose a dynamic stochastic block model with node attributes, which is suitable for analyzing binary and weighted, sparse and dense dynamic networks. In addition, we further relax the assumption that a node's community does not change over time, and propose a dynamic stochastic block model with node attributes in which nodes may change communities over time. To estimate the model parameters and community assignments, we present a variational expectation-maximization algorithm for the dynamic stochastic block model with node attributes. To avoid a subjective choice of the number of communities in the dynamic network, we give an ICL criterion that determines the optimal number of communities from the observed network data. In theory, we study the consistency of the dynamic stochastic block model with node attributes and explore the relationship between the number of communities, the number of nodes, and the length of the time span. Under different scenarios, we compare the community detection ability of the dynamic stochastic block model with node attributes against that of the dynamic stochastic block model without node attributes. The simulation results show that including node attributes effectively improves community detection, especially when the network connection information is insufficient. Finally, we verify the validity of the proposed model on a stock relationship network and on bicycle-sharing network data.
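
To make the summed Rényi divergence from the first contribution concrete, the following is a minimal sketch for the simplest case of binary layers, where every within-community edge in a layer is Bernoulli(p) and every between-community edge is Bernoulli(q). This Bernoulli setting, the function names, and the example edge probabilities are all illustrative assumptions, not values or code from the dissertation, and the weighted layers studied there would use other edge distributions. The sketch relies only on the standard definition of the order-1/2 Rényi divergence, D_{1/2}(P‖Q) = -2 log Σ_x sqrt(P(x) Q(x)).

```python
import math


def renyi_half_bernoulli(p: float, q: float) -> float:
    """Renyi divergence of order 1/2 between Bernoulli(p) and Bernoulli(q).

    Uses D_{1/2}(P||Q) = -2 * log(sum_x sqrt(P(x) * Q(x))).
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
        raise ValueError("p and q must be probabilities in [0, 1]")
    # Bhattacharyya coefficient of the two Bernoulli distributions.
    bc = math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
    if bc == 0.0:
        # Disjoint supports: the divergence is infinite.
        return math.inf
    return -2.0 * math.log(bc)


def total_divergence(layers):
    """Sum the per-layer divergences over (within, between) probability pairs."""
    return sum(renyi_half_bernoulli(p, q) for p, q in layers)


# Hypothetical three-layer network: (within-community, between-community)
# edge probabilities. The second layer carries almost no community signal.
layers = [(0.30, 0.10), (0.12, 0.10), (0.25, 0.05)]

per_layer = [renyi_half_bernoulli(p, q) for p, q in layers]
total = total_divergence(layers)

for index, value in enumerate(per_layer, start=1):
    print(f"layer {index}: D_1/2 = {value:.5f}")
print(f"sum over layers: {total:.5f}")
```

In this toy setting the weak second layer contributes very little to the sum, while the informative layers dominate it. This matches the abstract's description of the summed divergence as the quantity that measures how difficult community detection is across layers.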
Keywords/Search Tags:Stochastic block model, Variational inference, ICL criterion, Multi-layer network, Multi-subject network, Dynamic network, Rényi divergence