Research On Complex Network Data Analysis And Mining Based On Information Entropy

Posted on: 2022-09-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Mu
GTID: 1480306509966429
Subject: Computer Science and Technology
Abstract/Summary:
As an emerging interdisciplinary research field, complex network science abstracts various complex systems into topological structures that retain only their connection patterns. Studying the influence of topological structure on system function is one of the hotspots in complex network analysis, for example, exploring how the topology of social networks affects the speed at which information and disease spread, and how the topology of power networks affects their robustness. As network size increases, the influence of topological structure on system function becomes more uncertain. Information entropy measures the uncertainty of related information between data and provides high-quality information for effective decision-making. Applying information entropy to measure the uncertainty of network structure information is therefore of great significance for predicting network structural features and system function, and it also plays an active role in fields such as disease analysis and drug design. This thesis focuses on analyzing and mining the topological structure and behavior function of complex networks based on information entropy from four aspects: node similarity measurement, graph generation model, structural robustness analysis, and disease subnetwork analysis. Specifically, the main research results are as follows:

To address the volume of complex networks, a node similarity measure based on distance distribution and relative entropy is proposed to calculate the structural similarity between nodes. In this measure, shortest-path information between nodes is introduced, and the topological structure of a node is represented as a distance distribution. The similarity of topological structure between nodes is obtained by using relative entropy to calculate the difference between their distance distributions. Using relative entropy can eliminate the uncertainty of the topological information contained in the distance distribution, which effectively alleviates the inaccuracy of classical node similarity measures. The experimental results show that the measure calculates the structural similarity between nodes in complex networks more accurately.

Considering the variety of complex networks, a graph generation model based on an edge rewiring strategy and relative entropy is proposed to generate networks with adjustable clustering coefficient and average path length. In this model, initial networks with different connection patterns are built using the Havel-Hakimi algorithm and the configuration model. Connection information between nodes and communities is introduced, and each node is represented as an edge distribution. Relative entropy is used to calculate the difference between the edge distributions of nodes, from which the participation coefficient of each edge with respect to the communities is obtained. Based on this participation coefficient, two edges that have a greater impact on the network topology are selected, and the local effect functions of clustering coefficient and average path length are calculated to determine the rewiring operation. Using relative entropy can eliminate the uncertainty of the topological information contained in the edge distribution, which effectively reduces the computational cost of the iterative process and alleviates the problem of local extrema. The experimental results indicate that this model can quickly adjust clustering coefficient and average path length, so that the generated networks show diversity.

Aiming at the velocity of complex networks, a structural robustness adjusting algorithm based on a rewiring mechanism and Shannon entropy is proposed to adjust the structural robustness of a network. In this algorithm, edge weights are defined by the degrees of the nodes, and an edge selection rule based on roulette-wheel selection is introduced. Shannon entropy based on the eigenvalue distribution of the adjacency matrix is used to measure the structural robustness of the network, and the optimal rewiring operation is determined by connectivity and Shannon entropy. Using Shannon entropy can eliminate the uncertainty of the topological information contained in the eigenvalue distribution, which allows the structural robustness of the network to be measured accurately. The experimental results show that this algorithm can quickly adjust the topological structure to improve the structural robustness of networks.

For the veracity of complex networks, an algorithm for extracting disease-related subnetworks based on the structure entropy minimization principle is proposed to construct a subnetwork that is highly relevant to a specific disease. The algorithm constructs the initial subnetwork from a known protein set. The principle of minimizing structure entropy is introduced from the perspective of protein complexes, and it drives both the division of subnetworks and the addition of candidate nodes in subsequent iterations. By exploring the qualitative relationship between the increase in structure entropy and the variance of the edge number sequence (the numbers of edges connecting an added node to each existing module), the problem of minimizing structure entropy can be replaced with the problem of maximizing the variance of the edge number sequence, which improves calculation efficiency. The use of structure entropy can eliminate the uncertainty contained in the edge information between nodes and modules, so that subnetworks highly correlated with diseases can be constructed accurately. The algorithm is used to extract the depression disease subnetwork from the human protein interaction network, and it predicts the key proteins, functional modules, and pathways related to the disease more accurately.

Based on the characteristics of complex networks, this thesis combines node similarity measurement, graph generation model, structural robustness adjusting, and disease subnetwork analysis to further explore the topological structure and behavior function of complex networks. The research results enrich the understanding of the relationship between topological structure measurement and behavior function of complex networks, and provide support for future research on complex network applications.
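
To make the first contribution more concrete, the following is a minimal Python sketch of a node similarity measure built from distance distributions and relative entropy. It is not the thesis's implementation: the abstract does not give the exact formula, so the function names (`distance_distribution`, `relative_entropy`, `structural_dissimilarity`), the smoothing constant `eps`, and the symmetrized form of the relative entropy are all assumptions made for illustration. The graph is assumed to be connected, unweighted, and given as an adjacency dictionary.

```python
from collections import deque
from math import log


def distance_distribution(adj, source, max_dist):
    """Fraction of the other nodes found at each hop distance 1..max_dist.

    Distances come from a breadth-first search, so the graph is treated
    as unweighted. Assumes the graph is connected with at least 2 nodes.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    counts = [0] * max_dist
    for d in dist.values():
        if 1 <= d <= max_dist:
            counts[d - 1] += 1
    total = sum(counts)
    return [c / total for c in counts]


def relative_entropy(p, q, eps=1e-9):
    """Kullback-Leibler divergence D(p || q) in nats.

    Assumption: both distributions are smoothed by eps and renormalized
    so the divergence stays finite when q has zero entries.
    """
    ps = [x + eps for x in p]
    qs = [x + eps for x in q]
    sp, sq = sum(ps), sum(qs)
    return sum((a / sp) * log((a / sp) / (b / sq)) for a, b in zip(ps, qs))


def structural_dissimilarity(adj, u, v, max_dist):
    """Symmetrized relative entropy between two nodes' distance distributions.

    Assumption: averaging D(p || q) and D(q || p) is one common way to get
    a symmetric score; smaller values mean more similar structural roles.
    """
    p = distance_distribution(adj, u, max_dist)
    q = distance_distribution(adj, v, max_dist)
    return 0.5 * (relative_entropy(p, q) + relative_entropy(q, p))


# Hypothetical example: a path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
max_dist = len(path) - 1  # longest possible shortest path in this graph

# The two end nodes see identical distance distributions.
print(structural_dissimilarity(path, 0, 4, max_dist))
# An end node differs more from the center than a neighbor of the center does.
print(structural_dissimilarity(path, 0, 2, max_dist))
print(structural_dissimilarity(path, 1, 2, max_dist))
```

In this sketch, two nodes that are far apart in the network can still score as structurally similar, because only the shape of each node's distance distribution is compared, not the identity of its neighbors.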
Keywords/Search Tags:Complex Networks, Topological Structure, Information Entropy, Node Similarity, Generation Graph Model, Structural Robustness, Disease Subnetwork