Optimization Methods For Structure Identification In Complex Networks

Posted on:2022-06-20

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y G Bai

Full Text:PDF

GTID:1480306605989069

Subject:Applied Mathematics

Abstract/Summary:

PDF Full Text Request

Network science research has progressed rapidly and become an important research tool in sociology,computer science,biology and other multidisciplinary fields.The key to understanding the network is to dig deeper into its intrinsic topology.In the era of big data,how to accurately and efficiently extract network structural features from massive connection relationships is an NP-hard problem.Effective solutions with a theoretical basis are urgently needed in multiple fields.Based on this,this thesis focuses on the check-in node coverage problem and community partitioning problem in complex networks.An in-depth study of these two structure identification problems can provide a theoretical basis and methodological support for many network characteristics studies.For the check-in node coverage problem,we propose the greedy algorithms which are excellent in theoretical lower bounds and practical calculations,based on the sub-modular optimization theory with the improved iterative selection mechanism and indicators.For the community partitioning problem,we propose the semi-supervised global convergence efficient algorithms according to the max-flow model,based on the prime-dual theory with the improved benchmark node selection and algorithm convergence mechanism.This thesis investigates the problem of check-in node identification under cost factors and gives efficient greedy algorithms for two cases using sub-modular theory.This thesis proposes a cost-based check-in node coverage problem from the network's discrete characteristics,focusing on two critical factors:1)minimizing the cost problem under full coverage;2)maximizing the coverage problem under the cost constraint.For the minimization cost problem:we give three 'bad' cases that are very likely to arise as the traditional greedy algorithm solves the network coverage:'one-to-many','many-to-one',and the coverage redundancy problem.We propose a new index for contribution density to address this finding and introduce an efficient two-stage backtracking mechanism in the new algorithm to overcome the redundancy phenomenon.Through theoretical demonstration,the new metric solves the first two'bad'cases while maintaining the greedy algorithm's theoretical solution.In several artificial and real-world networks,it is found that the proposed contribution density-based greedy algorithm and the contribution density-based backtracking greedy algorithm significantly improve the solution quality.The new backtracking extraction mechanism can effectively screen out a large number of redundant nodes,which is of profound significance for practical computation.For the maximize coverage problem:this thesis innovates both theoretical lower bounds for solving the problem and practical computations.To enhance the theoretical lower bound of results,we propose a hybrid mechanism based on two strategies of coverage priority and contribution density priority and prove the selection advantage of K?3 coverage priority through sub-modular theory.To enhance the utilization problem of traditional greedy algorithms for cost,a multi-stage strategy is proposed to fill the cost gap.It is theoretically demonstrated that the hybrid greedy algorithm proposed in this thesis improves the lower bound of the algorithm approximation solution from 1/2(1-1/e)to(1-1/e)compared with the state-of-the-art results.In addition,considering the difference between the theoretical lower bound and the actual computation,we also deeply analyze the actual computational effect of the algorithm and find that the correlation between r and |S| has a profound impact on the performance of the hybrid greedy algorithm.Using Pearson correlation analysis,we give the adaptive range of the hybrid greedy algorithm,i.e.,when the correlation is close to 1,the hybrid algorithm is theoretically superior and superior in the practical computational effect at the same time.This thesis also studies the network partitioning problem,proposes a convex model based on semi-supervised network partitioning,gives the two-stage optimization algorithm and the corresponding accelerated optimization algorithm.Starting from the network's clustering characteristics,we focus on two critical factors in network partitioning:1)How to obtain the global optimal solution and improve the accuracy of network partitioning by expanding the number of benchmark nodes;2)how to improve the convergence of the network partitioning algorithm with the optimization perspective.For the network partitioning accuracy problem:this thesis proposes a convex optimization model based on the minimum cut theory and the centrality attribute matrix.To solve the L1 parametric derivative computation problem in the model,we propose an ALM solution algorithm based on the primer-dual,which skillfully avoids the processing of L1 parametric terms.In order to improve the accuracy of network partitioning,this thesis introduces a two-state solution framework TSOS,which can effectively expand the number of benchmark nodes by deeply mining the structural information inherent in the network and using the new 'confidence' index.Experiments show that the two-stage solving framework TSOS expands the benchmark information to a certain extent and effectively improves the network classification accuracy.For the algorithms' acceleration problem:this thesis cleverly adopts KL distance to replace the original regular term and gives a network partitioning model based on KL regular term.Also,we propose an ALM based on KL distance,where the new algorithm reduces the number of unknown variable updates in each iteration and the variable mapping calculation operation,which significantly reduces the calculation amount of each iteration.Besides,in order to accelerate the convergence of the algorithm,a second-order convergent network partitioning algorithm based on the Bregman-FISTA form is presented in this work,using the new results in Mathematical Programming.Experiments show that the KL based accelerated algorithm and the FISTA based accelerated algorithm have an essential improvement in convergence speed as well as partition accuracy.This is a significant contribution to the solution of large-scale network partitioning problems.

Keywords/Search Tags:

Check-in node coverage, Network partition, Greedy algorithm, Convex optimization model, Semi-supervised learning, First-order accelerate algorithm, KL distance

PDF Full Text Request

Related items

1	Neural Network Algorithm For Node Classification
2	Research For Community Detection Algorithm Based On Semi-supervised Learning
3	Research On Bayesian Network Structure Learning Algorithm Based On Node Chunk Sequence
4	Research On Semi-supervised Graph Node Classification Method Fused With Local Neighborhood Information
5	Research On Semi-supervised Graph Neural Network Method Based On Graph Fusion
6	Research On Node Influence Measurement And Influence Maximization Based On Topological Structure Of Networks
7	Research On Human Abdominal Fat Area Prediction Model Based On Semi Supervised Learning
8	RNA Secondary Structure Shape-base Pair Distance And Semi-supervised Clustering Algorithm
9	Research On SSVEP Classification Algorithm Based On Semi-supervised Ladder Network
10	Community Detection Algorithm Based On Deep Learning