Font Size: a A A

Optimization Methods For Structure Identification In Complex Networks

Posted on:2022-06-20Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y G BaiFull Text:PDF
GTID:1480306605989069Subject:Applied Mathematics
Abstract/Summary:PDF Full Text Request
Network science research has progressed rapidly and become an important research tool in sociology,computer science,biology and other multidisciplinary fields.The key to understanding the network is to dig deeper into its intrinsic topology.In the era of big data,how to accurately and efficiently extract network structural features from massive connection relationships is an NP-hard problem.Effective solutions with a theoretical basis are urgently needed in multiple fields.Based on this,this thesis focuses on the check-in node coverage problem and community partitioning problem in complex networks.An in-depth study of these two structure identification problems can provide a theoretical basis and methodological support for many network characteristics studies.For the check-in node coverage problem,we propose the greedy algorithms which are excellent in theoretical lower bounds and practical calculations,based on the sub-modular optimization theory with the improved iterative selection mechanism and indicators.For the community partitioning problem,we propose the semi-supervised global convergence efficient algorithms according to the max-flow model,based on the prime-dual theory with the improved benchmark node selection and algorithm convergence mechanism.This thesis investigates the problem of check-in node identification under cost factors and gives efficient greedy algorithms for two cases using sub-modular theory.This thesis proposes a cost-based check-in node coverage problem from the network's discrete characteristics,focusing on two critical factors:1)minimizing the cost problem under full coverage;2)maximizing the coverage problem under the cost constraint.For the minimization cost problem:we give three 'bad' cases that are very likely to arise as the traditional greedy algorithm solves the network coverage:'one-to-many','many-to-one',and the coverage redundancy problem.We propose a new index for contribution density to address this finding and introduce an efficient two-stage backtracking mechanism in the new algorithm to overcome the redundancy phenomenon.Through theoretical demonstration,the new metric solves the first two'bad'cases while maintaining the greedy algorithm's theoretical solution.In several artificial and real-world networks,it is found that the proposed contribution density-based greedy algorithm and the contribution density-based backtracking greedy algorithm significantly improve the solution quality.The new backtracking extraction mechanism can effectively screen out a large number of redundant nodes,which is of profound significance for practical computation.For the maximize coverage problem:this thesis innovates both theoretical lower bounds for solving the problem and practical computations.To enhance the theoretical lower bound of results,we propose a hybrid mechanism based on two strategies of coverage priority and contribution density priority and prove the selection advantage of K?3 coverage priority through sub-modular theory.To enhance the utilization problem of traditional greedy algorithms for cost,a multi-stage strategy is proposed to fill the cost gap.It is theoretically demonstrated that the hybrid greedy algorithm proposed in this thesis improves the lower bound of the algorithm approximation solution from 1/2(1-1/e)to(1-1/e)compared with the state-of-the-art results.In addition,considering the difference between the theoretical lower bound and the actual computation,we also deeply analyze the actual computational effect of the algorithm and find that the correlation between r and |S| has a profound impact on the performance of the hybrid greedy algorithm.Using Pearson correlation analysis,we give the adaptive range of the hybrid greedy algorithm,i.e.,when the correlation is close to 1,the hybrid algorithm is theoretically superior and superior in the practical computational effect at the same time.This thesis also studies the network partitioning problem,proposes a convex model based on semi-supervised network partitioning,gives the two-stage optimization algorithm and the corresponding accelerated optimization algorithm.Starting from the network's clustering characteristics,we focus on two critical factors in network partitioning:1)How to obtain the global optimal solution and improve the accuracy of network partitioning by expanding the number of benchmark nodes;2)how to improve the convergence of the network partitioning algorithm with the optimization perspective.For the network partitioning accuracy problem:this thesis proposes a convex optimization model based on the minimum cut theory and the centrality attribute matrix.To solve the L1 parametric derivative computation problem in the model,we propose an ALM solution algorithm based on the primer-dual,which skillfully avoids the processing of L1 parametric terms.In order to improve the accuracy of network partitioning,this thesis introduces a two-state solution framework TSOS,which can effectively expand the number of benchmark nodes by deeply mining the structural information inherent in the network and using the new 'confidence' index.Experiments show that the two-stage solving framework TSOS expands the benchmark information to a certain extent and effectively improves the network classification accuracy.For the algorithms' acceleration problem:this thesis cleverly adopts KL distance to replace the original regular term and gives a network partitioning model based on KL regular term.Also,we propose an ALM based on KL distance,where the new algorithm reduces the number of unknown variable updates in each iteration and the variable mapping calculation operation,which significantly reduces the calculation amount of each iteration.Besides,in order to accelerate the convergence of the algorithm,a second-order convergent network partitioning algorithm based on the Bregman-FISTA form is presented in this work,using the new results in Mathematical Programming.Experiments show that the KL based accelerated algorithm and the FISTA based accelerated algorithm have an essential improvement in convergence speed as well as partition accuracy.This is a significant contribution to the solution of large-scale network partitioning problems.
Keywords/Search Tags:Check-in node coverage, Network partition, Greedy algorithm, Convex optimization model, Semi-supervised learning, First-order accelerate algorithm, KL distance
PDF Full Text Request
Related items