Network Topology Inference From Incomplete Graph Signals

Posted on:2022-07-03

Degree:Doctor

Type:Dissertation

Country:China

Candidate:X Yang

Full Text:PDF

GTID:1480306602493594

Subject:Communication and Information System

Abstract/Summary:

PDF Full Text Request

With the rapid proliferation of information technology,networks are playing more and more important roles in people's life.Due to this,massive amounts of data are generated from networks in numerous applications.Different from conventional time-domain signals,data that reside on networks often have irregular and complex structures.As a mathematical abstraction of network topology,graphs offer an ability to describe the structure of various networks,such as communication networks,transport networks and social networks.We refer to such structured data as graph signals and the emerging field of signal processing on graphs has also attracted much attention in the recent years.By modelling data residing on networks as graph signals,signal processing on graphs extends signal processing concepts and methodologies from the classical signal processing theory to data indexed by general graphs.Numerous graph signal processing(GSP)works have been introduced in the past decade for analyzing structured data on a priori known graphs.However,there are often settings where the graph is not readily available,and the structure of the data has to be estimated in order to permit effective representation,processing and analysis of graph signals.Moreover,it should be noted that many real-world applications suffer a common drawback,i.e.,missing data.The missing data problem poses a great challenge to graph topology inference,since the estimation error of missing data will severely affect the accuracy of network topology inference.Therefore,it is imperative to design graph learning approaches to handle such cases.In view of this,the dissertation focuses on two different missing data mechanisms,i.e.,situation where data are missing completely at random and situation where there are hidden nodes,and aims to develop efficient methods to infer network topologies from incomplete graph signals.The main contributions of this dissertation are summarized as follows:1.Network topology inference from homogeneous incomplete graph signals Chapter 2 of this dissertation first introduces the graph signal model we used,i.e.,Gaussian graphical model(GGM)and network topology inference methods based on GGMs.In face of the challenges brought by missing data,we investigate how to infer the network topology from homogeneous incomplete graph signals,i.e.,incomplete data from single network.Particularly,two situations with missing data are taken into account,including the situation where nodal variables are not always observed and situation where there are nodal variables which are never observed(typically referred to as hidden nodes).For the first situation with data missing completely at random(MCAR),we first adopt a computationally tractable procedure to recover the covariance matrix of data,and then cast the topology inference problem in terms of estimating the precision matrix that has a form of graph Laplacian.The experimental results demonstrate that the topology inference performance of this method can be improved with the increase of the number of samples or the decrease of the missing data ratio.However,the graph estimation accuracy declines dramatically when a large amount of data are missing or the number of samples is insufficient.For the second situation with hidden nodes,we decompose the precision matrix into a sparse matrix and a low-rank matrix so that the summary of the effect of hidden nodes can be characterized by the low-rank matrix.Then considering the Laplacian constraints on the precision matrix,the topology of sub-network consisting of observed nodes can be inferred by the regularized maximum-likelihood(ML)approach.However,we still know little about the whole network topology.2.Joint inference of multiple related network topologies from randomly-missing graph signalsFaced up with the limitation of the method described in Chapter 2,Chapter 3 of the dissertation proposes a framework that relies on heterogeneous incomplete data from a collection of related networks to identify multiple network topologies simultaneously.So that we can borrow information across the multiple related networks to eliminate the impact of missing data.Considering the Gaussian graphical model and the scenario where date are missing completely at random,we first establish an unbiased estimator for the covariance matrices and then develop algorithms based on the alternating direction method of multipliers(ADMM)to jointly estimate graph topologies,rather than estimate each graph topology separately.Moreover,non-asymptotic statistical analysis is provided,which proves the consistency of the graph estimator and enables us to investigate the effect of several key factors on the graph estimation error bound,such as the ratio of missing data,the number of graph signals and the number of multiple related networks.Furthermore,based on the consistent graph estimator,an adaptive algorithm that utilizes the reweighting scheme is proposed to improve the estimation accuracy of the graph-edge structure.Finally,we evaluate our method on both real and synthetic datasets,and the experimental results demonstrate the advantage of our method in comparison with benchmarking algorithms.3.Dynamic topology inference for networks with hidden nodesIn Chapter 4 of this dissertation,we propose a two-step topology inference method for networks with hidden nodes.The framework relies on heterogeneous incomplete data from time-varying networks and can not only identify the topologies of sub-networks consisting of observed nodes,but also infer the whole topologies of networks.For each graphical model,we assume that a set of nodes are unobserved,i.e.,are hidden nodes,and the set of hidden nodes may be different across different times,which means that we can still exploit the side information from related networks to mitigate the effect of hidden nodes.Based on the assumption,we first infer the observed network topologies at different times jointly,and then infer the complete network topologies by utilizing the similarity among them.Finally,the experimental results demonstrate the superiority of the proposed two-step method in case of hidden nodes.Moreover,it is shown that the topology inference performance of this method can be improved when the number of samples increases or the number of hidden nodes decreases.

Keywords/Search Tags:

graph signal processing, network topology inference, missing data, heterogeneous data

PDF Full Text Request

Related items

1	Statistical Inference For Longitudinal Data With Missing Data Based On The Conditional QIF
2	The Statistical Inference Of Zero-inflated Negative Binomial Regression Model With Missing Data
3	The Data Analysis Of GEE And QIF Used In The Asthma Data With Missing Data
4	Collapsibility Of Variables And Statistical Inference Of Missing Data
5	EM Algorithm For Binary Markov Chains Of Longitudinal Data With Missing Data
6	Graph Signal Processing On Complex Network
7	Statistical Inference Of High-dimensional Complex Data Networks And Node Attributes
8	Statistical Inference Of Semiparametric Regression Models With Nonignorable Missing Data
9	A Study Of The Navigable Data Producing System And Its Key Techniques
10	Statistical Inference For Estimating Equations Statistical Models With Missing Data