
Research On Semi-supervised Learning Methods For Attributed Network

Posted on: 2019-04-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S K Wang
GTID: 1368330566998790
Subject: Computer application technology
Abstract/Summary:
With the rapid development of technology over the last two decades, both the availability of network data and the number and sophistication of network analysis techniques have grown quickly. Previous network mining approaches usually treat networks as objects of pure topology: unadorned sets of nodes and their interactions. This is a strong limitation in many real-world applications, because most information networks are accompanied by features or attributes that describe properties of the nodes. For example, in the Weibo system, users are linked to each other by follower relationships, and they also express their opinions in posts, which serve as textual attributes. Both the attributes and the links carry useful information about users' categories, e.g., their interests. An effective model for network data should therefore consider both the attributes of the nodes and the relational structure during network mining. We refer to networks of this type as attributed networks.

Attributed network mining is valuable because it can help people discover information hidden in complex systems. Studies of social networks, for instance, can predict users' interests and support personalized recommendation; studies of protein interaction networks can predict protein functions. In many real-world applications, acquiring a sufficient amount of labeled data is very expensive or impossible, so learning from sparsely labeled data is desirable. When labeled data is sparse, a sensible solution is to adopt a semi-supervised learning scheme that exploits unlabeled data. In the literature, however, there are few research results and mining techniques for semi-supervised learning on attributed networks. This dissertation therefore focuses on semi-supervised learning problems for attributed networks. We mainly consider semi-supervised classification on attributed networks, multi-attribute and higher-order relational learning, semi-supervised multi-label classification, and node alignment across attributed networks. A series of algorithms is proposed to solve these problems. The main contributions of this dissertation are as follows.

First, the MARL algorithm is proposed for semi-supervised classification on attributed networks. The algorithm can use both multiple-attribute and single-relational information, while also exploiting unlabeled data for prediction. An effective Expectation Maximization algorithm is derived to compute the label probability distribution and generate the class label for a given instance. Experimental results on various datasets demonstrate the superiority of MARL over the compared algorithms, and also show that MARL delivers robust performance as the amount of labeled data varies. The MGGM algorithm is proposed for multi-attribute, multi-relational network learning. A multiple-network ensemble regularization framework is proposed to combine the multiple relational networks, and the solutions are derived within the Expectation Maximization optimization framework. Experimental results on real-world datasets demonstrate the effectiveness of MGGM.

Second, the HRGM algorithm is proposed for multi-attribute and higher-order relational learning. The algorithm fuses multi-view attributes and higher-order relational information through a hypergraph regularized generative model. To exploit higher-order relations and unlabeled data, a hypergraph regularizer is proposed. On the one hand, the hypergraph regularizer models higher-order information among linked instances. On the other hand, it uses unlabeled data to propagate label information by means of semi-supervised learning. Experimental results on various datasets show that this approach outperforms the compared collective classification methods and multi-view classification methods. The results also illustrate the advantage of HRGM when only an extremely small portion of the data is labeled, which indicates that HRGM is effective and robust.

Then, the GMHR algorithm is proposed for semi-supervised multi-label classification. GMHR takes all of the heterogeneous information into consideration, including attribute features, interaction networks, label dependency, and unlabeled data. Two hypergraph regularizers are incorporated into GMHR. An instance-network hypergraph is constructed to exploit the higher-order interactions among instances, and a label-correlation hypergraph is built to capture the higher-order dependency among labels. An iterative algorithm is introduced to solve the model. Experimental results on real-world gene datasets, for predicting the functions and localization of proteins, demonstrate the superiority of the proposed method over state-of-the-art baselines.

Finally, the UANE algorithm is proposed for node alignment across attributed networks. Given multiple attributed networks, UANE uses network embedding to map each attributed network independently into a low-dimensional space and learn effective node representations. The network embedding preserves specific structural regularities of the attributed networks, so nodes with similar structural regularities and attributes have similar embedding vectors. The algorithm takes the low-dimensional latent vectors of nodes as features and feeds them to a sigmoid layer to make predictions under the supervision of observed anchor links. The embedding of a node is trained jointly to predict both the anchor links and the node's context in the network. The effectiveness of the proposed model was evaluated on three realistic datasets.
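The abstract does not give the form of the hypergraph regularizer used in HRGM and GMHR. As a rough illustration only, the sketch below computes a widely used normalized hypergraph smoothness penalty (the form associated with Zhou et al.), which is an assumption here and not necessarily the dissertation's definition. The function name `hypergraph_smoothness` and its arguments are hypothetical.

```python
from math import sqrt


def hypergraph_smoothness(hyperedges, weights, scores):
    """Normalized hypergraph smoothness penalty (assumed form, not from the source).

    hyperedges: list of node-id lists, one list per hyperedge
    weights:    one positive weight per hyperedge
    scores:     dict mapping every node id to a real-valued label score
    """
    # Node degree d(v): total weight of the hyperedges that contain v.
    degree = {v: 0.0 for v in scores}
    for edge, w in zip(hyperedges, weights):
        for v in set(edge):
            degree[v] += w

    penalty = 0.0
    for edge, w in zip(hyperedges, weights):
        nodes = sorted(set(edge))
        size = len(nodes)  # hyperedge degree delta(e)
        # Each unordered pair is visited once, which matches the usual
        # 1/2 * (sum over ordered pairs) convention.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                diff = scores[u] / sqrt(degree[u]) - scores[v] / sqrt(degree[v])
                penalty += w / size * diff * diff
    return penalty


# Toy example: one hyperedge groups three instances, another groups two.
edges = [[0, 1, 2], [2, 3]]
edge_weights = [1.0, 1.0]
disagreeing = {0: 1.0, 1: 1.0, 2: 0.0, 3: 0.0}
print(hypergraph_smoothness(edges, edge_weights, disagreeing))
```

The penalty grows when instances that share a hyperedge receive different degree-normalized scores. Adding such a term to a training objective pushes linked instances, labeled or not, toward consistent predictions, which is the label-propagation effect the abstract describes.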
Keywords/Search Tags: attributed network, semi-supervised learning, multi-label classification, node alignment