
Research And Implementation Of Graph Representation Learning Algorithm Based On Automated Machine Learning

Posted on: 2023-11-25  Degree: Master  Type: Thesis
Country: China  Candidate: J W Sun  Full Text: PDF
GTID: 2530306914477434  Subject: Computer technology
Abstract/Summary:
Many real-world objects have a natural graph structure, and the study of graph data has a long history. With the advent of the digital era and the rapid development of the Internet, the volume of graph data has exploded, creating an urgent need for tools that can efficiently model graph-structured data and solve important real-world problems on graphs. As machine learning has matured, machine learning on graphs has become a focus of academic and industrial research. In particular, graph representation learning, the basis of graph machine learning, has attracted significant attention over the past decade. Although researchers have designed many graph representation learning algorithms that perform remarkably well on downstream tasks, these algorithms generally require manual tuning of hyper-parameters and model structure to reach their best performance. This process usually demands considerable human effort, material resources, and domain expertise, and leaves room for improvement.

In recent years, extensive research on automated machine learning has made it possible to address this problem. Automated machine learning builds a machine learning pipeline in a data-driven way, allowing a user to obtain the best model for a specific dataset and task. In fields such as computer vision and natural language processing, models obtained by automated machine learning already rival or exceed those designed by humans. Drawing on the idea of automated machine learning, this thesis aims to automatically design a graph representation learning algorithm for specific graph data, one that learns informative node representations for graph data of different types and scales and achieves remarkable performance on downstream tasks. The major work of this thesis comprises the following three points:

Firstly, automated machine learning on homogeneous graphs is studied, and an automated graph representation learning algorithm, AutoGRL, is proposed. It consists of two parts: a search space and an optimizer. The search space covers critical components of automated machine learning, namely data augmentation, feature engineering, hyper-parameter optimization, and neural architecture search. The optimizer uses a decision tree to search this space for the best graph representation learning algorithm for the given graph data, and a pruning strategy speeds up the search. Comparative and ablation experiments on downstream tasks such as node classification and link prediction show that AutoGRL processes homogeneous graphs effectively and efficiently.

Secondly, AutoGRL is extended to AutoGRLv2 to handle both heterogeneous and large-scale graph data. For heterogeneous graphs, the search space of AutoGRL is extended. For large-scale graphs, the optimizer of AutoGRL is improved and extended. Comparative and ablation experiments on heterogeneous and large-scale graph datasets demonstrate that these extensions enable AutoGRLv2 to process such graphs successfully.

Thirdly, an automated graph representation learning system is designed and developed to integrate the above algorithms. The system includes all modules needed to implement graph representation learning algorithms and provides a one-stop graph representation learning solution for graph data researchers.
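
The split into a search space and an optimizer with pruning can be illustrated with a minimal sketch. It is not the thesis's implementation. The abstract does not describe how the decision tree drives the search, so the sketch swaps in a simpler technique: random sampling followed by per-dimension pruning of the worst-scoring option. The option names in `SEARCH_SPACE`, the `toy_score` objective, and the `search` function are all hypothetical and chosen for illustration only.

```python
import random
from statistics import mean

# Hypothetical search space: one entry per component group named in the
# abstract. The concrete options are illustrative assumptions.
SEARCH_SPACE = {
    "augmentation": ["none", "drop_edge", "add_self_loops"],
    "features": ["raw", "degree", "one_hot"],
    "hidden_dim": [16, 64, 256],
    "layer_type": ["gcn", "gat", "sage"],
}


def toy_score(config):
    """Stand-in for validation accuracy; a real run would train a model."""
    score = 0.5
    score += 0.1 if config["augmentation"] == "drop_edge" else 0.0
    score += 0.1 if config["features"] == "degree" else 0.0
    score += 0.1 if config["hidden_dim"] == 64 else 0.0
    score += 0.1 if config["layer_type"] == "gat" else 0.0
    return score


def search(space, evaluate, rounds=3, samples_per_round=30, seed=0):
    """Random search with pruning (an assumed, simplified optimizer)."""
    rng = random.Random(seed)
    space = {key: list(options) for key, options in space.items()}  # copy
    best_config, best_score = None, float("-inf")
    for _ in range(rounds):
        trials = []
        for _ in range(samples_per_round):
            config = {key: rng.choice(options) for key, options in space.items()}
            score = evaluate(config)
            trials.append((config, score))
            if score > best_score:
                best_config, best_score = config, score
        # Pruning step: in each dimension, drop the option whose sampled
        # configurations had the lowest mean score in this round.
        for key, options in space.items():
            means = {}
            for option in options:
                scores = [s for c, s in trials if c[key] == option]
                if scores:
                    means[option] = mean(scores)
            if len(options) > 1 and len(means) > 1:
                options.remove(min(means, key=means.get))
    return best_config, best_score, space


if __name__ == "__main__":
    config, score, remaining = search(SEARCH_SPACE, toy_score)
    print("best config:", config)
    print("best score:", round(score, 3))
    print("remaining options:", remaining)
```

Each round shrinks the space, so later samples concentrate on options that scored well earlier. A tree-based optimizer such as the one the abstract mentions would presumably replace the per-dimension mean with a learned model of which regions of the space are promising.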
Keywords/Search Tags: graph representation learning, automated machine learning, node classification, link prediction