Graph-structured data is ubiquitous in scenarios such as social networks, academic networks, and e-commerce. Graph machine learning algorithms can model, analyze, and leverage complex graph structure information. However, in real-world applications, graph data exhibits characteristics such as label scarcity, heterogeneity, and temporality, which pose unique challenges for the design of graph machine learning algorithms. In addition, as the scale of graph data grows rapidly, graph machine learning algorithms also face computational efficiency problems.

We first propose eTrust, a factor graph model that tackles label scarcity. This method integrates unlabeled graph structure information into model training and achieves good results on the relationship prediction task. To extend eTrust to large-scale graph data, we analyze its performance bottleneck, design a simplification strategy, and propose the eTrust-s model, which is more than 1,000 times more efficient than eTrust. To improve the performance of graph machine learning models in heterogeneous scenarios, we propose the GATNE model. The model leverages the different types of graph structure information in heterogeneous graphs and fuses them through an attention mechanism. We train GATNE with a heterogeneous random walk strategy and obtain a representation of each node under each type of graph structure information. GATNE can handle both transductive and inductive embedding learning tasks. Because graph data in practical applications is constructed sequentially, we propose the ComiRec model to further improve the ability of graph machine learning models to capture real graph data. This model comprehensively leverages a user's temporal interaction behaviors and computes multiple node representation vectors for each user. In addition, ComiRec can control the trade-off between prediction accuracy and diversity through an adjustable hyperparameter.

In addition to addressing challenges at the algorithm level, we propose CogDL, a comprehensive graph machine learning framework, to tackle system-level issues such as the efficiency of sparse computation. To exploit the sparse nature of graph-structured data, CogDL uses dedicated, optimized sparse operators that improve the efficiency of parallel computing and achieve significant performance improvements on different types of GPUs. CogDL unifies the training process of graph machine learning models and further provides training optimization techniques, including mixed-precision training and activation-compressed training.

Finally, we take the Alibaba recommender system and AMiner as examples to show how the CogDL framework and the algorithms proposed in this paper can be applied.