In the era of big data, pervasive interdependencies exist among data: interactions between users in social media, mutual citations among papers in scientific research, and the interplay between proteins in biological systems. This ubiquitous interdependence forms networks of big data across various domains. Graph-structured data, an efficient and prevalent form of data representation, effectively captures these correlations, making it especially well suited to such networks. As a type of non-Euclidean data, graphs exhibit complex interdependencies that impose higher demands on existing deep learning models and machine learning algorithms, thereby catalyzing the evolution of graph representation learning methodologies. These methodologies provide more effective tools for understanding and analyzing complex relational data. However, faced with the explosive growth of graph scale and increasingly intricate graph structures, traditional graph representation learning methods often exhibit limitations due to constraints in computational complexity and scalability. In this context, contrastive learning, a powerful self-supervised learning paradigm, has brought new perspectives to graph representation learning. By carefully designing contrastive tasks, contrastive learning can extract rich node-relationship information from graph data and yield more discriminative and generalizable representations. Contrastive learning has already made significant progress in graph representation learning, yet the field continues to face substantial challenges, such as designing contrastive tasks for different scenarios, addressing class imbalance, and enhancing model adaptability across domains. This article offers an in-depth examination of graph representation learning methods based on contrastive learning in scenarios ranging from
semi-supervised to unsupervised and class-imbalanced settings. The main research content and contributions of this paper are as follows:

(1) In the semi-supervised scenario, a label-guided graph contrastive learning framework is proposed to explore semantic-level feature similarity among nodes. Specifically, the framework leverages model predictions, which carry semantic information, during training to guide the sampling process in contrastive learning. Because the prediction accuracy on unlabeled nodes is uncertain, some erroneous positive nodes are inevitably sampled. To tackle this issue, a self-checking mechanism based on deep clustering is introduced to ensure the reliability of sampled positive nodes: node embeddings are clustered, and only node pairs whose clustering assignments are consistent with their pseudo-labels are treated as positive pairs, thereby enhancing the robustness of the method. Additionally, a reweighting strategy based on the probability distribution of anchor nodes is designed to amplify the impact of hard negative nodes. This strategy selectively emphasizes negative nodes, especially hard negatives, further improving the performance of contrastive learning. Experimental results on various graph benchmarks demonstrate the significant superiority of the label-guided graph contrastive learning algorithm.

(2) In the unsupervised scenario, a graph prototypical contrastive learning (GPCL) framework is proposed to lift the modeling of feature similarity from the instance level to the prototype level. Through a theoretical analysis, GPCL is framed as an online expectation-maximization algorithm: the framework iteratively performs online clustering and graph prototypical contrastive learning, gradually discovering and refining the underlying semantic structure of the data. Specifically, a graph prototypical contrastive loss function is introduced, which includes an instance-level contrastive loss to model instance-level feature
similarity, and a prototype-level contrastive loss to model prototype-level feature similarity. The prototype-level contrastive loss comprises two objectives: a representation-consistency objective and a clustering-consistency objective. The former learns representations with intra-class invariance and inter-class discriminability, while the latter encourages similar cluster assignments between instances and their augmentations. GPCL is evaluated by pre-training on large-scale unlabeled data and then fine-tuning on downstream tasks; the experimental results verify its superiority.

(3) In the class-imbalanced scenario, a prototypical graph contrastive learning method with structure-aware hard negative mining (PROCONE) is proposed. By introducing a prototypical contrastive loss, minority classes are encouraged to form distributions that are discriminable from majority classes. The method implicitly encodes the semantic structure of the data in the learned balanced feature space, forming clear decision boundaries. In addition, since minority nodes are easily misclassified into classes to which they have more edge connections, a reweighting mechanism is proposed that reweights negative nodes according to their local and global structural characteristics. The efficacy of PROCONE is corroborated by competitive results on various well-known class-imbalanced graph datasets.
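The label-guided positive sampling with its clustering-based self-check in contribution (1) can be sketched as follows. This is a minimal illustration, not the thesis implementation: plain k-means stands in for the deep-clustering step, and the function name and pairwise enumeration are assumptions of the sketch.

```python
import numpy as np

def select_positive_pairs(embeddings, pseudo_labels, n_clusters, seed=0):
    """Keep a candidate positive pair (i, j) only if the two nodes agree in
    BOTH pseudo-label and cluster assignment (the self-check).
    Toy k-means stands in for the deep-clustering step."""
    rng = np.random.default_rng(seed)
    n = len(embeddings)
    centers = embeddings[rng.choice(n, n_clusters, replace=False)]
    for _ in range(10):  # a few k-means iterations
        dists = ((embeddings[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for k in range(n_clusters):
            if (assign == k).any():
                centers[k] = embeddings[assign == k].mean(0)
    # Self-check: require agreement in both pseudo-label and cluster.
    positives = []
    for i in range(n):
        for j in range(i + 1, n):
            if pseudo_labels[i] == pseudo_labels[j] and assign[i] == assign[j]:
                positives.append((i, j))
    return positives
```

A node whose pseudo-label matches the anchor's but whose embedding falls in a different cluster is filtered out, which is exactly the reliability check described above.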
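The prototype-level contrastive objective underlying contributions (2) and (3) can be illustrated with a ProtoNCE-style loss, where each node's positive is its assigned cluster prototype and the remaining prototypes serve as negatives. The temperature value and normalization details below are assumptions of the sketch, not the exact losses of GPCL or PROCONE.

```python
import numpy as np

def prototype_contrastive_loss(z, prototypes, assignments, temperature=0.5):
    """Prototype-level InfoNCE: for each embedding z_i, the positive is its
    assigned prototype; all other prototypes act as negatives."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)            # L2-normalize embeddings
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = z @ p.T / temperature                              # (n_nodes, n_prototypes)
    logits -= logits.max(axis=1, keepdims=True)                 # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), assignments].mean()
```

Minimizing this term pulls each embedding toward its prototype and away from the others, which is how such losses encourage compact, well-separated class distributions in the feature space.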
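The structure-aware negative reweighting in contribution (3) might look like the following, where a negative node's weight grows with its local closeness to the anchor (direct edges and shared neighbors) and its global prominence (normalized degree). The exact combination rule here is a hypothetical choice for illustration.

```python
import numpy as np

def negative_weights(adj, anchor, negatives):
    """Weight each negative node by local structure (direct edge to the anchor
    plus shared neighbors) and global structure (normalized degree). Sketch only."""
    adj = np.asarray(adj, dtype=float)
    degrees = adj.sum(axis=1)
    w = []
    for j in negatives:
        local = adj[anchor, j] + (adj[anchor] * adj[j]).sum()  # edge + shared neighbors
        global_ = degrees[j] / degrees.max()                   # normalized degree
        w.append((1.0 + local) * global_)
    w = np.array(w)
    return w / w.sum()                                         # normalize to a distribution
```

Under this scheme, negatives that are structurally entangled with the anchor (the hard negatives that cause minority-node misclassification) receive larger weights and thus contribute more to the contrastive loss.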