Graph neural networks (GNNs) are effective tools for modeling graph data. However, when performing node classification, GNNs struggle to learn global graph structure and node representations from only a small number of labeled nodes. To address this issue, contrastive learning has been introduced to GNNs: it learns node representations by comparing node similarities and provides additional supervision to the model. However, existing contrastive learning methods for GNNs are typically trained in an unsupervised manner and rely on complex model structures and data augmentation techniques. To this end, this paper proposes a semi-supervised learning algorithm on graphs called Contrastive Learning with Pseudo-Labels (CLPL). Building on the idea of contrastive learning, the algorithm introduces pseudo-label techniques from semi-supervised learning and uses a designed pseudo-label proxy task to guide representation learning on unlabeled data, thereby addressing the challenge of insufficient supervision in graph semi-supervised learning. The main work of this paper is as follows:

1. This paper proposes CLPL, a graph semi-supervised learning algorithm that uses pseudo-labels for contrastive learning. Inspired by contrastive learning, a semi-supervised contrastive loss is added to the graph neural network. This loss maximizes the mutual information between the representations of nodes with the same label, thereby improving node representations and, in turn, performance on the node classification task. For unlabeled nodes, computing the loss depends on the model's predictions. To improve the quality of these predictions, K-fold data augmentation is applied to the input features; the resulting K feature matrices serve as the model's K inputs, producing K predictions for each unlabeled node. These K predictions are averaged and then sharpened to obtain the final prediction for each unlabeled node, which is recorded as its prediction target. In addition, to enhance the model's generalization ability, the model's predicted results are constrained to be consistent with the prediction targets: the cross-entropy loss between the predicted results and the prediction targets is incorporated into the loss function as a regularization term.

2. To ensure that the model's learning on unlabeled data benefits node classification and to avoid overfitting or underfitting, this paper proposes three methods for computing the weight of the contrastive loss, as variants of CLPL; all of them perform well on node classification tasks. To compare and analyze these variants, quantitative analyses and timing experiments were conducted, and ablation experiments were performed to analyze the contribution of each component of the model.

3. CLPL is not only a semi-supervised learning algorithm for node classification on graphs; it also serves as a regularization framework independent of any specific graph neural network model and can therefore be applied to various GNNs. Specifically, the consistency loss and contrastive loss proposed in this paper can be taken as regularization terms that stably improve both the node classification performance and the generalization ability of a model. To validate the effectiveness of the proposed algorithm as a regularization framework, five classical baseline models were chosen and combined with the proposed algorithm, and node classification experiments were conducted on six real-world datasets.
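The pseudo-label generation described in point 1 (K augmented predictions, averaged and then sharpened into a prediction target) can be sketched as follows. The temperature-based sharpening, the value T = 0.5, and K = 4 are illustrative assumptions; the text above does not fix these specifics:

```python
import numpy as np

def sharpen(p, T=0.5):
    # Raise each class probability to the power 1/T and renormalize;
    # T < 1 pushes the distribution toward a one-hot pseudo-label.
    p = p ** (1.0 / T)
    return p / p.sum(axis=-1, keepdims=True)

def prediction_targets(predict, features, augment, K=4, T=0.5):
    # Run the model on K augmented copies of the input feature matrix,
    # average the K predictions per node, then sharpen the average.
    preds = np.stack([predict(augment(features)) for _ in range(K)])
    return sharpen(preds.mean(axis=0), T)  # shape: (num_nodes, num_classes)
```

Averaging over augmented views reduces the variance of the per-node prediction, while sharpening lowers its entropy so that the target is more informative as a pseudo-label.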
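The semi-supervised contrastive loss in point 1, which pulls together representations of nodes sharing a (true or pseudo) label, could look roughly like the sketch below. The cosine-similarity formulation and the temperature tau follow the general style of supervised contrastive losses and are assumptions, not necessarily the paper's exact formula:

```python
import numpy as np

def label_contrastive_loss(z, labels, tau=0.5):
    # z: (num_nodes, dim) node representations; labels: (num_nodes,)
    # true labels for labeled nodes, pseudo-labels for unlabeled ones.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize rows
    sim = z @ z.T / tau                                # scaled cosine sims
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)            # exclude self-pairs
    # log-softmax of each row over all other nodes
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    # negative mean log-probability over same-label (positive) pairs
    per_node = -np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return per_node.mean()
```

Minimizing this loss increases the similarity (a proxy for mutual information) between representations of same-label nodes relative to all other pairs, which is the effect described above.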
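The consistency regularization from point 1 and the contrastive-loss weighting discussed in point 2 could be combined roughly as below. The cross-entropy consistency term follows the description above; the sigmoid ramp-up used for the weight is a generic scheme common in semi-supervised learning, shown purely for illustration, and is not claimed to be one of the paper's three weighting variants:

```python
import numpy as np

def consistency_loss(pred, target, eps=1e-12):
    # Cross-entropy between the model's predictions on unlabeled nodes
    # and their (sharpened) prediction targets.
    return -(target * np.log(pred + eps)).sum(axis=1).mean()

def rampup_weight(epoch, max_weight=1.0, rampup_epochs=30):
    # Generic sigmoid-shaped ramp-up: the unsupervised weight grows
    # from near 0 to max_weight over the first rampup_epochs epochs.
    t = np.clip(epoch / rampup_epochs, 0.0, 1.0)
    return max_weight * np.exp(-5.0 * (1.0 - t) ** 2)

def total_loss(sup_loss, cons_loss, contr_loss, epoch):
    # Supervised loss plus weighted consistency and contrastive terms.
    w = rampup_weight(epoch)
    return sup_loss + w * cons_loss + w * contr_loss
```

Ramping the unsupervised weight up over training is one standard way to keep unreliable early pseudo-labels from dominating the loss, which matches the stated goal of avoiding over- and underfitting on unlabeled data.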