| A code vulnerability is a defect in software source code, usually caused by errors in the software’s design, development, or configuration. With the rapid development of the Internet, code vulnerabilities have become increasingly complex and diverse. They expose software to malicious exploitation, which poses serious challenges to its secure and reliable use. Efficient and rapid detection of code vulnerabilities has therefore become an increasingly urgent need. The rapid development and widespread application of deep learning have provided new research approaches for many fields, and deep learning-based code vulnerability detection is becoming a popular research direction. Existing deep learning-based approaches generate representation matrices and vectors from source code vulnerability data and then use them to detect code vulnerabilities. Although these approaches have achieved certain results, they still have shortcomings. First, most of them require data labels during training, and in the increasingly complex Internet environment labeled source code data is difficult to obtain. Second, some approaches abstract code into graph data and then use graph neural network encoders for vulnerability detection; however, these encoders do not make full use of the information in the graph and fail to achieve sufficient and effective interaction and fusion between local information (node representations) and global information (graph representations), which limits detection performance. To address these issues, this paper proposes two approaches based on graph contrastive learning: VDGCL, an unsupervised function-level code vulnerability detection approach, and VDBGRL, a statement-level code vulnerability detection approach, exploring the breadth and precision of vulnerability detection 
respectively. VDGCL can detect various code vulnerabilities and locate the function in which a vulnerability occurs, while VDBGRL can detect statements related to a vulnerability and rank them by their relevance to it. Both approaches make full use of the logical relationships in the code and achieve sufficient and effective interaction and fusion between local information (node representations) and global information (graph representations). They can be trained without data labels, which makes them more widely applicable than supervised vulnerability detection approaches. Experimental results show that VDGCL and VDBGRL perform comparably to existing code vulnerability detection approaches on function-level and statement-level detection tasks, and outperform them on some metrics. This demonstrates the effectiveness of graph contrastive learning in function-level and statement-level code vulnerability detection. The main contributions of this paper are as follows: (1) VDGCL, a function-level code vulnerability detection approach based on graph contrastive learning. VDGCL aims to detect the functions that contain code vulnerabilities through unsupervised learning, without the need for data labels. It consists of two stages: graph construction, and detection model training and testing. In the graph construction stage, Joern is used to build an abstract syntax tree, to which multiple edges are added to form a graph; Word2vec is then used to generate the node representations in the graph. In the detection model training and testing stage, this paper introduces the idea of graph contrastive learning into code vulnerability detection. VDGCL feeds the constructed graph into the detection model, maximizes the consistency between node representations and graph representations across different views, and finally outputs the vulnerable functions and the types of their vulnerabilities. Experiments 
have verified the effectiveness of graph contrastive learning in function-level code vulnerability detection. (2) VDBGRL, a statement-level code vulnerability detection approach based on graph contrastive learning. VDBGRL detects statements related to vulnerabilities and ranks them by relevance. It consists of two stages: code property graph generation, and detection model training and evaluation. In the code property graph generation stage, VDBGRL uses Joern to construct a code property graph, generates a representation for each function and each statement, and then merges the function representations into the statement representations. In the training and evaluation stage, subgraphs of the code property graph are fed into the detection module, the agreement between the node representations of the two views is maximized, and the detection model is trained through unsupervised learning. Finally, VDBGRL outputs the statements in the source code that may be related to vulnerabilities and ranks them by relevance. Experiments demonstrate the effectiveness of graph contrastive learning in statement-level code vulnerability detection. (3) An efficient code vulnerability detection system based on graph contrastive learning, built on VDGCL and VDBGRL. The system supports file uploading, selection of the graph type and detection approach, real-time tracking of detection progress, and display of detection results. It also provides a user-friendly web interface that makes the system more convenient to use. |
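
The idea of maximizing the consistency between node representations and graph representations across views can be illustrated with a small sketch. The abstract does not state the exact objective, so everything below is an assumption made for illustration: the mean-pooling readout, the dot-product scoring, the binary cross-entropy form of the loss (in the style of Deep Graph Infomax), the use of corrupted node vectors as negatives, and all function names (`mean_pool`, `cross_view_loss`, and so on) are hypothetical and are not taken from VDGCL or VDBGRL.

```python
import math

def mean_pool(node_vecs):
    """Graph representation as the mean of node vectors (assumed readout)."""
    dim = len(node_vecs[0])
    return [sum(v[i] for v in node_vecs) / len(node_vecs) for i in range(dim)]

def dot(a, b):
    """Dot-product score between a node vector and a graph vector (assumed scorer)."""
    return sum(x * y for x, y in zip(a, b))

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def cross_view_loss(nodes_a, nodes_b, corrupt_a, corrupt_b):
    """Contrast the nodes of each view against the graph summary of the other view.

    Real nodes paired with the other view's summary are positives; corrupted
    nodes paired with the same summary are negatives. Lower loss means higher
    node-graph agreement across the two views.
    """
    g_a, g_b = mean_pool(nodes_a), mean_pool(nodes_b)
    total, count = 0.0, 0
    pairs = ((nodes_a, corrupt_a, g_b), (nodes_b, corrupt_b, g_a))
    for nodes, corrupt, g_other in pairs:
        for h in nodes:      # positive pairs
            total -= log_sigmoid(dot(h, g_other))
            count += 1
        for h in corrupt:    # negative pairs
            total -= log_sigmoid(-dot(h, g_other))
            count += 1
    return total / count

# Toy usage: two views whose node vectors agree, plus corrupted negatives.
view_a = [[1.0, 0.0], [1.0, 0.0]]
view_b = [[1.0, 0.0], [1.0, 0.0]]
negatives = [[-1.0, 0.0]]
loss = cross_view_loss(view_a, view_b, negatives, negatives)
```

In a trained model the node vectors would come from a graph neural network encoder applied to each view, and this loss would be minimized by gradient descent; the sketch only shows how the cross-view node-graph pairs are scored.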