Research On Source Code Vulnerability Detection Based On Deep Learning

Posted on: 2022-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: H J Zhang
GTID: 2518306764976809
Subject: Automation Technology
Abstract/Summary:
With the rapid development of computer technology, the amount of software has grown explosively. Most software has a short development cycle and is not developed to a consistent standard, so the number of software vulnerabilities keeps rising. Current vulnerability detection techniques suffer from high false negative rates, high false positive rates, heavy dependence on domain expert knowledge, coarse detection granularity, and difficulty in locating vulnerabilities, and these problems urgently need to be solved.

To address them, this thesis proposes a source code vulnerability detection and localization method based on the associated code property graph and the graph attention network (GAT). The associated code property graph is an improved code property graph proposed in this thesis. It enables cross-function detection by linking the code property graph of a calling function with that of the function it calls. Because the associated code property graphs of vulnerable and non-vulnerable code differ, this thesis recasts source code vulnerability detection as a graph classification problem over associated code property graphs and builds a GAT-based vulnerability detection model, DMGGAT. The model uses the Vec CPG vectorization method to vectorize the associated code property graph: lexical, syntactic, and semantic information is extracted from the source code to form a feature matrix, and the AST, CFG, and DDG edges are extracted to form three adjacency matrices. DMGGAT consists of three multi-head GAT layers and a classification module. Each multi-head GAT layer takes a different adjacency matrix as input and updates the node feature vectors while reducing their dimensionality; the result is then passed to the classification module, which performs binary graph classification.

The attention mechanism also yields the average attention value that the model assigns to each node during detection, and nodes with a high average attention value are the ones that matter most for graph classification. Building on this idea, this thesis constructs an attention-based vulnerability localization model, LDMGGAT. LDMGGAT locates vulnerabilities by deriving the normal interval of node attention values from the attention matrix of each layer using the interquartile range (IQR) algorithm.

This thesis selects CWE121, CWE122, CWE369, CWE416, and CWE476 from the Juliet dataset for the vulnerability detection and localization experiments. In the detection experiments, the DMGGAT model achieves F1 scores of 95.99%, 90.36%, 95.86%, 96%, and 96.62% on the five CWE vulnerability types, respectively. In the localization experiments, the LDMGGAT model achieves an accuracy of over 80% on all five CWE vulnerability types. These experiments validate the effectiveness of both the DMGGAT detection model and the LDMGGAT localization model. Finally, this thesis implements and tests a prototype system for deep-learning-based source code vulnerability detection.
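The IQR-based localization idea in the abstract can be illustrated with a small sketch. This is not the thesis's LDMGGAT implementation. It assumes that each layer exposes a square attention matrix with `attn[i][j]` being the attention node `i` pays to node `j`, that a node's score is the attention it receives averaged over source nodes and then over layers, and that nodes above the conventional upper fence `Q3 + 1.5 * IQR` are reported as candidate vulnerability locations. All function names, the averaging scheme, and the 1.5 multiplier are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantile(sorted_vals: List[float], q: float) -> float:
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def normal_interval(values: List[float], k: float = 1.5) -> Tuple[float, float]:
    """Return the IQR 'normal' interval [Q1 - k*IQR, Q3 + k*IQR]."""
    s = sorted(values)
    q1, q3 = quantile(s, 0.25), quantile(s, 0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def mean_received_attention(attn: Matrix) -> List[float]:
    """Average attention each node receives (column means of attn)."""
    n = len(attn)
    return [sum(attn[i][j] for i in range(n)) / n for j in range(n)]


def suspicious_nodes(layer_attn: List[Matrix]) -> List[int]:
    """Indices of nodes whose layer-averaged attention exceeds the upper fence."""
    per_layer = [mean_received_attention(a) for a in layer_attn]
    n = len(per_layer[0])
    avg = [sum(layer[j] for layer in per_layer) / len(per_layer) for j in range(n)]
    _, upper = normal_interval(avg)
    return [j for j, v in enumerate(avg) if v > upper]


if __name__ == "__main__":
    # Toy graph with 5 nodes: every node attends mostly to node 4.
    row = [0.1, 0.1, 0.1, 0.1, 0.6]
    attention = [[list(row) for _ in range(5)]]  # a single layer
    print(suspicious_nodes(attention))
```

In this toy input, node 4 receives far more attention than the others, so it is the only node outside the normal interval. In a real pipeline the flagged node indices would then be mapped back to the source lines of the corresponding graph nodes.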
Keywords/Search Tags: Vulnerability Detection, Code Property Graph, Graph Attention Network, Attention Mechanism