Research On Software Vulnerability Detection Method Based On Code Property Graph And Graph Convolutional Neural Network

Posted on: 2021-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Duan
Full Text: PDF
GTID: 2428330614950008
Subject: Computer Science and Technology
Abstract/Summary:
Open source software is now widely used, and software vulnerabilities are defects or weaknesses introduced during software design and development that malicious attackers can easily exploit. As the first layer of software security inspection, detecting vulnerabilities early in source code and patching them promptly can reduce the losses they cause. However, traditional source code review depends largely on the reviewers' understanding and long-accumulated experience, and it cannot keep up with the needs of vulnerability detection as code size grows. Vulnerability detection based on machine learning avoids the reliance of rule-based methods on experts who write detection rules by hand, but it still requires manual extraction of vulnerability features. In recent years, advances in deep learning across various fields have opened a new direction for vulnerability detection. However, existing methods often ignore the structural information in the intermediate representation of the source code, and the deep neural network models they use are largely the same as those used for natural language problems.

To address these problems, this thesis proposes a vulnerability detection method based on the code property graph and a graph convolutional neural network. First, we analyze the source code to generate the corresponding code property graph and extract the vulnerability-related graph structure according to edge type. We then use program slicing to generate program slices from specified vulnerability key points, which simplifies the graph by retaining only the structure related to those key points. This both reduces the noise that unrelated branches introduce into vulnerability detection and speeds up model training. Next, we abstract the extracted graph structure into several files so that the subsequent graph neural network model can read it.

We convert the software vulnerability detection problem into a classification problem over source code graph structures, realize end-to-end learning of attribute information and structural information, and give the corresponding formal definition. We use graph convolutional neural network models for graph representation learning to capture both the local and the global information of graphs. Representation learning is divided into learning vector representations of nodes and learning a vector representation of the whole graph. The node representation learning phase introduces code content as node attribute information to improve the quality of the learned node representations. For the whole-graph representation learning phase, we propose an algorithm for splitting subgraphs and a READOUT model based on subgraph self-attention. We also design a READOUT model based on node attention and a READOUT model that combines subgraph self-attention with node attention, and evaluate them experimentally. Experiments show that the proposed graph representation of code and the corresponding end-to-end feature learning network achieve good results for vulnerability detection on large-scale data. Finally, for the convenience of developers, we encapsulate each module, build a graphical interface, design and implement a prototype system, and test its functions.
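The edge-type filtering and key-point slicing steps described above can be illustrated with a small sketch. This is not the thesis's implementation: the edge labels (`AST`, `CFG`, `DDG`, `CDG`), the choice to keep only data and control dependence edges, the combined backward and forward slice, and all function names are assumptions made for illustration.

```python
from collections import defaultdict, deque

# Assumed edge labels; the abstract does not say which edge types are kept.
DEPENDENCE_EDGES = {"DDG", "CDG"}  # data dependence, control dependence


def filter_edges(edges, keep=DEPENDENCE_EDGES):
    """Keep only (src, dst, label) edges whose label is in `keep`."""
    return [(s, d, l) for (s, d, l) in edges if l in keep]


def reachable(start, adjacency):
    """Breadth-first search: every node reachable from `start`, inclusive."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def slice_graph(edges, key_point):
    """Return (nodes, edges) of the slice around `key_point`.

    The slice is the union of the backward slice (nodes the key point
    depends on) and the forward slice (nodes that depend on the key point),
    computed over dependence edges only.
    """
    dep = filter_edges(edges)
    fwd, bwd = defaultdict(list), defaultdict(list)
    for s, d, _ in dep:
        fwd[s].append(d)
        bwd[d].append(s)
    nodes = reachable(key_point, bwd) | reachable(key_point, fwd)
    kept = [(s, d, l) for (s, d, l) in dep if s in nodes and d in nodes]
    return nodes, kept


# Hypothetical toy graph: n3 plays the role of a vulnerability key point,
# for example a call to an unsafe library function.
EXAMPLE_EDGES = [
    ("n1", "n2", "AST"),
    ("n1", "n3", "DDG"),
    ("n2", "n3", "CDG"),
    ("n3", "n4", "DDG"),
    ("n4", "n5", "CFG"),
    ("n5", "n6", "DDG"),  # unrelated branch, dropped by the slice
]

if __name__ == "__main__":
    nodes, kept = slice_graph(EXAMPLE_EDGES, "n3")
    print(sorted(nodes))
    print(kept)
```

In this toy graph the slice keeps `n1` through `n4` and drops the `n5`, `n6` branch, which is the kind of pruning of unrelated branches the abstract refers to. A real pipeline would take its nodes and edges from a code property graph generator rather than a hand-written list.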
Keywords/Search Tags: Software Vulnerability Detection, Graph Neural Network, Intermediate Representation, Representation Learning