Code Context Representation Learning For Vulnerability Detection

Posted on:2022-01-08

Degree:Master

Type:Thesis

Country:China

Candidate:X C Zhang

GTID:2518306572960089

Subject:Software engineering

Abstract/Summary:

With the rapid development of computer software, the number of software vulnerabilities is also growing quickly, and vulnerability repair is becoming increasingly important. Traditional code review places high demands on the expertise of software practitioners, and as software grows in scale, code review alone can no longer meet the requirements of vulnerability inspection. Rule-based automatic vulnerability checking relies on rules defined by experts to check the code. Traditional machine learning methods require manual feature extraction to check for vulnerabilities. In recent years, the development of deep learning has opened a new research direction for vulnerability detection. However, existing research has several problems, such as incomplete use of code structure information, weak extraction of global code information, and a lack of focus on local information.

To address these problems, this thesis proposes a code context representation learning method for vulnerability detection. The specific work is as follows. First, we extract the abstract syntax tree (AST) of the source code, apply program slicing and bring in cross-function call code, and extract long paths from the AST. We also extract the control flow graph (CFG) and program dependency graph (PDG) of the source code and the cross-function call code, and use the node2vec algorithm to generate representation vectors for the nodes of the CFG and PDG. Then, we propose a vulnerability detection model based on context representation. The model uses a sequential neural network, BiLSTM, to learn the representation vector of each long path, which is called the local context representation, and uses node2vec to learn the graph representation vectors of the CFG and PDG. Following the node order of the long path, the graph representation vectors are embedded and fused with the local context representation vectors to obtain the global context representation vector. A self-attention mechanism then weights the global context representation so that vulnerability-related vectors receive higher weight, and the weighted global context representation is fed into a fully connected layer for vulnerability detection.

The experimental results show that this method detects vulnerabilities better than existing vulnerability detection methods, and that its F1 score on the real-world FFmpeg dataset is 2.8% higher.
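The fusion and weighting steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the vectors are fused or how attention is computed, so the sketch assumes fusion by concatenation and replaces the self-attention mechanism with a simpler attention pooling driven by a single query vector. The BiLSTM and node2vec outputs are stood in for by small hand-written vectors, and all names (`fuse`, `attention_pool`, `classify`, `query`) and all numbers are hypothetical.

```python
import math


def fuse(local_vecs, graph_vecs):
    """Concatenate each path node's local vector with its graph vector.

    Both lists are assumed to follow the node order of the long path.
    Concatenation is an assumption; the abstract only says "fused".
    """
    if len(local_vecs) != len(graph_vecs):
        raise ValueError("need exactly one graph vector per path node")
    return [local + graph for local, graph in zip(local_vecs, graph_vecs)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Score each vector by its dot product with `query`, then average.

    This single-query pooling is a simplified stand-in for self-attention.
    Returns the weighted sum and the attention weights.
    """
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors))
              for d in range(dim)]
    return pooled, weights


def classify(pooled, weight, bias):
    """One fully connected unit with a sigmoid: a vulnerability score in (0, 1)."""
    z = sum(p * w for p, w in zip(pooled, weight)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical stand-ins for BiLSTM outputs (local context), one per path node.
local = [[0.1, 0.2], [0.9, 0.4], [0.3, 0.3]]
# Hypothetical stand-ins for node2vec embeddings of the same nodes.
graph = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

fused = fuse(local, graph)                 # global context, 4 dims per node
query = [1.0, 0.0, 1.0, 0.0]               # would be learned in a real model
pooled, weights = attention_pool(fused, query)
score = classify(pooled, [0.5, -0.5, 0.5, -0.5], 0.0)
print(weights, score)
```

In a trained model the query vector and the classifier parameters would be learned jointly with the BiLSTM, so that nodes related to a vulnerability end up with the largest attention weights.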

Keywords/Search Tags:

Software vulnerability detection, Deep learning, Code context representation, Program slicing, Long path

Related items

1	Software Vulnerability Detection Method Based On Code Semantic Vector Representation And Deep Learning
2	Research On Software Buffer Overflow Vulnerability Detection Method Based On Deep Learning
3	Research On Vulnerability Detection Techniques Of Binary Program Based On Program Analysis And Testing
4	Research And Implementation Of Software Source Code Security Vulnerability Representation Learning And Detection Technology
5	Program Vulnerability Detection Through Learning On Code Text And Control Structure
6	An Approach For Using Deep Learning To Detect Code Vulnerabilities
7	Research On Security Detection Of Open Source Software For Source Code
8	The Study And Implementation Of Software Vulnerability Detection Based On Large-scale Open Source Repositories
9	Research On Software Vulnerability Detection Method Based On Code Property Graph And Graph Convolutional Neural Network
10	Research On Software Vulnerability Prediction Method Based On Deep Transfer Learning