Research And Implementation Of C-Language Vulnerability Static Detection Based On Flow-analysis And Graph Neural Networks

Posted on: 2022-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: X Cheng
Full Text: PDF
GTID: 2518306341982309
Subject: Cyberspace security
Abstract/Summary:
In recent years, cyberspace security incidents have become more frequent, and most cyber attacks stem from risks in the source code of software itself. Property losses caused by source code security issues have also increased significantly. Given the explosive growth of such issues, how to detect source code vulnerabilities in a timely and accurate manner has become a hot research question. Traditional static code detection tools are inefficient at detecting bugs in increasingly complex modern software, especially high-level, abstract logical vulnerabilities that lack precise vulnerability specifications.

The development of machine learning, especially deep learning techniques, has opened a path toward automatic and accurate vulnerability detection. However, most existing deep-learning-based research on vulnerability detection directly uses simple code features (e.g., program text) and applies traditional neural networks such as CNNs and RNNs, and therefore suffers from low detection precision. This thesis proposes innovations in two areas, code feature extraction and neural network design, and effectively combines traditional program analysis with deep learning. It uses state-of-the-art graph neural networks to embed the structural semantics of source code in support of precise vulnerability detection.

The thesis first proposes a detection model, VGDetector, based on the control flow graph and a graph convolutional neural network. It uses the control flow graph to represent the program's execution order, thereby characterizing the program's execution logic, and leverages the graph convolutional neural network to learn the features of the graph's nodes and edges. The trained model can be used for automatic source code classification. To preserve deeper program semantics (data flow) and perform more fine-grained detection (code lines), the thesis further proposes a graph-neural-network-based detection model, DeepWukong, that uses flow analysis and code slicing. It applies code slicing techniques to retain the control flow and data flow related to the vulnerability, thereby embedding the code more accurately, and the resulting model can be used for the automatic classification of source code fragments.

The two detection models are evaluated on the ten most common C/C++ vulnerabilities. We collected and labeled 105,428 real programs and chose four popular static detection tools and three state-of-the-art deep-learning-based vulnerability detection systems as baselines. The experimental results show that VGDetector and DeepWukong significantly improve static detection of C-language vulnerabilities and can be effectively applied to real-world vulnerability detection.
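
The slicing and graph-embedding ideas described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the toy C fragment, the edge lists, and the function names (`reachable`, `slice_around`, `mean_aggregate`) are all hypothetical, and the aggregation step is a plain neighbour average with no learned weights, standing in for a real graph neural network layer. It only shows how a slice around a point of interest keeps the statements connected to it by control or data dependence, and how node features can then be mixed along the retained edges.

```python
from collections import deque

# Toy graph for a hypothetical C fragment (node ids are statement numbers):
#   1: char buf[8];
#   2: int n = read_len();
#   3: if (n > 0)
#   4:     memcpy(buf, src, n);   <- point of interest (risky API call)
#   5: log("done");
# Both edge lists are invented for illustration (assumption, not thesis data).
CONTROL_EDGES = [(3, 4)]                 # 4 executes only when 3 holds
DATA_EDGES = [(1, 4), (2, 3), (2, 4)]    # definition -> use of buf and n


def reachable(start, edges):
    """Return every node reachable from start along directed edges (BFS)."""
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, []).append(dst)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in succ.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def slice_around(point, edges):
    """Union of the forward and backward slice from one point of interest."""
    forward = reachable(point, edges)
    backward = reachable(point, [(dst, src) for src, dst in edges])
    return sorted(forward | backward)


def mean_aggregate(features, nodes, edges):
    """One simplified message-passing step: average a node with its neighbours.

    Edges are treated as undirected and every node gets a self-loop.
    There are no trainable weights, so this is only a stand-in for a GNN layer.
    """
    neigh = {n: {n} for n in nodes}
    for src, dst in edges:
        if src in neigh and dst in neigh:   # ignore edges leaving the slice
            neigh[src].add(dst)
            neigh[dst].add(src)
    out = {}
    for n in nodes:
        dim = len(features[n])
        out[n] = [
            sum(features[m][k] for m in neigh[n]) / len(neigh[n])
            for k in range(dim)
        ]
    return out


edges = CONTROL_EDGES + DATA_EDGES
sliced = slice_around(4, edges)
print(sliced)  # [1, 2, 3, 4]  (statement 5 is unrelated and dropped)

# Hypothetical one-dimensional node features, one per retained statement.
features = {n: [float(n)] for n in sliced}
print(mean_aggregate(features, sliced, edges))
```

In this sketch, statement 5 falls outside the slice because no control or data edge connects it to the `memcpy` call, which mirrors the abstract's point that slicing keeps only the flow related to the potential vulnerability before the fragment is embedded and classified.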
Keywords/Search Tags: Source code vulnerabilities, Program analysis, Deep learning, Graph neural network