Research And Implementation Of Security Vulnerability Automatic Detection System On Source Code

Posted on: 2022-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
GTID: 2518306332967119
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet in the 21st century, computer software plays an important role in daily life. However, much of this software contains security vulnerabilities, which may lead to serious consequences such as information loss, information leakage, and system failure. Mining vulnerabilities in source code efficiently and automatically is therefore an important research topic.

At present, vulnerability mining methods fall into four main groups: manual audit, static analysis, dynamic analysis, and machine learning. The existing methods have the following problems. Manual audit consumes a lot of human resources, which makes code maintenance expensive. Many static and dynamic methods identify vulnerabilities with fixed rules, or screen for them by executing code to cover dynamic paths; their detection accuracy is not high, they cannot identify new vulnerabilities, and they produce many false positives and false negatives. Some machine learning methods can learn on their own, but the models have limitations, depend heavily on data, and do not detect vulnerabilities well.

To address the low accuracy and heavy data dependence of vulnerability detection and to improve the performance of source code vulnerability mining, this thesis collects sufficient training data from the commit records of open source projects on GitHub and combines a word vector model (Word2Vec), a BiLSTM (Bi-directional Long Short-Term Memory) network, and an attention mechanism. It also designs and implements an automatic detection system for source code security vulnerabilities (VDS). The core content of the thesis covers three aspects:

(1) Few public datasets exist today for vulnerability mining in Python. Based on the commit records of open source projects on GitHub, this thesis filters, preprocesses, and labels datasets covering seven different vulnerability types. These datasets are used for model training and testing in this thesis, with the aim of improving the generalization ability of the model.

(2) This thesis proposes a deep learning model for source code vulnerability mining that integrates a Word2Vec module, a BiLSTM module, and an attention module. The Word2Vec module is trained on vulnerability-free code to obtain high-quality vector representations of code tokens. The BiLSTM module extracts robust sequential features. The attention module dynamically weights the features produced by the BiLSTM module, which strengthens the feature representation and significantly improves accuracy. The whole model is trained on code that contains vulnerabilities, and the experimental results show an accuracy of 96.5%. In addition, the model works at the code token level, so it can locate vulnerabilities within the code at a fine granularity.

(3) This thesis designs and implements the VDS software system on the Django framework. It includes a visual user interface and multiple functional modules (input, storage, management, vulnerability detection), so that VDS can provide a one-stop vulnerability detection service.
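The dynamic weighting performed by an attention module can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the dot-product scoring, the fixed query vector, and the sample feature values are all assumptions made for illustration. In a trained model the per-token features would come from the BiLSTM and the query would be learned.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: turns raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    token_features: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Weight each token's feature vector by its similarity to `query`.

    Returns the per-token attention weights and the weighted sum of features.
    Dot-product scoring is an assumption, not the thesis's stated method.
    """
    # One score per token: dot product between its features and the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in token_features]
    weights = softmax(scores)
    # Weighted sum over tokens, computed one feature dimension at a time.
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, token_features))
        for i in range(len(query))
    ]
    return weights, pooled


if __name__ == "__main__":
    # Hypothetical 2-dimensional features standing in for BiLSTM outputs.
    tokens = ["cursor", "execute", "user_input", ")"]
    features = [[0.1, 0.0], [0.0, 0.1], [2.0, 1.0], [0.2, 0.1]]
    weights, pooled = attention_pool(features, query=[1.0, 1.0])
    for token, weight in zip(tokens, weights):
        print(f"{token:12s} {weight:.3f}")
```

Because the weights are computed per token, they also indicate which tokens contributed most to the pooled representation, which is one way a token-level model can point at a specific location in the code.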
Keywords/Search Tags:network security, vulnerability mining, Word2vec, BiLSTM, Attention mechanism