Research On Key Technology Of Vulnerability Mining Based On Deep Learning

Posted on: 2021-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: N Guo
GTID: 2518306308476814
Subject: Computer technology
Abstract/Summary:
With the development of the Internet, software plays an increasingly important role in networked systems. Nearly all information systems and commercial applications provide software-based services. For example, e-commerce, online banking, and travel services are all delivered as web pages or mobile apps. These applications are built from large amounts of code and generally have long development cycles, so they are likely to contain various security vulnerabilities. Security vulnerabilities not only affect the software and the server itself, but also threaten users, leading to information leakage, property damage, and other consequences. How to efficiently mine source code for vulnerabilities is therefore an important research topic, with great significance for the development of software application architecture and the construction of network security.

Existing vulnerability detection methods fall into three types. The first is manual code audit, which relies on the experience of security experts to review the code. This method cannot guarantee the quality of vulnerability mining, and as the volume of code grows, relying on human effort becomes increasingly unrealistic. The second is rule-based code security detection tools. Although this method has some vulnerability detection capability, it is prone to high false negative and false positive rates. The third is detection based on code data flow, which uses taint tracking to determine whether a security risk exists. This method is commonly used in commercial vulnerability mining software, which is expensive and difficult to deploy widely, and it is less effective at detecting new or variant types of vulnerabilities.

To solve these problems, this thesis proposes a key technology for vulnerability detection based on deep learning. The core of this technology has two aspects. The first is a bytecode-based feature extraction method. Bytecode is the intermediate result produced when source code is compiled and run, and it can represent vulnerability features abstractly. Existing research and tools all extract features or rules from source code, which makes it difficult to express a vulnerability accurately. This thesis uses code processing to automatically extract the bytecode slices of a vulnerability and then converts them into numeric vectors, removing the meaningless characters found in source code. This is more conducive to learning and helps avoid overfitting. The second is a deep learning model that differs from a traditional classification model. This thesis uses LSTM neurons as the basis of the neural network. The neurons are bidirectionally linked and divided into two groups, which receive the target code and the vulnerability template respectively. For each input pair, the similarity between the two inputs is calculated, and a vulnerability is reported if the final result exceeds a set threshold.

This thesis also designs and implements a vulnerability detection system called Vulnerability Hunter (VulHunter). The system has a visual operation interface and accepts input in multiple ways. After a task is started, the system automatically extracts the bytecode slices, converts them into numeric vectors, uses the deep learning model to calculate the similarity between the target code and the vulnerability template, and determines whether a vulnerability exists based on the similarity value.

To evaluate the effectiveness of the system, this thesis uses PHP software as an example and detects SQL injection and cross-site scripting (XSS) vulnerabilities. Experimental results show that the system achieves an F1-measure of 88% (SQL injection) and 95% (XSS) when detecting a single type of vulnerability, and an F1-measure of more than 90% when detecting multiple types of vulnerabilities. In addition, it has a lower false positive rate (FPR) and false negative rate (FNR) than existing methods or tools. In practice, this thesis uses VulHunter to scan three real PHP applications (SEACMS, ZZCMS, and CMS Made Simple). Five vulnerabilities were discovered, three of which had not been disclosed before.
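The detection flow described in the abstract (bytecode slice, numeric vector, similarity against a vulnerability template, threshold decision) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the bidirectional LSTM model is replaced here by a plain cosine similarity over opcode counts, which ignores the instruction order that an LSTM would capture. The opcode names, the threshold of 0.8, the slice length of 16, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def build_vocab(slices):
    """Map each distinct opcode token to a positive integer (0 is reserved for padding)."""
    vocab = {}
    for opcodes in slices:
        for token in opcodes:
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def encode(opcodes, vocab, length=16):
    """Turn an opcode slice into a fixed-length integer vector; unknown tokens become 0."""
    ids = [vocab.get(token, 0) for token in opcodes][:length]
    return ids + [0] * (length - len(ids))


def similarity(vec_a, vec_b):
    """Stand-in for the learned model: cosine similarity of opcode counts, padding ignored."""
    counts_a = Counter(x for x in vec_a if x)
    counts_b = Counter(x for x in vec_b if x)
    dot = sum(counts_a[k] * counts_b[k] for k in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_vulnerable(target, template, vocab, threshold=0.8):
    """Report a vulnerability when the similarity exceeds the (assumed) threshold."""
    return similarity(encode(target, vocab), encode(template, vocab)) > threshold


# Illustrative opcode slices; real bytecode slices would come from the extraction step.
template = ["FETCH_R", "CONCAT", "SEND_VAR", "DO_FCALL"]
suspicious = ["FETCH_R", "ASSIGN", "CONCAT", "SEND_VAR", "DO_FCALL"]
benign = ["ASSIGN", "ECHO", "RETURN"]

vocab = build_vocab([template, suspicious, benign])
print(is_vulnerable(suspicious, template, vocab))
print(is_vulnerable(benign, template, vocab))
```

In this toy setup the suspicious slice shares four of its five opcodes with the template and is flagged, while the benign slice shares none and is not. A trained model would replace `similarity`, and the threshold would be tuned on labelled data.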
Keywords/Search Tags: Vulnerability mining, Deep learning, Network security