Research on Software Vulnerability Detection Technology Based on Static Taint Analysis and Deep Learning

Posted on: 2022-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: X D Yan
Full Text: PDF
GTID: 2518306572959749
Subject: Computer technology
Abstract/Summary:
In recent years, as Internet technology has spread across industries, the number and scale of software systems have grown explosively. During development, developer negligence or limitations of the programming language introduce defects that malicious attackers can discover and exploit. As software security issues become more prominent, researchers are paying increasing attention to vulnerability detection. However, because software structure keeps growing more complex, manual review alone falls far short of the ever-increasing testing demand. With the widespread application of deep learning, deep-learning-based vulnerability detection has become a new research direction. Earlier work that applied deep learning to source code for vulnerability mining usually treated the code as natural language text and trained a sequence-based recurrent neural network on it. Although this avoids manually extracting vulnerability features, it ignores the semantic structure of the source code and cannot capture the data flow and control flow information implicit in it, so the analysis results have high false negative and false positive rates.

To address these problems, this paper studies a vulnerability detection model that combines static taint analysis with deep learning. First, it studies a static taint analysis method based on patch comparison. Comparing the code before and after a patch produces a difference file that accurately locates the problematic part of the program, so the analysis can focus on the code related to the vulnerability instead of the entire file. This reduces the complexity of the analysis and improves its accuracy. The paper selects taint sources in the generated difference file according to a set of taint source selection rules, then uses the difference lines in that file to locate the corresponding difference node on the control flow graph. Starting from this node, it performs backward and forward taint propagation analysis, each according to its own propagation rules. The backward and forward taint paths obtained in this way are spliced together with the difference line to form the complete taint path.

Next, the paper studies a deep-learning-based vulnerability detection model that classifies vulnerabilities. To learn vulnerability features more effectively, the taint path is normalized: strings, comments, and similar information are removed, and user-defined variable and function names are replaced with standardized names. To produce input the deep learning model can consume, a word embedding model converts the normalized taint path into vectors. Finally, a detection model is built on a bidirectional long short-term memory (BiLSTM) neural network, which learns long-term dependencies, captures the data dependence and control dependence relationships in the taint path, and extracts vulnerability features automatically, thereby improving detection. Experiments show that the taint path generation method based on static taint analysis and the deep-learning-based vulnerability detection model presented in this paper achieve good detection results on the data set.
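The normalization step described in the abstract (removing strings and comments, then replacing user-defined variable and function names) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the regex-based tokenization, the small keyword and library-function whitelists, the `VAR1`/`FUN1` naming scheme, and the function name `normalize_path` are all hypothetical choices made here, and the code assumes the taint path is a list of C statements with no comment spanning more than one entry.

```python
import re

# Assumed, deliberately small whitelists; a real tool would use a full C lexer
# and a complete list of keywords and library/API functions.
C_KEYWORDS = {"char", "int", "long", "unsigned", "void", "const", "struct",
              "if", "else", "for", "while", "return", "sizeof"}
LIBRARY_FUNCS = {"memcpy", "strcpy", "strncpy", "strlen", "malloc", "free",
                 "printf", "gets", "recv"}

# Literals and comments are matched in one pass, so "//" inside a string
# literal is not mistaken for the start of a comment.
LITERAL_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\])*"'      # string literal
    r"|'(?:\\.|[^'\\])*'"     # character literal
    r"|/\*.*?\*/"             # block comment (within one entry)
    r"|//[^\n]*",             # line comment
    re.DOTALL,
)
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*")


def _strip(match):
    """Empty out literals and drop comments entirely."""
    text = match.group(0)
    if text[0] == '"':
        return '""'
    if text[0] == "'":
        return "''"
    return ""


def normalize_path(lines):
    """Return the statements of a taint path with literals and comments
    removed and user-defined names mapped to VAR<n> / FUN<n> (hypothetical
    naming scheme). Names are numbered in order of first appearance."""
    var_names, func_names = {}, {}

    def rename(match):
        name = match.group(0)
        if name in C_KEYWORDS or name in LIBRARY_FUNCS:
            return name  # keep language keywords and known library calls
        # Heuristic: an identifier followed by "(" is treated as a function.
        is_call = match.string[match.end():].lstrip().startswith("(")
        table, prefix = (func_names, "FUN") if is_call else (var_names, "VAR")
        if name not in table:
            table[name] = f"{prefix}{len(table) + 1}"
        return table[name]

    normalized = []
    for line in lines:
        line = LITERAL_OR_COMMENT.sub(_strip, line)
        line = IDENTIFIER.sub(rename, line).strip()
        if line:  # skip entries that contained only a comment
            normalized.append(line)
    return normalized


if __name__ == "__main__":
    example_path = [
        "char buf[16]; /* fixed-size buffer */",
        "int n = read_len(sock); // attacker-controlled",
        "memcpy(buf, src, n);",
        'log_msg("copied %d bytes", n);',
    ]
    for statement in normalize_path(example_path):
        print(statement)
```

Mapping names consistently within one path keeps the data dependence visible (the same buffer is always `VAR1`) while hiding project-specific identifiers, which is presumably why such normalization helps the embedding and BiLSTM stages generalize across programs.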
Keywords/Search Tags:Vulnerability detection, Taint analysis, Recurrent neural network, Control flow graph