| With the development of computer technology, software keeps growing in number and complexity to meet people's daily needs, and software security problems have become increasingly severe. In recent years, attacks that exploit software vulnerabilities and cause serious consequences have appeared one after another, so how to mine software vulnerabilities accurately and efficiently has become a growing concern. Among these, the buffer overflow is the most widespread and destructive class of vulnerability in software security. An attacker can use it to corrupt the program stack, crash the program, or even execute attacker-supplied instructions and cause greater losses. Traditional buffer overflow vulnerability mining methods require researchers to understand buffer overflow vulnerabilities thoroughly and to analyze program code manually; mining vulnerabilities with preset rules involves a heavy workload and yields low efficiency and accuracy. To mine buffer overflow vulnerabilities more efficiently and accurately, this paper designs a model that uses deep learning to mine buffer overflow vulnerabilities at the source code level. The main work of this paper is as follows: (1) This paper first studies how buffer overflow vulnerabilities arise and analyzes the common sensitive functions that tend to cause them. The program source code is then parsed into a code property graph so that the semantic information in the source code is fully extracted. Based on the sensitive functions identified in the analysis, vulnerability information is obtained and a source code preprocessing algorithm is proposed. Finally, the preprocessing algorithm extracts the potentially vulnerable code from the program and generates vulnerability code blocks. (2) Based on the parsed code property graph, this paper analyzes how buffer-overflow-sensitive functions are called, applies a different slicing method to each calling pattern to turn vulnerability code blocks into code slices, and proposes a code slice construction algorithm. The code slices are then preprocessed and vectorized with Word2vec. (3) This paper builds a vulnerability mining model based on a Bi-GRU-Attn network and applies two optimization methods to it, achieving automatic buffer overflow vulnerability mining at the code level. The bidirectional GRU network captures context in both directions, the particle swarm method efficiently searches for the optimal weights of the network model, and the Keepy method ensures that the encoding-layer vector h is fully used. The model also introduces an attention mechanism that assigns different weights to information of different importance. (4) The model is tested on the CWE-119 (buffer error) data set from the SARD benchmark. Experiments verify the effectiveness of the proposed method and, separately, of the source code preprocessing method, the code slice construction method, the attention mechanism, and the two optimization algorithms. Finally, the model is compared with recent methods. The results show that the model can effectively mine buffer overflow vulnerabilities and achieves better results in accuracy and false positive rate. |
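
As a rough illustration of the first step in item (1), the sketch below flags calls to C library functions that are commonly associated with buffer overflows. It is a simplified stand-in: it uses a regular expression over raw source text rather than the code property graph described in the abstract, and the list `SENSITIVE_FUNCTIONS` and the helper `find_sensitive_calls` are assumptions made for illustration, not the paper's actual function list or preprocessing algorithm.

```python
import re

# Assumed, illustrative list of C functions often linked to buffer overflows.
SENSITIVE_FUNCTIONS = ("strcpy", "strcat", "sprintf", "gets", "memcpy", "scanf")

# Match a sensitive function name followed by an opening parenthesis.
_CALL_PATTERN = re.compile(r"\b(" + "|".join(SENSITIVE_FUNCTIONS) + r")\s*\(")


def find_sensitive_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, function_name) pairs for each sensitive call found."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in _CALL_PATTERN.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


if __name__ == "__main__":
    example = """\
void copy(char *src) {
    char buf[16];
    strcpy(buf, src);
    printf("%s", buf);
}
"""
    for line_number, name in find_sensitive_calls(example):
        print(f"line {line_number}: {name}")
```

A text-level scan like this only locates candidate call sites. It cannot tell whether a call is actually unsafe, which is why the approach in the abstract relies on a parsed graph and code slices around each call.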