
Research On Binary Software Vulnerability Mining Technology Based On Machine Learning

Posted on: 2020-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Rong
Full Text: PDF
GTID: 2428330572472258
Subject: Information security
Abstract/Summary:
As code bases grow in size and complexity, they contain more and more vulnerabilities that attackers can easily exploit to cause logic errors in the original program. To detect and fix software vulnerabilities as early as possible, binary software vulnerability mining has become one of the hot topics in security research. Binary vulnerability detection models based on machine learning can process large-scale data with fast detection speed and low detection cost. However, because binary-level software does not directly express program information, it is difficult to extract effective features from it, so existing machine-learning-based binary vulnerability mining methods often have high false negative and false positive rates. This paper combines machine learning and natural language processing to propose a binary feature extraction method, and then designs and implements a vulnerability detection system on the Android platform. The main work and results of this paper are as follows:

1. By studying binary file preprocessing and word embedding technology, a feature vectorization model based on Basic language is proposed. With this model, feature vectors that capture the contextual relationships among the words in assembly instructions can be preliminarily constructed from a binary file.
2. Through research on deep neural networks, the Att-BLSTM vulnerability feature extraction model is proposed. The core of the model is a bidirectional long short-term memory network (BLSTM) combined with an attention mechanism, which can extract features rich in program semantic information from binary files.
3. Using vulnerability information collected from vulnerability information publishing platforms for the years 2000 to 2018, this paper creates a dataset of Android platform dynamic link libraries (a kind of binary file). Before this work, the author found no vulnerability dataset for Android platform dynamic link libraries, either on the Internet or in other papers.
4. Based on the two models above, this paper designs and implements a binary vulnerability detection system on the Android platform.

The test results show that, compared with existing machine-learning-based binary vulnerability mining methods, the proposed model learns the program semantic information of binary files better. The vulnerability detection system built on these models achieves a maximum accuracy of 93.86%.
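The abstract does not say how the attention layer in Att-BLSTM is computed. As a rough illustration only, the sketch below assumes one common form of attention pooling: each per-token hidden state is scored against a weight vector, the scores are normalized with a softmax, and the hidden states are summed with those weights. The function name `attention_pool`, the toy hidden states, and the `context` vector are all hypothetical; in a real model the hidden states would come from the BLSTM and the weight vector would be learned.

```python
import math


def attention_pool(hidden_states, context):
    """Collapse a sequence of hidden states into one vector via attention.

    hidden_states: list of T vectors (lists of d floats), standing in for
                   the per-token outputs of a BLSTM (assumption).
    context:       list of d floats, standing in for a learned weight
                   vector (assumption).
    Returns (pooled_vector, attention_weights).
    """
    # Score each time step as context . tanh(h_t).
    scores = [
        sum(w * math.tanh(h) for w, h in zip(context, state))
        for state in hidden_states
    ]

    # Softmax over time steps, shifted by the max score for stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the hidden states, one output value per dimension.
    dim = len(hidden_states[0])
    pooled = [
        sum(a * state[i] for a, state in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return pooled, weights


# Toy input: three "instruction tokens", each with a 2-dimensional state.
states = [[0.1, -0.2], [0.9, 0.4], [0.0, 0.3]]
context = [1.0, 0.5]
pooled, weights = attention_pool(states, context)
```

The pooled vector has the same dimension as a single hidden state, so it can be passed to a classifier regardless of how many instructions the input contains, and the weights indicate which tokens contributed most.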
Keywords/Search Tags: software vulnerability mining, machine learning, word embedding, BLSTM, attention mechanism