
Research On Binary Code Vulnerability Analysis Technology Based On Intermediate Representation

Posted on: 2020-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Liu
Full Text: PDF
GTID: 2428330572972257
Subject: Information security
Abstract/Summary:
As software vulnerabilities continue to be disclosed, software security incidents occur one after another. Vulnerabilities have become a major factor affecting software security, and vulnerability detection has become an important problem in the field of information security. Depending on the object being analyzed, existing vulnerability analysis methods can be divided into source-level and binary-level vulnerability detection. Although vulnerability analysis for source code is relatively mature, its techniques cannot be applied directly to detect security defects in binary code, because the encoding and structural characteristics of binary code differ from those of source code. Because binary code lacks semantic information and is complex to analyze, few mature binary code vulnerability detection tools are available. However, more and more software is released in binary form, so exploring methods and techniques for vulnerability detection in binary code is urgent and necessary. This paper takes Android SO files as its research object and proposes a novel intermediate language, BinaryLift, that can be used for vulnerability detection, together with a new binary code semantic feature representation, ProF. It then proposes a vulnerability detection system based on BinaryLift and ProF. The main contributions of this paper are:

1. Design and implementation of the intermediate language BinaryLift: This paper analyzes the limitations of existing intermediate languages for binary code and proposes a novel intermediate language, BinaryLift. BinaryLift is translated from the assembly language of the binary code. It accurately and completely reflects the original information of the code, and some of the program's high-level semantic information is restored during translation, which makes up for the shortcomings of analyzing assembly language directly. In addition, this paper provides a detailed grammar definition of BinaryLift, making it possible to perform syntactic analysis of binary code.

2. Design and implementation of the semantic feature ProF: To improve the accuracy and efficiency of vulnerability detection, this paper analyzes common vulnerabilities in binary code and further transforms BinaryLift into a novel binary code semantic feature, ProF. ProF describes the expected result of a particular action under an optional condition. It combines function call information, data flow information, and control flow information, making it a multi-dimensional semantic feature of binary code.

3. Vulnerability detection and effectiveness evaluation: To automate vulnerability detection and avoid introducing human factors into the detection process, this paper uses natural language processing techniques to extract program features automatically and builds a vulnerability detection model from existing data. Through a series of comparative analyses, it verifies the effectiveness of BinaryLift and ProF. The experimental results show that the average detection F1-measure with BinaryLift reaches 80%, which is 7% higher than the result obtained with assembly language and 3% higher than the result obtained with Vine IL, the intermediate language used in BitBlaze. ProF reaches 94%, which is 14% higher than the BinaryLift result and 20% higher than the sequence of standard library calls proposed by Grieco. Analysis of the vulnerability samples further demonstrates the practicality and interpretability of ProF.
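The results above are reported as F1-measure, the harmonic mean of precision and recall. As a reminder of how that metric is computed, here is a minimal sketch. The function name `f1_measure` and the counts are hypothetical and chosen only for illustration; they are not taken from the thesis, and the abstract does not say how the per-experiment scores were averaged.

```python
def f1_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F1-measure from true-positive, false-positive and
    false-negative counts (harmonic mean of precision and recall)."""
    if tp == 0:
        # With no true positives, precision and recall are zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)  # share of flagged samples that are truly vulnerable
    recall = tp / (tp + fn)     # share of vulnerable samples that were flagged
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not data from the thesis).
score = f1_measure(tp=8, fp=2, fn=2)
print(f"F1 = {score:.2f}")
```

With these illustrative counts, precision and recall are both 0.8, so the F1-measure is also 0.8.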
Keywords/Search Tags:binary code, vulnerability analysis, intermediate representation, intermediate language