Research On Control Flow Reconstruction Of Multi-source Decompilation

Posted on: 2014-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: D Li
Full Text: PDF
GTID: 2268330401976784
Subject: Computer Science and Technology
Abstract/Summary:
Control flow reconstruction from binary code is an active research topic in software reverse engineering and plays an important role in software understanding and maintenance. Multi-source decompilation refers to decompilation techniques that target a variety of processor architectures and binary file formats. Traditional control flow reconstruction techniques are tied to a specific processor architecture and its executable file format. The aim of this thesis is to solve the problem of control flow reconstruction for binaries on different processor architectures within a unified technical framework, and to provide a normalized solution for control flow reconstruction in software porting, vulnerability mining, and malicious code detection across architectures. The approach also applies to the platform dependence of data flow and data dependence analysis in multi-source decompilation.

In this thesis, control flow reconstruction in multi-source decompilation is based on a platform-independent intermediate language. First, binary code for different platforms is lifted to an intermediate language representation by a platform-dependent front end; follow-up analyses, such as control flow, data flow, and data dependence analysis, are then performed on the intermediate language by a platform-independent back end. To address platform scalability and semantic unification, the key point of this thesis is the code conversion technique in the front end. A two-level conversion technique converts object code for different processor architectures, such as x86, ppc32, and amd64, into the BAIL intermediate language, which has a unified semantic representation.
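The two-level conversion idea can be illustrated with a minimal Python sketch. The abstract does not say what the two levels are, so this assumes that the first level normalizes architecture-specific instructions into a small architecture-neutral operation set and that the second level emits intermediate-language statements from those operations. The names `GenericOp`, `LEVEL1`, `lift_level1`, `lift_level2`, and `lift` are hypothetical, and the emitted statement strings are illustrative only, not BAIL syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenericOp:
    """Architecture-neutral operation (hypothetical first-level output)."""
    kind: str        # "add", "move", or "jump"
    operands: tuple  # normalized operands, destination first


# Level 1 (platform-dependent): one table per architecture that normalizes
# native mnemonics and operand conventions. Illustrative entries only.
LEVEL1 = {
    "x86": {
        # x86 "add dst, src" reads and writes dst.
        "add": lambda ops: GenericOp("add", (ops[0], ops[0], ops[1])),
        "mov": lambda ops: GenericOp("move", (ops[0], ops[1])),
        "jmp": lambda ops: GenericOp("jump", (ops[0],)),
    },
    "ppc32": {
        # PowerPC "add rD, rA, rB" has a separate destination register.
        "add": lambda ops: GenericOp("add", (ops[0], ops[1], ops[2])),
        "mr": lambda ops: GenericOp("move", (ops[0], ops[1])),
        "b": lambda ops: GenericOp("jump", (ops[0],)),
    },
}


def lift_level1(arch, mnemonic, operands):
    """Map one native instruction to a GenericOp using the per-arch table."""
    table = LEVEL1[arch]
    if mnemonic not in table:
        raise ValueError(f"unsupported {arch} instruction: {mnemonic}")
    return table[mnemonic](list(operands))


def lift_level2(op):
    """Level 2 (shared by all architectures): emit an IR-like statement."""
    if op.kind == "add":
        dst, lhs, rhs = op.operands
        return f"{dst} = {lhs} + {rhs}"
    if op.kind == "move":
        dst, src = op.operands
        return f"{dst} = {src}"
    if op.kind == "jump":
        (target,) = op.operands
        return f"goto {target}"
    raise ValueError(f"unknown operation kind: {op.kind}")


def lift(arch, mnemonic, operands):
    """Run both levels for a single instruction."""
    return lift_level2(lift_level1(arch, mnemonic, operands))
```

Under these assumptions, supporting a new architecture only requires a new first-level table; the second level and every analysis built on its output stay unchanged.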
Starting from an initial control flow graph, a control flow graph correction technique based on dynamic execution is proposed to improve the completeness and accuracy of the graph. It uses constraints from the static control flow graph together with dynamic information from execution tracing to identify indirect branch targets.

We construct a prototype system for control flow reconstruction in multi-source decompilation and provide an extensible interface, which makes it possible to support control flow reconstruction, data flow analysis, and data dependence analysis for a new architecture. Test results indicate that the prototype system can perform code conversion and control flow reconstruction, laying a foundation for multi-source decompilation.
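The control flow graph correction step described above can be sketched in Python as follows. The abstract does not spell out the constraint or the trace format, so this sketch assumes that the trace is a sequence of executed basic-block addresses and that the static constraint is simply "the target must be a known block start". The function `correct_cfg` and its parameters are hypothetical and are not the thesis's algorithm.

```python
def correct_cfg(cfg, indirect_sites, valid_targets, trace):
    """Add indirect-branch edges observed in an execution trace.

    cfg            -- dict: block address -> set of successor addresses
    indirect_sites -- set of blocks that end in an indirect branch
    valid_targets  -- set of addresses the static analysis accepts as
                      block starts (the assumed static constraint)
    trace          -- list of block addresses in execution order
    """
    # Copy the graph so the static result is left untouched.
    corrected = {node: set(succs) for node, succs in cfg.items()}

    # Each consecutive pair in the trace is an executed control transfer.
    for src, dst in zip(trace, trace[1:]):
        # Only fill in edges for indirect branches, and only when the
        # observed target satisfies the static constraint.
        if src in indirect_sites and dst in valid_targets:
            corrected.setdefault(src, set()).add(dst)
            corrected.setdefault(dst, set())
    return corrected
```

A short usage example: with `cfg = {0x100: {0x110}, 0x110: set(), 0x200: set()}`, `indirect_sites = {0x110}`, `valid_targets = {0x100, 0x110, 0x200}`, and `trace = [0x100, 0x110, 0x200]`, the corrected graph gains the edge from `0x110` to `0x200`. A single trace only reveals targets that were actually executed, so coverage depends on the inputs used to drive execution.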
Keywords/Search Tags: Multi-source Decompilation, Intermediate Language, Control Flow Reconstruction, Indirect Branch Instruction, Trace Constraint, Dynamic Execution