Reverse engineering starts from an executable software system and recovers its source code, its system structure, and documentation of the underlying design principles and algorithms. It has considerable practical and economic value: it avoids duplicated effort, improves the efficiency and quality of software production, and turns many legacy systems into systems that can evolve easily, so that these useful resources are put to full use. Although reverse engineering has been studied extensively in recent years, the field is still far from mature, and no uniform standard, method, or process has yet emerged. Further in-depth research on reverse engineering techniques is therefore necessary.

Reverse engineering applies to every stage of the software life cycle and to every level of abstraction, including requirements, design, and implementation. At a low level of abstraction it can translate binary code into source code; at a higher level it can convert source code into representations such as control flow graphs, data flow diagrams, and class relationship graphs.

Reverse engineering is generally defined as a two-step process: first, analyze the target system to identify its objects and the relationships among them; second, create a representation of the system at a higher level of abstraction or in a different form.

This thesis examines several problems in reverse engineering techniques from the perspectives of theory, method, and application. It focuses on the collection, extraction, and abstract representation of information in reverse engineering, and on the design of a program reverse analysis framework based on a relational database.

The thesis first introduces the concept of reverse engineering and some of the tools used in it.
It also discusses the development of reverse engineering.

The thesis then presents a program reverse analysis framework based on a relational database. The framework builds on the relational model and treats each component of a programming language as a separate entity that is related to other components. Information is stored in a relational database, with the relations between entities recorded as attributes. Using the framework, program information is retrieved from the extended source program by means of program understanding and analysis techniques, stored in a relational database, and then displayed visually by a database application, which helps the reader understand the software quickly.

Finally, the thesis describes the framework in detail, taking C source programs as an example.
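
The idea of storing program entities and their relations in a relational database can be illustrated with a minimal sketch. It is not the framework described in the thesis: the table names `entity` and `relation`, the helper functions, and the toy C fragment are all hypothetical, and a crude regular expression stands in for the program understanding and analysis techniques that a real tool would use.

```python
import re
import sqlite3

# Hypothetical schema (an assumption, not taken from the thesis):
# one table for program entities, one for relations between them.
SCHEMA = """
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,          -- e.g. 'function'
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE relation (
    src  INTEGER NOT NULL REFERENCES entity(id),
    dst  INTEGER NOT NULL REFERENCES entity(id),
    kind TEXT NOT NULL,          -- e.g. 'calls'
    UNIQUE (src, dst, kind)
);
"""

# Toy C input, invented for illustration.
C_SOURCE = """
int square(int x) { return x * x; }
int sum_sq(int a, int b) { return square(a) + square(b); }
int main(void) { return sum_sq(1, 2); }
"""

# Crude patterns that only handle simple, brace-free function bodies.
# A real analyzer would use a proper C parser instead.
FUNC_DEF = re.compile(r"\b\w+\s+(\w+)\s*\([^)]*\)\s*\{([^}]*)\}")
CALL = re.compile(r"\b(\w+)\s*\(")


def build_database(source):
    """Extract functions and call relations and store them in SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    bodies = {m.group(1): m.group(2) for m in FUNC_DEF.finditer(source)}
    for name in bodies:
        conn.execute(
            "INSERT INTO entity (kind, name) VALUES ('function', ?)", (name,)
        )
    ids = dict(conn.execute("SELECT name, id FROM entity"))
    for caller, body in bodies.items():
        for callee in CALL.findall(body):
            if callee in ids:  # ignore keywords and unknown names
                conn.execute(
                    "INSERT OR IGNORE INTO relation (src, dst, kind) "
                    "VALUES (?, ?, 'calls')",
                    (ids[caller], ids[callee]),
                )
    conn.commit()
    return conn


def callees(conn, name):
    """Return the names of functions called by the given function."""
    rows = conn.execute(
        "SELECT d.name FROM relation r "
        "JOIN entity s ON s.id = r.src "
        "JOIN entity d ON d.id = r.dst "
        "WHERE s.name = ? AND r.kind = 'calls' ORDER BY d.name",
        (name,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_database(C_SOURCE)
    for fn in ("main", "sum_sq", "square"):
        print(fn, "calls", callees(db, fn))
```

Once the relations are in tables, questions such as "which functions does `main` call" become ordinary SQL queries, which is the kind of retrieval a database application could then present visually.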