
Research For Measuring Software Similarity Based On Graphs

Posted on: 2010-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: G N Wang
Full Text: PDF
GTID: 2178330338982302
Subject: Computer applications
Abstract/Summary:
Driven by economic interests, software theft is very common in the software industry. In most cases, the thief duplicates a whole software product or steals part of one (e.g. modules or code fragments) and makes some changes before reusing it in another product without permission. Source-code-based plagiarism detection is mature but remains highly vulnerable to code transformations. It has therefore become important to detect the similarity between two programs without access to source code, and the similarity of the dynamic behavior that programs exhibit during execution has become a research focus in recent years. In this thesis, we study program behavior based on the dependency relationships between system calls, define a new dynamic birthmark, and propose an algorithm for comparing the similarity of birthmarks. The main contents are as follows.

First, we collected and analyzed the system call sequence made by a program and divided it into sub-sequences of varying length according to the program's local behavior. We then selected a subset of the system call parameters and mined the relationships between them as the basis for describing program behavior. On this basis, we defined a graphical language for specifying the relationships between parameters, which completes the graph-based modeling of program behavior.

Second, a dynamic birthmark was defined on the basis of this graph model of program behavior: the set of behavior subgraphs obtained from the program after graph modeling, together with the probability with which each subgraph appears in the program.
The probability of a subgraph is defined as the ratio of the number of times it occurs to the total number of subgraphs. Then, an algorithm for comparing the similarity of two dynamic birthmarks was proposed, along with a metric for calculating it.

Finally, the proposed birthmark was evaluated experimentally by measuring the similarity of pairs of programs drawn from several groups with different functions. The results show that it accurately detects both the similarity between two programs with similar implementations and the difference between two programs with different implementations, which supports the credibility and applicability of the proposed birthmark. In addition, in comparison with previous research, we analyzed the robustness of the proposed birthmark against several existing API obfuscation techniques and pointed out its strengths and weaknesses with respect to security.

Future work is identified in three areas: improving the applicability and accuracy of graph-based modeling of program behavior, defining a similarity threshold, and improving the security of the dynamic birthmark.
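
To make the idea of a birthmark built from behavior subgraphs and their probabilities concrete, the following is a minimal Python sketch under several assumptions that the abstract does not state. The trace format (name, arguments, return value), the dependency rule (a return value reused as a later argument), exact matching of subgraphs as edge sets, and the use of histogram intersection as the similarity metric are all illustrative stand-ins and are not the thesis's graphical language or its actual metric.

```python
from collections import Counter

def dependency_edges(calls):
    """Build one behavior subgraph from a sub-sequence of system calls.

    Assumed rule (hypothetical): add an edge A -> B when the return value
    of call A appears among the arguments of a later call B.
    """
    edges = set()
    for i, (name_i, _args_i, ret_i) in enumerate(calls):
        if ret_i is None:
            continue
        for name_j, args_j, _ret_j in calls[i + 1:]:
            if ret_i in args_j:
                edges.add((name_i, name_j))
    return frozenset(edges)  # hashable, so identical subgraphs can be counted

def birthmark(subsequences):
    """Map each distinct subgraph to (its occurrences / total subgraphs)."""
    graphs = [dependency_edges(s) for s in subsequences]
    counts = Counter(graphs)
    total = len(graphs)
    return {g: c / total for g, c in counts.items()}

def similarity(bm_p, bm_q):
    """Assumed metric: histogram intersection of the two distributions."""
    return sum(min(p, bm_q.get(g, 0.0)) for g, p in bm_p.items())

# Hypothetical sub-sequences: (syscall name, arguments, return value).
read_file = [("open", ("a.txt",), 3), ("read", (3, 100), 100), ("close", (3,), 0)]
send_data = [("socket", (2, 1), 4), ("connect", (4, "host"), 0), ("send", (4, "data"), 64)]

bm_p = birthmark([read_file, read_file, send_data])
bm_q = birthmark([read_file, send_data, send_data])
print(similarity(bm_p, bm_q))
```

Under this assumed metric the score lies in [0, 1]: it reaches 1 when both programs exhibit the same subgraphs with the same probabilities and 0 when they share no subgraph. A realistic implementation would likely need approximate subgraph matching, since exact comparison of edge sets is easily broken by small differences in a trace.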
Keywords/Search Tags:software theft, similarity, dynamic program birthmark, graphs, system call