As digital information becomes ever easier to share and transmit, infringement of software products is becoming increasingly serious. Of the three main forms of software infringement (software piracy, software tampering, and reverse engineering), piracy is the most harmful to the software industry: it costs the global software industry 15 billion U.S. dollars every year. As an effective countermeasure against piracy, software watermarking is attracting growing attention. Software watermarking embeds copyright information or user identity information into a software product in order to protect the product's intellectual property. When a product is pirated, the copyright holder can extract the watermark from the software to assert copyright, or use the extracted watermark to identify the distributor of the pirated copy. Software watermarking techniques are divided into static software watermarking and dynamic software watermarking. The main difference is that static watermarking embeds the watermark directly into the program's content, whereas dynamic watermarking embeds it into the program's runtime structures.
Among dynamic software watermarking techniques, the Dynamic Graph Watermark (DGW) is currently the main research focus. The Collberg/Thomborson algorithm, proposed by Collberg and Thomborson, was the first DGW algorithm. Its main idea is to encode the watermark as a particular graph structure and then insert into the program a code segment that generates that graph structure. In the recognition phase, the program is run with a special input known to the user, which causes the graph structure to be built on the heap; the graph is then extracted from the heap and decoded back into the watermark. The most important part of DGW is the encoding and decoding of the graph structure. The main graph encoding schemes at present are the Radix-K linked list, Permutation Graph, RPG, and PPCT, together with two schemes proposed in recent years, IPPCT and PIPPCT. Constructing a graph structure that offers a high data rate and some capacity for self-recovery is the current research focus. Research on software watermarking falls mainly into three areas: proposing new watermarking techniques, improving the protocols under which watermarking is applied, and evaluating watermarking techniques. Of the three, evaluation is especially significant: it supports the analysis of an algorithm's strengths and weaknesses, the design of new algorithms, and the selection of the best algorithm for the particular features of different programs. The main content of this article is a comprehensive evaluation of the principal software watermarking algorithms in current use.
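The graph encoding step described above can be illustrated with a minimal sketch in the spirit of the Radix-K linked list. This is a simplified variant written for illustration, not the exact scheme used by Collberg and Thomborson or by Sandmark: it assumes a circular list of m nodes in which each node's data pointer points d steps ahead along the ring to encode one base-m digit d, and it assumes the recognizer already knows which node is the head. The names `Node`, `encode`, and `decode` are hypothetical.

```python
from typing import List, Optional


class Node:
    """One heap node: `next` forms the circular spine, `data` carries a digit."""

    def __init__(self, index: int) -> None:
        self.index = index
        self.next: Optional["Node"] = None
        self.data: Optional["Node"] = None


def encode(value: int, m: int) -> List[Node]:
    """Build a ring of m nodes whose data pointers encode `value` in base m."""
    if m < 1 or value < 0 or value >= m ** m:
        raise ValueError("value does not fit in m base-m digits")
    nodes = [Node(i) for i in range(m)]
    for i, node in enumerate(nodes):
        node.next = nodes[(i + 1) % m]       # spine pointer closes the ring
    for i, node in enumerate(nodes):
        value, digit = divmod(value, m)      # least significant digit first
        node.data = nodes[(i + digit) % m]   # point `digit` steps ahead
    return nodes


def decode(head: Node) -> int:
    """Recover the integer by walking the spine and measuring pointer offsets."""
    ring = []
    node = head
    while True:
        ring.append(node)
        node = node.next
        if node is head:
            break
    m = len(ring)
    position = {id(n): i for i, n in enumerate(ring)}
    value = 0
    for i, n in enumerate(ring):
        digit = (position[id(n.data)] - i) % m
        value += digit * m ** i
    return value
```

Under these assumptions, `decode(encode(w, m)[0])` returns `w` for any `0 <= w < m ** m`, which shows why the data rate of such a scheme depends on how many distinct values a graph of a given size can represent.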
However, after summarizing and analyzing previous research on the evaluation of watermarking algorithms, we found that existing studies generally use non-uniform evaluation standards and lack quantitative analysis. This article therefore adopts the following six unified standards for evaluating software watermarking algorithms:
1. Credibility: In the recognition phase, the algorithm should extract exactly the watermark that was embedded.
2. Data-rate: The algorithm should have a data rate high enough to permit the embedding of a reasonably sized watermark.
3. Overhead: Embedding a watermark should have little impact on the performance of the program, and both embedding and recognition should be fast.
4. Part protection: To protect the watermark, it should be spread throughout the program.
5. Resiliency: The algorithm must be resilient to a variety of watermark attacks, such as obfuscation and optimization.
6. Stealth: The embedded watermark should resemble the host program, or programs in general, so that it is harder for an attacker to discover and locate.
Of these six standards, this article evaluates the principal software watermarking algorithms against four: credibility, stealth, resiliency, and data-rate.
For credibility, we consider three situations that may arise in the recognition phase: a false positive, a false negative, and the extraction of a watermark that differs from the one embedded. Each of these reduces the credibility of a watermarking algorithm, so evaluating credibility means estimating the probability that any of the three occurs during recognition; the smaller this probability, the better the credibility of the algorithm. For stealth, we assume that stealth is represented by the distance between the features of a watermarked program and the features of general programs; the smaller the distance, the better the stealth of the algorithm. We discuss stealth from two aspects, static and dynamic feature distance. The static feature distance is the distance between the frequencies of the various instructions in the watermarked program and those in general programs. The dynamic feature distance is the distance between the frequencies of the various instruction patterns (sequences of n consecutive instructions, 2≤n≤4) in the watermarked program and those in general programs. For resilience, we assume that resilience is represented by the algorithm's ability to withstand the currently known types of watermarking attack. The major attack types at present are the subtractive attack, the additive attack, the distortive attack, and the collusive attack. Of these, the distortive attack is the most harmful, because it can apply a variety of semantics-preserving transformations to a program without affecting the functions of the original program. This article therefore mainly uses the distortive attack to measure resilience. For data-rate, we assume that data rate is represented by the efficiency with which the algorithm represents a watermark of a given size.
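The two stealth distances described above can be sketched as follows. The text does not state which distance metric is used, so this sketch assumes the Euclidean distance between relative-frequency vectors, and it assumes that both the watermarked program and the reference corpus of general programs are available as flat lists of opcode names. The function names are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Sequence, Tuple

Gram = Tuple[str, ...]


def ngram_frequencies(instructions: Sequence[str], n: int) -> Dict[Gram, float]:
    """Relative frequency of every run of n consecutive instructions."""
    grams = [tuple(instructions[i:i + n]) for i in range(len(instructions) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


def distance(p: Dict[Gram, float], q: Dict[Gram, float]) -> float:
    """Euclidean distance between two frequency tables (an assumed metric)."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def static_feature_distance(marked: Sequence[str], reference: Sequence[str]) -> float:
    """Distance between single-instruction frequencies (n = 1)."""
    return distance(ngram_frequencies(marked, 1), ngram_frequencies(reference, 1))


def dynamic_feature_distance(marked: Sequence[str], reference: Sequence[str]) -> Dict[int, float]:
    """Distance between instruction-pattern frequencies for n = 2, 3, 4."""
    return {
        n: distance(ngram_frequencies(marked, n), ngram_frequencies(reference, n))
        for n in (2, 3, 4)
    }
```

Under this sketch, a watermarked program whose instruction mix matches the reference corpus yields distances near zero, which corresponds to good stealth in the sense used above.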
Data rate can be divided into static data rate and dynamic data rate. A static watermarking algorithm has only a static data rate, measured by the size that the watermark adds to the program's content per bit embedded. A dynamic watermarking algorithm has both a static and a dynamic data rate; the dynamic data rate is measured by the runtime resources (mainly memory) that the watermark adds. To carry out the evaluation, we use Sandmark as the experimental platform. Sandmark provides 16 of the principal current watermarking algorithms and 39 obfuscation algorithms, as well as various evaluation tools. In this article we use these algorithms and tools to evaluate software watermarking algorithms against the standards described above. Because Sandmark is built on a plugin framework, researchers can implement their own watermarking algorithms in it. In this article we implement two new DGW graph encoding schemes, IPPCT and PIPPCT, and describe the whole process in detail. After extensive experiments, we obtained credibility, stealth, resilience, and data-rate measurements for the following watermarking algorithms: Add Expression, Add Initialization, Add Method and Field, Add Switch, Davidson/Myhrvold, GTW, Monden, Qu/Potkonjak, Register Type, Static Arboit, Steganograph, Stern, String Constant, Collberg/Thomborson, Dynamic Arboit, and Execution Path. After analyzing and summarizing these data, we conclude that all of the watermarking algorithms have good credibility, that the Collberg/Thomborson algorithm has the best stealth and resilience, and that the PIPPCT graph encoding scheme has the best data rate. Finally, we believe these data will be of help to future research on the evaluation of watermarking algorithms.