| With the rapid development and spreading of software products, the protection of the software copyright has been paid more and more attention. Meanwhile, the software security and piracy have become an important problem. According to the statistics, the dollar losses due to software piracy were estimated at 29 and 34 billion in 2003 and 2005 separately, and the software piracy rate has risen increasingly. The software industry is challenged to face such difficulties. Methods of the traditional cipher technology can be used to protect software products. But it only encrypts the software products when they are transmitted on the Internet and can't solve the problem totally. Software watermarking is just the technology to deal with the problem. It embeds a secret message into the cover program to protect the intellectual property from stealing or prove the ownership after the stealing occurred.Software watermarks can be classified into two categories: static watermarks and dynamic watermarks. Static watermarks can be embedded and retrieved easily but have a poor robustness. However, dynamic watermarks are resilient to attacks and have become the main field of the software watermarking technology. Now there are a lot of dynamic software watermarking schemes. Dynamic graph watermarking(DGW) technology is the representative one. The main idea of the scheme is: embed the watermarks into the topology of the DGW and the watermarks can be resilient to a lot of semantic-preserving attacks because they are built at the execution time. The key is needed to retrieve the watermarks from the heap in the memory.There are lots of attacks that attackers may launch, for example, subtractive attack which the attacker tries to modify the watermark without damaging the usability of the software program, additive attack which the attacker inserts his own watermark to override the original one, distortive attack which the attacker tries to make the watermark useless without damaging the usability of the software program, collusive attack which the attacker compares the different versions of the watermarked programs and locates the positions of the watermark.To further strength the robustness of the watermarks, tamper-proofing technology is introduced into all kinds of schemes. It makes attackers to tamper the watermarks more difficult and breaks down the program when it is altered.In this article, a novel tamper-proofing software watermarking scheme was proposed based on constant encoding technology. Its main ideas is: construct a many-to-one function with the watermark pieces and protect the constants whose values are equal to the result of the function, then distribute the pieces into opaque predicates in the program to form the interaction. The situation of one or more pieces destroyed can be felt due to the interaction characteristic and the program will not work correctly, which improves the tamper-proofing ability. It can be divided into five phases:1. watermark splitting: Use the splitting algorithm to divide the watermark W into pieces wi of k(1≤i≤k),which k can be used as the key of the algorithm.2. watermark encoding: encode the pieces wi of k into PPCT and produce the source codes.3. watermark embedding: Embed the source codes into the program and the positions of watermarks should avoid the following situations(1) watermark shouldn't be embedded in the hot positions.(2) the source codes of watermarks may not be executed.4. watermark tamper-proofing: construct a many-to-one function H with the watermark pieces of k and protect the constants whose values are equal to the result of the function, then distribute the pieces into opaque predicates in the program to form the interaction.5. watermark retrieving and recovering: Given the key I0,I1,…,Ik, The watermark can be retrieved in the heap and recovered to the original one W by the anti-splitting algorithm to prove the ownership.Above scheme presents the many-to-one function at the aspect of encryption. We summarize the advantages and disadvantages as follows:1. Introduce the splitting algorithm with parameters which can be used as the key of the algorithm. It has a high data rate because we can adjust the size and range of the watermark pieces.2. Distribute the pieces encoded by CLOC into the whole program. They are relatively small to improve the stealthy and robustness.3. Present function H to implement the many-to-one decoding function. Function H makes the decoding function independent of the mask function. The decoding function partly dependent of the pointers as well as the encryption can improve the resilience to the pattern-matching attack. Meanwhile the decoding function improves the stealth without producing the middle result.4. Distribute the pieces into opaque predicates in the program to form the interaction. The situation of one or more pieces destroyed can be felt due to the interaction characteristic and the program will not work correctly, which improves the tamper-proofing ability.5. Embed the watermarks by means of obfuscating technology which makes the program more difficult to understand and the watermarks similar to the other codes.6. The data rate and resilience are paradoxical. The effectiveness of the program will be decreased if much more pieces are distributed in it. Also the scheme can't resist the distortive attack completely.We implemented the tamper-proofing scheme on a concrete platform. Chose two programs of medium size as our samples and made them attacked on the SandMark. We counted the bytecodes of the samples with the statistical module. It proved to be stealthy and robust.Finally, we summarized this article, pointed out our next work and predicted the future of the software watermarking technology. |