
Word Shift Steganography And Its Detection Algorithm

Posted on: 2010-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: L J Li
Full Text: PDF
GTID: 2178360302959671
Subject: Information security
Abstract/Summary:
The rapid development of the Internet has increased the demand for secret communication. People want to protect not only the content of a communication but also the fact that it is taking place, so that attackers cannot tell that secret communication exists at all. This is the goal of steganography, which has become a hot topic in information security.

Previous work on steganography focused on methods that use image or audio files as carriers. However, such large files are inconvenient to transfer over the Internet and may arouse an attacker's suspicion. Although a text file has low redundancy, which limits how much secret information (usually a bit string) it can hide, its small size makes it an ideal steganographic carrier in the Internet environment. More and more institutes and researchers have noticed this advantage of text files and are putting more effort into research on text steganography.

Among text steganographic algorithms (i.e., those that use text as the steganographic carrier), format-based text steganography is easy to implement, computationally cheap, and offers a large steganographic capacity. It is therefore the most widely used kind of text steganography. In contrast to the rapid development of steganography, its analysis (i.e., steganalysis) is still in its infancy for text, because of the complexity of the characteristics underlying text documents. This paper studies in depth a typical format-based method, word shift steganography, and proposes two different detection methods. Because PDF documents are the most common text carriers on the Internet, this paper uses PDF documents as steganographic carriers.

First, we present a steganalysis of a specific word shift algorithm.
Based on a detailed analysis of word spacing in PDF documents, we propose the concept of "neighbor difference", study the distribution of neighbor differences in texts, and construct an estimated distribution of neighbor differences for stego texts. Finally, we use a chi-square test to judge whether an unknown text is a stego text or a natural text.

Extending this idea, we devise a blind method that detects all kinds of word shift steganography. The greatest difficulty in blind detection is that different word shift algorithms change different parts of a text's characteristics, so it is hard to find a "universal" statistic that is sensitive to every kind of space modification. Nevertheless, blind detection is the most useful of the various detection methods. Our blind detection method can determine whether the words of a text have been maliciously shifted without knowing the specific steganographic algorithm. We identify statistics that reflect the characteristics of the word spaces in a text and collect them into a feature vector. Our main idea is to use a support vector machine (SVM) to learn the difference between the features of natural texts and those of stego texts; an unknown text is then labeled as natural or stego using that knowledge. We not only make use of the concept of "neighbor difference" mentioned above, but also propose the new concept of "environment equal". All of the features we select approximately describe the changes in the word spaces of a text.

Experiments demonstrate the good performance of our detection methods. The attack against the specific algorithm detects 95% of stego texts when the embedding rate is greater than 5%. The detection rate of our blind method can reach 93% without knowing which word shift algorithm is used. All of the text documents used in the experiments were randomly downloaded from the Internet and include scientific papers, technical manuals, fiction, blogs, and so on.

Word shift steganography is one of the most typical forms of format-based steganography.
The work in this paper also contributes to other text steganalysis: its ideas and techniques can help solve problems in detecting other format-based steganography. At the same time, designers of text steganography can draw lessons from this paper's analysis to improve the security and secrecy of their algorithms.
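
To make the chi-square idea in the abstract more concrete, here is a minimal Python sketch. It is not the method of the thesis. The abstract does not define "neighbor difference", so the sketch assumes it means the difference between adjacent word-space widths. The toy embedding, the two-bin histogram, the function names, and the reference proportions `natural_probs` are all hypothetical. The sketch also compares a text against an assumed natural-text reference, whereas the abstract describes constructing an estimated distribution for stego texts.

```python
# Hypothetical sketch: the definition of "neighbor difference", the toy
# embedding, and the reference proportions are assumptions for illustration.

CHI2_CRITICAL_DF1_5PCT = 3.841  # chi-square critical value, 1 degree of freedom, 5% level


def neighbor_differences(spaces, precision=2):
    """Assumed definition: difference between each pair of adjacent word-space widths."""
    return [round(b - a, precision) for a, b in zip(spaces, spaces[1:])]


def toy_word_shift_embed(spaces, bits, delta=0.3):
    """Toy embedding (not the thesis's algorithm): for a 1 bit, shift one word to the
    right by delta, widening the space before it and narrowing the space after it."""
    stego = list(spaces)
    for j, bit in enumerate(bits):
        left, right = 2 * j, 2 * j + 1  # spaces on either side of word 2j+1
        if right >= len(stego):
            break  # carrier exhausted
        if bit:
            stego[left] += delta
            stego[right] -= delta
    return stego


def zero_nonzero_counts(diffs):
    """Two-bin histogram: number of zero and nonzero neighbor differences."""
    zero = sum(1 for d in diffs if d == 0)
    return [zero, len(diffs) - zero]


def chi_square_statistic(observed, expected_probs):
    """Pearson chi-square statistic of observed counts against expected proportions."""
    n = sum(observed)
    if n == 0:
        return 0.0  # nothing to test
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_probs))


def looks_shifted(spaces, natural_probs=(0.9, 0.1)):
    """Flag a text whose histogram deviates significantly from the assumed
    natural-text reference (natural_probs is an illustrative guess, not measured)."""
    counts = zero_nonzero_counts(neighbor_differences(spaces))
    return chi_square_statistic(counts, natural_probs) > CHI2_CRITICAL_DF1_5PCT


if __name__ == "__main__":
    cover = [3.0] * 10 + [3.2] * 10 + [2.9] * 10 + [3.1] * 11  # made-up space widths
    stego = toy_word_shift_embed(cover, [1, 0, 1, 1, 0, 1, 0, 1])
    print(looks_shifted(cover), looks_shifted(stego))
```

In practice the reference distribution would be estimated from real documents, and the histogram would likely need more than two bins. The SVM-based blind detector is not sketched here because the abstract does not specify its features.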
Keywords/Search Tags: information hiding, steganography, text steganography, word-shift steganography, steganalysis, chi-square test, blind detection, neighbor difference, environment equal, PDF documents, text format