Research On Linguistic Steganography And Steganalysis

Posted on: 2013-07-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Meng
GTID: 1228330377951665
Subject: Computer software and theory
Abstract/Summary:
Information hiding (IH) is both an ancient technology and a young scientific field. In ancient Greece, people already used information hiding to transmit secret messages in wartime. In ancient China there are also many records of information hiding being used for communication, such as acrostics. However, information hiding was known to and studied by very few people until the end of the 20th century, when the Internet came into wide use.

The popularity of the Internet has made it far easier for people to obtain and distribute digital content, but it has also brought new challenges. How can the copyright of Internet content be protected? How can one detect whether digital content has been tampered with? How can a message be transferred over the Internet without being detected? How can terrorists be prevented from transmitting secret commands over the Internet? These questions have led people to revisit and study information hiding. With information hiding, one can embed author information in digital content to protect its copyright, embed control messages in digital content to detect whether it has been tampered with, or embed a secret message in a cover image, text or video so that it can be transmitted over the Internet without being detected. With information hiding detection technology, one can find out who is using information hiding on the Internet.

The relationship between information hiding and its detection is like the relationship between encryption and decryption: the two compete with each other, and also drive each other forward. When many people use information hiding, detection methods are researched; when most information hiding techniques can be detected, more secure ones are designed.

Information hiding techniques can be classified in many ways. By type of carrier, they can be divided into image-based, audio-based, video-based and text-based methods. Text-based information hiding can be further divided into several kinds, such as format-based methods, font-based methods, methods based on character spacing or line spacing, and natural language methods. Because text documents are widely used on the Internet, research on information hiding in text documents and on its detection is very important for national security and social stability.

This dissertation studies natural language information hiding and detection methods. Several detection methods are designed first; then, based on the results of that work, several methods for enhancing the security of information hiding systems are given. The concrete research results are as follows:

1) A steganalysis method based on a statistical natural language model is designed. The method can detect many text information hiding systems, such as NICETEXT, TEXTO and Markov-based methods. Experimental results show that it can detect much smaller texts than previous methods, and that its detection accuracy is about 10% higher than that of previous methods.
2) Translation-Based Steganography (TBS) is a new and representative kind of text information hiding. A detection method for TBS is designed. The method needs to know some information about the TBS system, such as the language pair and the set of machine translators. The effectiveness of the detection method is analyzed theoretically with a mathematical method, and the detection process and experimental results are presented. Both the experimental results and the theoretical analysis show that the method can distinguish among stego-text, natural language text and machine-translated text.
3) A blind method for Steganalysis of Translation-Based Steganography (STBS) is given. STBS does not need to know any information about the TBS system. First, the differences in word and N-gram frequencies between normal text and stego-text are shown. Then a preprocessor is designed that refines the given texts to enlarge these frequency differences. 12-dimensional feature vectors that are sensitive to frequency are derived from the refined texts. Finally, an SVM classifier is used to classify the given texts into normal texts and stego-texts. A series of experiments demonstrates the performance of STBS.
4) A novel Translation-Based Steganography (NTBS) method is presented that is resilient against current statistical attacks. An upper bound on the accuracy of classifying normal translated text versus stego-text is estimated by building a mathematical model. When the text size is 1000 sentences, the maximum classification accuracy is about 59%. Experimental results also show that current steganalysis methods cannot detect NTBS.
5) A hash-based text information hiding method named HashHide is designed. With HashHide, the sender and the receiver only need to share a secret key before transmitting a secret message. HashHide significantly reduces the amount of information shared between the sender and the receiver, which improves the security of the information hiding system. The embedding rate of HashHide can reach 90% of the theoretical maximum embedding rate.
6) A novel Chinese text steganography method based on character forms is proposed, and three embedding methods are given. The embedding rate and efficiency of the embedding methods are analyzed and compared. A security analysis and methods for increasing the security of the embedding methods are also given.

Among the achievements described above, Point 1) is a general text steganography detection method that improves on the detection accuracy of current methods; it is a methodological innovation. Points 2) and 3) are two detection methods for TBS. If some information about the TBS system is known, the method of Point 2) can be used, and its detection accuracy is very good; if nothing is known about the TBS system, the method of Point 3) can be used, and its detection accuracy is still acceptable. Points 2) and 3) are application innovations. Point 4) is a more secure TBS, named NTBS, whose security is demonstrated by a mathematical analysis and by experiments; it is a theoretical innovation. Point 5) is a general data embedding method that can be used widely in text steganography. Point 6) is a simple and practical steganography method for Chinese text; it is an application innovation.
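
The abstract does not describe how HashHide works internally, so the following is only a minimal sketch of the general idea of hash-based text embedding in which sender and receiver share nothing but a secret key. It is not the dissertation's algorithm. It assumes the sender has, for each position, a set of interchangeable candidate sentences (for example paraphrases), and selects one whose keyed hash equals the next secret bits. The names `keyed_bits`, `embed`, `extract` and `candidate_sets` are hypothetical, and the use of HMAC-SHA256 is an assumption made for illustration.

```python
import hashlib
import hmac


def keyed_bits(key: bytes, text: str, nbits: int) -> int:
    """Map a text unit to an nbits-wide integer using HMAC-SHA256 (assumed hash)."""
    digest = hmac.new(key, text.encode("utf-8"), hashlib.sha256).digest()
    # Use the first 4 bytes of the digest, so nbits must be at most 32.
    return int.from_bytes(digest[:4], "big") % (1 << nbits)


def embed(key: bytes, secret_bits: str, candidate_sets: list, nbits: int = 1) -> list:
    """Pick, at each position, a candidate whose keyed hash equals the next nbits.

    secret_bits is a string of '0'/'1' characters; candidate_sets holds one
    list of interchangeable sentences per position (hypothetical input).
    """
    if len(secret_bits) != nbits * len(candidate_sets):
        raise ValueError("need exactly nbits secret bits per candidate set")
    stego = []
    for i, candidates in enumerate(candidate_sets):
        target = int(secret_bits[i * nbits:(i + 1) * nbits], 2)
        match = next(
            (c for c in candidates if keyed_bits(key, c, nbits) == target), None
        )
        if match is None:
            # No candidate hashes to the required value; more candidates are needed.
            raise ValueError(f"no candidate at position {i} encodes the target bits")
        stego.append(match)
    return stego


def extract(key: bytes, stego: list, nbits: int = 1) -> str:
    """Recover the bit string by re-hashing each received sentence with the key."""
    return "".join(format(keyed_bits(key, s, nbits), f"0{nbits}b") for s in stego)
```

In this sketch the receiver needs only the key and `nbits`: calling `extract(key, stego)` on the received sentences returns the embedded bit string, with no shared dictionary or codebook. The trade-off is that the sender needs enough candidates per position for a matching hash to exist, which grows with `nbits`.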
Keywords/Search Tags: Information Hiding, Translation-Based Steganography (TBS), Linguistic Steganalysis, Linguistic Steganography, Machine Translation, Text Information Hiding