
Research On Mismatch Problems In Image Steganalysis Under The Network Environment

Posted on: 2018-04-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X D Hou
GTID: 1318330563951162
Subject: Information and Communication Engineering
Abstract/Summary:
As a new approach to ensuring information security, information hiding has received extensive attention for almost twenty years. Research on information hiding techniques, represented by steganography and steganalysis, meets a significant requirement of national information security and has profound academic significance and application value. So far, image steganalysis techniques have yielded substantial research results and show superior detection performance in the laboratory environment. However, existing steganalysis is almost always treated as a binary classification problem and rests on the assumption that the steganalyst has complete knowledge of the cover source and the steganographic scheme (i.e., the steganographic algorithm and payload size) used by the steganographer. In reality, such knowledge may be completely or partially unknown, and even the corresponding training samples may be scarce; these conditions inevitably lead to the so-called model mismatch problems, which reduce detection accuracy dramatically. Although several studies have attempted to reduce the classification error caused by model mismatch, the problems remain difficult to resolve fundamentally, which hinders the practical application of steganalysis in the network environment.

Consequently, this dissertation analyzes different combinations of the two dimensions of knowledge (cover source and steganographic scheme) and of whether the corresponding training samples are available, and proposes a series of methods suitable for image steganalysis in the network environment. The methods draw on outlier detection, clustering, similarity retrieval and feature selection, with the aim of resolving the model mismatch problems and moving steganalysis from the laboratory into the network environment. The main contributions of this dissertation can be summarized as follows:

1. In the paradigm with full knowledge of the cover source and no knowledge of the steganographic scheme used by the steganographer, the existing approaches cannot achieve good overall performance on both known (already existing) and unknown (previously unseen) steganographic algorithms. Motivated by this observation, we explore a universal blind steganalysis method based on a reference-points-based Local Outlier Factor (LOF) and Low-All sampling. First, aided stego images are generated with as many known steganographic algorithms as possible at varying payloads. Second, we compute the LOF scores of the aided stego sample points (feature vectors) with respect to the test sample points. Third, we choose the aided stego images with the lowest LOF scores as training stego images. Finally, we train a binary classifier on the cover images and the chosen training stego images and use it for testing. Experimental results confirm that the proposed approach performs significantly better than the existing approaches on both known and unknown stego algorithms.

2. In the paradigm with no knowledge of the cover source and no knowledge of the steganographic scheme used by the steganographer, a new unsupervised universal steganalysis framework, called Similarity Retrieval of Image Statistical Properties (SRISP)-aided unsupervised outlier detection, is proposed to detect individual stego images while avoiding model mismatch. First, several cover images with statistical properties similar to those of the given test image are retrieved from a retrieval database to form the aided cover samples. Second, unsupervised outlier detection is performed on a test set composed of the given test image and its aided cover samples to determine the type (cover or stego) of the given test image. To demonstrate the effectiveness of the framework, a bitmap compression history retrieval-aided unsupervised outlier detection method is presented for the steganalysis of heterogeneous bitmap images with different compression histories. The method employs a low-dimensional steganalytic feature set and three basic unsupervised outlier measures. Extensive experiments on six spatial stego algorithms show that the proposed framework has the following advantages: (1) it does not suffer from model mismatch, because no training is required; (2) it is universal in the sense that it can detect both existing and new steganographic algorithms; (3) the introduction of SRISP mitigates the effect of cover variation on existing steganalysis features; (4) it outperforms traditional unsupervised outlier detectors and the One-Class Support Vector Machine (OC-SVM), and is particularly robust to the proportion of stego images in the test samples.

3. To investigate whether the proposed SRISP-aided unsupervised outlier detection framework is compatible with high-dimensional steganalytic features, and how the degree of cover variation affects its performance, an image content retrieval-aided unsupervised outlier detection method is proposed for the steganalysis of raw uncompressed images with different texture complexity. First, several cover images with texture complexity similar to that of the given test image are retrieved from a retrieval database using 36-dimensional texture features to form the aided cover samples. Second, unsupervised outlier detection is performed on the given test image and its aided cover samples to determine the type (cover or stego) of the given test image. The method uses four steganalytic feature sets of different dimensions, two basic unsupervised outlier measures, and five measures designed specifically for high-dimensional data. Extensive experiments show that: (1) the greater the cover variation, the more pronounced the performance improvement of the proposed framework; (2) the proposed framework retains its universality while exhibiting reliable performance under small cover variation and with high-dimensional steganalytic features; (3) existing or new unsupervised outlier measures can also be applied within the proposed framework. In addition, we discuss the effect of the proportion of stego images in the retrieval database on the proposed framework, and a noise-image removal method is applied to the retrieval database in advance to make the framework practical.

4. We consider a particular paradigm of steganalysis in which (1) there is no knowledge of the steganographic scheme or the cover source, but (2) a small number of training samples with the same cover source and steganographic scheme as the test samples is available, and in which cover images always significantly outnumber stego ones. We call this paradigm Highly Imbalanced Steganalysis with Small Training samples (HISST). Researchers have rigorously studied sampling, learning algorithms and feature selection as approaches to the class imbalance problem, but such research is rare in the steganalysis domain. In particular, feature selection has rarely been studied outside of text classification and biological data analysis. We therefore evaluate eight feature selection metrics with three classification algorithms on four representative steganalytic feature sets. We found that (1) feature selection combined with the Fisher linear discriminant classifier alone can effectively overcome the HISST problem, even for very high-dimensional steganalytic feature sets; (2) on average, Fisher and rank correlation coefficient optimization is an ideal candidate for feature selection in terms of performance and optimal feature number when the steganalytic feature space is not very high-dimensional, whereas feature assessment by sliding thresholds is the best choice in extremely high-dimensional feature spaces. In addition, we present a systematic comparison of the three types of methods (i.e., sampling, learning algorithms and feature selection) and their combinations. We show that feature selection outperforms sampling techniques and new learning algorithms in most cases, and that combining the approaches does not yield further improvement. The most notable finding is that as the number of samples increases or the degree of imbalance decreases, feature selection gradually loses its advantage in imbalanced image steganalysis, and even performs worse than a linear SVM trained on all features; this trend is particularly pronounced for very high-dimensional steganalytic feature sets.

Finally, the dissertation is concluded and future research topics for image steganalysis in the network environment are outlined.
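
The sample-selection steps of contribution 1 (score the aided stego feature vectors against the test sample points, then keep the lowest-scoring ones as training stego samples) can be illustrated with a minimal sketch. The abstract does not give the exact definition of the reference-points-based LOF, so the code below assumes a standard LOF in which neighbours are drawn only from the reference (test) set. The function names `reference_lof` and `select_training_stego`, the neighbourhood size `k`, and the synthetic 2-D features are all hypothetical and serve illustration only.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def reference_lof(queries, refs, k=5):
    """LOF-style score of each query row, with neighbours taken only from refs.

    Assumption: this is the standard LOF formula restricted to a reference
    set; it is not claimed to be the dissertation's exact definition.
    """
    # Neighbourhood structure inside the reference set itself.
    d_rr = pairwise_dist(refs, refs)
    np.fill_diagonal(d_rr, np.inf)  # a point is not its own neighbour
    nn_rr = np.argsort(d_rr, axis=1)[:, :k]
    knn_d_rr = np.take_along_axis(d_rr, nn_rr, axis=1)
    k_dist = knn_d_rr[:, -1]  # k-distance of each reference point
    reach_rr = np.maximum(knn_d_rr, k_dist[nn_rr])
    lrd_ref = 1.0 / (reach_rr.mean(axis=1) + 1e-12)

    # Local reachability density of each query relative to the references.
    d_qr = pairwise_dist(queries, refs)
    nn_qr = np.argsort(d_qr, axis=1)[:, :k]
    knn_d_qr = np.take_along_axis(d_qr, nn_qr, axis=1)
    reach_qr = np.maximum(knn_d_qr, k_dist[nn_qr])
    lrd_q = 1.0 / (reach_qr.mean(axis=1) + 1e-12)

    # Ratio of the neighbours' density to the query's own density.
    return lrd_ref[nn_qr].mean(axis=1) / lrd_q


def select_training_stego(aided_stego, test_points, n_keep, k=5):
    """Return indices of the n_keep aided stego samples with the lowest scores."""
    scores = reference_lof(aided_stego, test_points, k=k)
    keep = np.argsort(scores)[:n_keep]
    return keep, scores


# Synthetic stand-ins for feature vectors (hypothetical data, not real features).
rng = np.random.default_rng(0)
test_points = rng.normal(0.0, 1.0, size=(50, 2))       # test sample points
stego_matching = rng.normal(0.0, 0.5, size=(20, 2))    # resembles the test set
stego_mismatched = rng.normal(10.0, 0.5, size=(20, 2)) # far from the test set
aided_stego = np.vstack([stego_matching, stego_mismatched])

keep, scores = select_training_stego(aided_stego, test_points, n_keep=20)
print("kept indices:", sorted(keep.tolist()))
```

In this toy setting the kept indices are the aided samples that lie close to the test distribution, which mirrors the idea of training only on stego samples that resemble the data under test. A binary classifier would then be trained on the cover samples and the selected stego samples.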
Keywords/Search Tags: Steganalysis, Universal Blind Steganalysis, Local Outlier Factor, Binary Classification, Similarity Retrieval, JPEG Compression, Unsupervised Outlier Detection, Image Content, Texture Analysis, Class Imbalance, Sampling, Learning Algorithm