| In today’s world, competition over key Internet resources and the international rules of cyberspace is becoming increasingly intense, and incidents that endanger information security occur from time to time. A network covert channel, a technology that uses normal carriers to transmit information secretly without being discovered by a third party, can easily bypass the security policy of an information system and evade detection by regional border protection equipment, directly causing the leakage of secret information from high-security networks. The network carries a large amount of streaming media and many types of VoIP-based applications. Moreover, as an interactive network streaming medium with multiple carriers, VoIP is instantaneous, dynamic, and random, and has gradually become an excellent steganographic carrier. With the continuous advancement of steganography, steganography integrated with speech compression coding has gradually become a representative approach in network steganography. Its high concealment greatly increases the difficulty of "online detection" and "offline analysis" and poses great challenges to the existing security framework. To deal with these challenges, many algorithms for detecting VoIP steganography based on compressed speech coding have been proposed, and good detection results have been achieved. However, the existing algorithms still face several challenges: 1. The modeling of steganographic carriers is not comprehensive enough, and the extracted features do not sufficiently reflect the changes in the carrier before and after steganography. 2. For steganographic samples of small size and low embedding rate, detection accuracy still has room for improvement and cannot meet the requirements of online detection. 3. The dimension of features extracted through feature engineering continues to increase, resulting in high complexity and low detection efficiency. 4. Most
of the mainstream algorithms are based on supervised learning and rely on a large number of labeled samples, so training costs are high and generalization ability is poor. In view of these challenges, this dissertation analyzes the principle, process, and engineering implementation of steganalysis based on carrier probability distribution and statistical structure. Furthermore, according to the different stages of VoIP speech coding and the different types of carriers, the correlation of coding parameters is analyzed, and feature vectors that effectively distinguish the changes in steganographic carriers are extracted from multiple dimensions. Combining machine learning and deep learning, three dedicated detection algorithms are implemented. The main work and innovations of this dissertation are as follows: 1. Steganalysis Model Based on Statistical Structure of VoIP Streaming Media. Because VoIP streaming media is widely used, carries large traffic, and lasts a long time, a covert channel can be maintained continuously even with small samples and a low embedding rate, which poses a great potential threat to network information security. Based on the distribution of VoIP streaming media, and starting from the carrier probability distribution distance and the carrier statistical structure, this dissertation proposes a detection theory and a practical method for VoIP steganography. The model describes the steganalysis method based on distribution distance through probability distribution theory, and proves the detectable range of steganalysis based on the Hellinger distance. Then, from a practical point of view, a general framework and method for steganalysis based on carrier statistical characteristics is proposed to guide algorithm design and to deal effectively with the challenges brought by carrier fragmentation and low-embedding-rate steganography. 2. AMR Steganalysis Algorithm Based on Multi-domain Information Fusion. Traditional machine
learning-based steganalysis methods for compressed speech have achieved great success. However, existing methods face a dilemma between modeling speech carriers effectively and extracting high-dimensional features, which leads to increased detection costs and reduced efficiency. To deal with these challenges, this dissertation proposes a steganalysis method for compressed speech based on multi-domain information fusion (named MDoIF). Targeting the distribution characteristics of the fixed codebook indexes of different pulse tracks, multi-domain feature vectors that effectively distinguish the changes in steganographic carriers are extracted. By collecting the information of these features before and after steganography, a dedicated steganalysis algorithm for AMR steganography is realized. At the same time, MDoIF adopts feature selection based on information-theoretic measures to transform high-dimensional features into low-dimensional features, which is shown to significantly improve the performance of MDoIF. Experimental results show that even for speech samples with a length of 400 ms, MDoIF can effectively detect Geiser’s steganography at different embedding rates, and outperforms the state-of-the-art (SOTA) method by up to 11.26%. 3. QIM Steganalysis Algorithm Based on Hierarchical Attention Network. With in-depth research on QIM steganalysis in communication security, steganalysis algorithms based on machine learning have achieved great success; however, they still face the following challenges: (1) The increase in feature dimensions leads to increasing computational complexity. (2) Existing QIM steganalysis techniques still leave room for improvement for small samples and low embedding rates. To address these challenges, this dissertation uses a Bayesian network to measure the correlation of the quantized index codewords, and gives a method for calculating the correlation strength. Furthermore, a layered attention network oriented to codeword
correlation is proposed (named F3SNet), which automatically extracts complex correlations between quantized index sequences with the help of the layered attention mechanism and achieves efficient detection for short samples and low-embedding-rate steganography. Moreover, by mapping the quantized index codewords to low-dimensional vector spaces that can represent rich spatio-temporal features, F3SNet avoids the explosion of the high-dimensional feature space. Experimental results show that even for speech samples with a length of 1,000 s, F3SNet can effectively detect QIM steganography at an embedding rate of 10% and outperforms FCEM by an average of 5.27%. 4. QIM Steganalysis Algorithm Based on Semi-supervised Learning. Because existing deep learning-based steganalysis algorithms rely on a large number of labeled samples and generalize poorly, and given the weak-signal characteristics of VoIP steganography, this dissertation proposes a novel semi-supervised detection algorithm named SSLadNet, which combines the unsupervised learning of a denoising autoencoder with the supervised learning of feedforward coding. SSLadNet uses the denoising autoencoder to generate latent variables and tries to preserve the underlying details needed to reconstruct the samples. This assists the supervised learning part in obtaining top-level abstract, invariant, and discriminative classification features. At the same time, a recurrent neural network, which models temporal sequences such as speech well, is introduced into the ladder network to achieve efficient detection of QIM steganography with a limited number of labeled samples. Experimental results show that even with only 512 labeled samples, SSLadNet achieves a detection accuracy of 96.09% for 1,000 ms samples at a 100% embedding rate, and outperforms the SOTA method based on semi-supervised learning. |
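
The distribution-distance idea behind the first contribution can be illustrated with a minimal sketch. It is not the dissertation's implementation: the helper names `empirical_distribution` and `hellinger_distance`, the four-symbol alphabet, and the toy codeword sequences are all assumptions made for illustration. The sketch estimates the empirical distribution of codeword indexes in a cover stream and in a suspected stego stream, then computes the Hellinger distance between the two; a larger distance suggests the carrier statistics have been disturbed more.

```python
import math
from collections import Counter


def empirical_distribution(symbols, alphabet_size):
    """Relative frequency of each index 0..alphabet_size-1 in `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return [counts.get(k, 0) / total for k in range(alphabet_size)]


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions of equal length.

    H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which lies in [0, 1].
    """
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Toy codeword-index sequences (hypothetical data, not from the dissertation).
cover = [0, 0, 1, 1, 2, 2, 3, 3]
stego = [0, 1, 1, 1, 2, 3, 3, 3]

p_cover = empirical_distribution(cover, alphabet_size=4)
p_stego = empirical_distribution(stego, alphabet_size=4)
distance = hellinger_distance(p_cover, p_stego)
print(f"Hellinger distance between cover and stego: {distance:.4f}")
```

With short samples or low embedding rates the two empirical distributions move closer together, so the distance shrinks, which is consistent with the abstract's point that such cases are the hardest to detect.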