The rapid development of modern mobile devices has made digital images easy to generate and transmit. Meanwhile, image editing software is simple to operate, so anyone can modify a picture. People generally modify images for beautification or entertainment. However, some forged images may be abused maliciously, with negative effects on society and the country. Detecting image manipulation has therefore become increasingly important. This thesis focuses on image splicing forgery detection, where image splicing is defined as copying a part of one image and pasting it into another image to form a new image.

Currently, there are two main approaches to detecting tampered areas in images: detection methods based on traditional feature extraction and CNN-based detection methods. Traditional methods can detect only one specific image fingerprint, so detection fails when that fingerprint is absent or not obvious. CNN-based methods can extract multiple image fingerprints simultaneously, which makes up for the shortcomings of traditional methods: reliance on a single image attribute, lack of generalization ability, and poor robustness. However, because convolution operations are inherently local, it is difficult for CNN-based methods to learn explicit global semantic relations and to combine local and global features. As a result, most CNN-based detection methods can handle only limited scale variation. In addition, these methods may suffer from incomplete localization or high false detection rates when locating large-scale tampered areas.

The main research contents of this thesis are as follows.

(1) An image splicing forgery detection algorithm based on Res DU-Net is proposed. This algorithm addresses the problem that U-Net semantic segmentation networks lack multi-scale semantic information when locating tampered
regions of different scales. This thesis makes two improvements. First, Res DU-Net replaces ordinary convolutional blocks with residual blocks, which alleviates the gradient degradation problem in deeper networks and strengthens feature extraction. Second, Res DU-Net mixes dilated convolutions with different dilation rates, so that different receptive fields capture more semantic information at different scales and tampered regions of different scales are located more accurately. Dilated convolution also reduces the loss of spatial information caused by pooling operations.

(2) An image splicing forgery detection algorithm based on a U-shaped hybrid Transformer network is proposed. This algorithm addresses the problems that CNN-based detection methods may encounter when locating large-scale tampered areas, such as incomplete localization or high false detection rates. Building on the Res DU-Net architecture, a two-layer nested deep U-Net architecture is designed and combined with a Transformer. It brings together the advantages of convolution and self-attention for image splicing forgery detection and localization. First, in the encoding process, receptive fields of different sizes are mixed to extract features of the tampered image. Second, the self-attention module explicitly models the complete context information through global interaction between the semantic features at the end of the encoder. In addition, this thesis designs a novel cross-attention module in the skip connections, which enhances the low-level feature maps under the guidance of high-level semantic information and filters out non-semantic features, thereby recovering more detailed spatial information in the decoder and improving the accuracy of the prediction results. Finally, the features output by the attention modules are fed into the decoder. In the decoding phase, the learned features are used to estimate the final tamper mask. Through this design, the final tamper mask can draw on local and
global information at the same time. Extensive experiments show that the algorithm outperforms existing methods on two common image tampering datasets, CASIA 2.0 and Columbia.
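
To illustrate why mixing dilation rates enlarges the receptive field, the following minimal sketch computes the receptive field of a stack of stride-1 convolutions using the standard formula k_eff = k + (k - 1)(d - 1). The kernel size of 3 and the dilation rates 1, 2 and 3 are assumptions chosen for illustration; the abstract does not state the rates used in Res DU-Net, and the function names are hypothetical.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    Each layer widens the field by (effective kernel - 1) pixels.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel(kernel_size, dilation) - 1
    return field


# Assumed configurations, for illustration only.
plain = [(3, 1), (3, 1), (3, 1)]   # three ordinary 3x3 convolutions
mixed = [(3, 1), (3, 2), (3, 3)]   # three 3x3 convolutions with mixed dilation

print(receptive_field(plain))  # 7
print(receptive_field(mixed))  # 13
```

With the same number of layers and parameters, the mixed stack sees a 13-pixel span instead of 7, which is the effect the abstract relies on to capture tampered regions of different scales without extra pooling.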
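
The self-attention and cross-attention modules mentioned above both rest on scaled dot-product attention. The sketch below is a generic, dependency-free version and not the thesis's actual module: it omits the learned query, key and value projections, and the function names and toy feature vectors are assumptions. Self-attention passes the same tokens as queries, keys and values; a cross-attention variant could take queries from low-level features and keys and values from high-level features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on lists of feature vectors.

    Every query attends to every key, so each output is a weighted
    average of all value vectors (global interaction).
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy tokens, for illustration only.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
self_attended = attention(tokens, tokens, tokens)      # self-attention

low_level = [[0.5, 0.5]]                               # hypothetical skip feature
cross_attended = attention(low_level, tokens, tokens)  # cross-attention
```

Because every output mixes information from all positions, this operation supplies the global context that the local convolution kernels cannot model on their own.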