In the Internet era, online social media and other multimedia platforms have replaced traditional newspapers and television as the primary means of information dissemination. These platforms host large numbers of users who can freely publish information, greatly increasing the amount of information available online. The convenience of the Internet means that people can access this information at any time, and the social attributes of online media give it a reach and influence far greater than traditional channels. Unfortunately, online social media platforms are not always responsible for the accuracy of the information their users publish, allowing fake news to spread rapidly and cause serious harm. With advances in network transmission technology, the amount of multimodal fake news containing both text and images is increasing. Studies have shown that multimodal fake news spreads faster on social media than text-only fake news, indicating a wider negative impact. It is therefore crucial to detect multimodal fake news in order to reduce the harm it causes. Current multimodal fake news detection methods focus on detecting the relation between images and texts by extracting image and text features with Recurrent Neural Networks (RNNs) or Convolutional Neural Networks (CNNs). However, these approaches do not utilize and combine text and image features optimally: they underutilize the statistical features of text and images and corrupt shallow image features. They are also limited in feature fusion, as they do not consider text embedded in images (Optical Character Recognition (OCR) text), the original content text, or the correlation between images and image quality features. To resolve these issues, this thesis proposes the following approaches:

(1) An Information Extension based Multi-Modal
Fake News Detection Method (IEMM) is proposed to study the correlation between images and news texts. Starting from the observation that fake news images tend to be more visually appealing, IEMM extends the multimodal feature information by extracting shallow statistical features from images and texts, and investigates the relationship between the fused image features and the news text by fusing them with the semantic information of the original text and images. The method consists of five parts: a visual feature representation part, a news text feature representation part, a lexical feature representation part, a multimodal relevance feature representation part, and a feature fusion part. The IEMM-based detection method reflects the characteristics of fake news in text and images to a certain extent, fuses image statistical features with semantic information to compensate for the impairment of shallow VGG features, further determines the correlation between image and text, and finally combines the information of the news itself to detect fake news. Experimental results demonstrate the effectiveness of the method.

(2) A Multimodal Fusion Framework for Fake News Detection via Multi-Attention Mechanism (MFMA) is proposed, which considers not only the semantic and relevance features of the images, content text, and OCR text in the news, but also the multimodal fusion features among them. Specifically, MFMA consists of several modules: pre-trained CNN models extract the semantic and quality features of images; a bi-directional long short-term memory network (Bi-LSTM) and the Attend module of decomposable attention extract the semantic and relevance features of the texts (news content and OCR text); and co-attention generates multimodal fusion features between images and text. Experimental results on real-world datasets show that the MFMA framework outperforms other methods.
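The information-extension idea in IEMM can be illustrated with a minimal sketch: shallow statistical features are computed from the raw image and text and concatenated with semantic embeddings. The specific statistics and embedding dimensions here are illustrative assumptions, not the thesis's exact feature set.

```python
import numpy as np

def image_stat_features(img):
    """Shallow statistics of an image array (H, W, C):
    per-channel mean, per-channel std, and overall contrast.
    (Illustrative choices; the actual IEMM features may differ.)"""
    means = img.mean(axis=(0, 1))          # (C,)
    stds = img.std(axis=(0, 1))            # (C,)
    contrast = img.max() - img.min()       # scalar
    return np.concatenate([means, stds, [contrast]])

def text_stat_features(text):
    """Shallow statistics of a news text: character length,
    word count, and exclamation/question mark counts."""
    words = text.split()
    return np.array([len(text), len(words),
                     text.count("!"), text.count("?")], dtype=float)

def extend_features(sem_img, sem_txt, img, text):
    """Extend semantic embeddings (e.g. from VGG and a text encoder)
    with shallow statistical features by concatenation."""
    return np.concatenate([sem_img, sem_txt,
                           image_stat_features(img),
                           text_stat_features(text)])

# Example with random stand-ins for the semantic embeddings.
rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
fused = extend_features(rng.random(128), rng.random(128),
                        img, "Breaking news! Really?")
```

The fused vector would then feed the feature-fusion part of the model alongside the multimodal relevance features.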
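The co-attention step used by MFMA to produce multimodal fusion features can be sketched as a bidirectional scaled-dot-product attention between text token features and image region features. This is a generic co-attention sketch under assumed shapes, not the exact MFMA module.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(T, V):
    """Co-attention between text features T (n_tokens, d) and image
    region features V (n_regions, d). Returns an image-aware text
    representation and a text-aware image representation."""
    d = T.shape[1]
    affinity = T @ V.T / np.sqrt(d)        # (n_tokens, n_regions)
    attn_t2v = softmax(affinity, axis=1)   # each token attends over regions
    attn_v2t = softmax(affinity.T, axis=1) # each region attends over tokens
    T_ctx = attn_t2v @ V                   # (n_tokens, d)
    V_ctx = attn_v2t @ T                   # (n_regions, d)
    return T_ctx, V_ctx

# Example: 5 text tokens, 7 image regions, 16-dim features.
rng = np.random.default_rng(1)
T_ctx, V_ctx = co_attention(rng.random((5, 16)), rng.random((7, 16)))
```

In the full framework, the same mechanism would also pair the OCR text features with the image and content-text features before final fusion and classification.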