
Research On Multimodal Fake News Detection Based On Contrastive Learning

Posted on: 2024-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tang
Full Text: PDF
GTID: 2568306941988599
Subject: Electronic Science and Technology
Abstract/Summary:
Automatic fake news detection has broad application prospects in regulatory review, removal of violating content on platforms, content self-inspection by radio and television media, and early warning of major public opinion events, and it has significant research and social value. Fake news detection aims to identify false information by using automated techniques and machine learning algorithms to analyze news content and its propagation characteristics in social networks. It is usually formulated as a binary classification problem in which an online news article is labeled as either true or fake. Traditional research has focused mainly on fake news in text form. However, as mobile internet and social media have come to support multimedia content, the form of online fake news has changed. Fake news with pictures and videos is more attractive and appears more credible than plain text, and it spreads faster through social networks. At the same time, the multimodal content of news provides richer and more comprehensive information than a single modality, and the feature complementarity and semantic relevance between modalities provide new detection clues. Multimodal fake news detection has therefore become a research hotspot.

Multimodal fake news detection methods are content-based and mainly target the most common kind of fake news on social media, namely news with pictures. Existing research has two main problems: (1) Information extraction from the multimodal features of news is insufficient, especially for visual features, so visual information is not fully used in the final detection decision. (2) Little attention is paid to the relationship between news text and images (the semantic consistency between image and text), and the interpretability and generalization of detection models are poor. These problems limit the performance of multimodal fake news detection systems and hinder the practical application of detection algorithms. In response, this thesis carries out the following work:

(1) To address the weak multimodal feature extraction of existing detection methods, this thesis first analyzes the feature extraction backbone networks of existing methods, improves the visual feature extraction step, and proposes a Multimodal fake news Detection framework based on Vision Transformer (MDVT). MDVT exploits the ability of the Vision Transformer, compared with CNNs, to better learn the relationships within an image, which strengthens the model's multimodal feature extraction. The thesis also draws on recent advances in pretrained language models and uses variants of BERT to optimize text feature extraction. Through experiments on the MDVT framework with different multimodal vector fusion methods and different combinations of feature extractors, a preferred detection model, MDVT-MS, which uses MacBERT and Swin Transformer as feature encoders, was selected. In experiments on public datasets, MDVT-MS improves accuracy by 3.7% over CMC, an advanced baseline model from recent years, achieving competitive multimodal fake news detection performance.

(2) For computing the semantic consistency between news images and text, existing methods based on image-text relationships often use heterogeneous feature encoders on the visual and text sides and learn image-text relationships only from downstream task datasets. This makes it difficult for models to overcome the "semantic gap" between modalities and leads to inaccurate image-text similarity estimates. This thesis therefore introduces, for the first time in the fake news detection task, a Chinese vision-language pretrained model as the semantic feature encoder for news images, and proposes CLIP-MFD (CLIP for Multimodal Fake News Detection), a contrastive-learning-based model for measuring the consistency between news images and text. First, general knowledge of image-text association is obtained from the multimodal pretrained model. Then a cross-modal contrastive learning training strategy is designed to pull the image and text features of real news closer together and push the image and text features of fake news apart, linking images and natural language in a shared vector space. Experiments show that image-text consistency is an effective feature for multimodal fake news detection. The thesis further uses an ensemble learning algorithm to combine the MDVT-MS and CLIP-MFD models for fake news detection. The detection accuracy of the ensemble exceeds that of the CMC model by 4.0% and is higher than that of most recent methods, offering high accuracy together with interpretability.

(3) Building on the MDVT-MS and CLIP-MFD models, this thesis designs and implements an online multimodal fake news detection system using front-end and back-end frameworks such as Vue, Spring Boot, and Flask. The system's efficient and rapid detection functions verify the practical value of the research.
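The abstract does not give the loss used by CLIP-MFD, so the following is only a minimal sketch of the idea in item (2): pull matched image and text features of real news together and push those of fake news apart. It assumes a cosine-embedding-style margin loss as a stand-in, and the names `cosine`, `pair_loss`, `batch_loss`, and `margin` are hypothetical, not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_loss(image_vec, text_vec, is_real, margin=0.0):
    """Assumed loss for one (image, text) pair; not the thesis's formula.

    Real news: loss falls as the two features become more similar.
    Fake news: loss is zero once similarity is at or below `margin`.
    """
    sim = cosine(image_vec, text_vec)
    if is_real:
        return 1.0 - sim
    return max(0.0, sim - margin)


def batch_loss(pairs, margin=0.0):
    """Mean loss over an iterable of (image_vec, text_vec, is_real)."""
    losses = [pair_loss(img, txt, real, margin) for img, txt, real in pairs]
    return sum(losses) / len(losses)
```

In this sketch, a real pair with identical features and a fake pair with orthogonal features both incur zero loss, while a fake pair whose image and text features coincide is penalized. In practice the vectors would come from the pretrained image and text encoders rather than being written by hand.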
Keywords/Search Tags: artificial intelligence, fake news detection, multimodal learning, contrastive learning, ensemble learning