| With the rapid advance of the information revolution, the course and pace of social development are being profoundly affected, the ways people work and live are changing across the board, and online social media has come to pervade daily life. However, the ease and low cost of disseminating information have led to a flood of rumors on social media. The proliferation of rumors disrupts people's daily lives, damages the credibility of social media platforms, and causes public panic. The social impact of rumor spreading and the research value of rumor detection have therefore attracted great attention from the academic community. Although rumor detection technology has seen breakthroughs in recent years, the field still faces many challenges. First, existing rumor detection datasets are small and outdated, and no longer meet the needs of current research. Second, current rumor detection methods use existing data inefficiently and fail to exploit its full value. Finally, unsupervised methods for rumor detection have received little study and perform poorly, so improving unsupervised rumor detection is also a problem worth studying. This thesis addresses each of these difficulties and obtains the following results: 1. A Dataset of Chinese Weibo Rumors: This thesis collects rumor and non-rumor data from the Chinese social media platform "Weibo" and proposes a new Chinese Weibo Rumor Dataset (CWRD). The dataset contains 26,176 rumor tweets and 35,429 non-rumor tweets, for a total of 61,605 tweets. The subsequent rumor detection experiments in this thesis are conducted on this dataset. 2. A Supervised Algorithm Based on Data Augmentation and User Modeling: This thesis proposes a supervised rumor detection algorithm based on data augmentation and user modeling. Augmented data are generated with three methods: word replacement, back-translation, and a similar-sentence generation model. In addition, based on the characteristics of rumor data, this thesis models user information and fuses the textual information of each rumor with the modeled user information. Experiments show that this method improves the detection performance of the rumor detection model and enhances its generalization and robustness. 3. An Unsupervised Clustering Algorithm Based on Contrastive Learning: This thesis explores unsupervised rumor detection and proposes a new end-to-end clustering algorithm for it. Following the basic idea of contrastive learning, the algorithm uses text data augmentation to construct the positive and negative sample pairs that contrastive learning requires, and uses contrastive learning to optimize the representation vectors of rumor texts. Stronger unsupervised rumor detection is then achieved by jointly training the contrastive learning and clustering objectives. Experiments show that the unsupervised rumor detection method proposed in this thesis achieves improved detection results. |
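The contrastive step summarized in the third contribution can be illustrated with a minimal sketch. The abstract does not name the loss it uses, so the NT-Xent (normalized temperature-scaled cross-entropy) loss below is an assumption, chosen only because it is a common contrastive objective. The function names, the temperature value, and the toy two-dimensional embeddings are all hypothetical. Two augmented views of the same text form a positive pair, and every other view in the batch acts as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss over a batch of paired embeddings (an assumed objective).

    views_a[i] and views_b[i] are embeddings of two augmentations of text i.
    """
    n = len(views_a)
    embeddings = views_a + views_b  # 2n vectors in total
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same text
        # Similarities to every other vector, the positive included.
        logits = [
            cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n)
            if j != i
        ]
        pos_logit = cosine(embeddings[i], embeddings[positive]) / temperature
        # Negative log-softmax of the positive pair.
        total += math.log(sum(math.exp(x) for x in logits)) - pos_logit
    return total / (2 * n)


# Toy embeddings (hypothetical): two texts, two augmented views each.
views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b = [[0.9, 0.1], [0.1, 0.9]]        # views agree with views_a
mismatched_b = [[0.1, 0.9], [0.9, 0.1]]   # views paired with the wrong text

print(nt_xent_loss(views_a, views_b))
print(nt_xent_loss(views_a, mismatched_b))
```

The loss is lower when the two views of each text are close to each other and far from the views of other texts. In a joint setup like the one described, a term of this kind would be combined with a clustering objective, and how the two are weighted is not stated in the abstract.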