
Research Of Rumor Detection Based On Bidirectional Pre-trained Language Model

Posted on: 2022-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Xian
Full Text: PDF
GTID: 2518306338466984
Subject: Computer Science and Technology
Abstract/Summary:
With the popularity of online social networking, online rumors can have a substantial impact on society, so accurately identifying rumors on online social platforms is particularly important for maintaining social order. At present, network rumors are mainly detected with traditional machine learning methods or deep learning models. These approaches suffer from the misleading directivity of comments and from incomplete features caused by excessively long input sequences, so rumor detection has yet to reach a relatively high accuracy. This thesis proposes corresponding improvements for network rumor detection, and fine-tunes pre-trained models to speed up convergence and, to some extent, address real-time requirements. The specific results are as follows:

1. To address nested comments with misleading directivity under hot topics, this thesis proposes a method that separates and redirects nested social comments and corrects comment samples. Nested comments are separated and redirected using the idea of the union-find (disjoint-set) data structure, and samples are corrected through comment filtering. Experiments show that, compared with directly combining the original topic with its social comments, the new method constructs text features with less noise.

2. To address incomplete model input features caused by excessively long input sequences, this thesis proposes an interception method with a context-dependent step size, a sentiment-analysis-based strategy for filtering social comment data, a sentence vector representation combining POS tagging with TF-IDF weighted averaging, and a sentence vector representation based on the BERT pre-trained model. Several groups of comparative experiments show that the proposed methods construct more complete features and achieve better classification performance than directly splicing the original topic with the social comment data.

3. Building on the idea of pre-training, this thesis improves the random word masking algorithm of BERT's pre-training task to increase its effectiveness and stability. In addition, a new pre-training task is designed that lets the model measure sentence fluency, so that BERT can better capture the contextual semantics. Experiments show that the accuracy of the improved model is 1.5% higher than that of the original model.
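The comment separation described in the first result hinges on the union-find (disjoint-set) idea: uniting each reply with the comment it responds to lets every nested reply be redirected to the root post of its thread. The thesis does not publish its implementation; the following is a minimal sketch under that reading, with all names (`DisjointSet`, `redirect_comments`) hypothetical.

```python
class DisjointSet:
    """Union-find with path halving, keyed by arbitrary comment IDs."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving: point x at its grandparent as we walk up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def redirect_comments(comments):
    """Group (comment_id, reply_to) pairs into threads.

    Each reply is united with its parent, so nested replies are
    redirected to the root post of their thread; reply_to is None
    for a root post.
    """
    ds = DisjointSet()
    for cid, reply_to in comments:
        if reply_to is None:
            ds.find(cid)          # register a root post
        else:
            ds.union(cid, reply_to)
    threads = {}
    for cid, _ in comments:
        threads.setdefault(ds.find(cid), []).append(cid)
    return threads
```

With threads separated this way, each group of comments can be paired with its own root topic instead of being spliced onto an unrelated hot topic.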
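The second result combines POS tagging with TF-IDF weighted averaging to build sentence vectors. The thesis does not give the exact formula or POS weights, so the sketch below is illustrative: `POS_WEIGHTS`, the smoothing in the IDF term, and the function names are all assumptions.

```python
import math
from collections import Counter

# Hypothetical POS weights: content words count more than function words.
# The thesis does not publish its weighting scheme; these values are illustrative.
POS_WEIGHTS = {"NOUN": 1.5, "VERB": 1.2, "ADJ": 1.1, "OTHER": 0.5}


def tfidf_weights(corpus):
    """Per-document TF-IDF over a corpus of [(token, pos), ...] documents."""
    n = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update({tok for tok, _ in doc})
    all_scores = []
    for doc in corpus:
        tf = Counter(tok for tok, _ in doc)
        all_scores.append({
            tok: (cnt / len(doc)) * (math.log(n / df[tok]) + 1.0)
            for tok, cnt in tf.items()
        })
    return all_scores


def sentence_vector(doc, scores, embeddings, dim):
    """Weighted average of word vectors; weight = TF-IDF * POS weight."""
    vec, total = [0.0] * dim, 0.0
    for tok, pos in doc:
        w = scores.get(tok, 0.0) * POS_WEIGHTS.get(pos, POS_WEIGHTS["OTHER"])
        emb = embeddings.get(tok)
        if emb is None or w == 0.0:
            continue
        vec = [v + w * e for v, e in zip(vec, emb)]
        total += w
    return [v / total for v in vec] if total > 0 else vec
```

Because frequent, uninformative tokens get low TF-IDF scores and function words get low POS weights, the averaged vector is dominated by the content words of the comment.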
Keywords/Search Tags:Rumor detection, Neural networks, BERT, Pre-training