
Research On The Improved TextCNN Garbage Barrage Recognition And Filtering Algorithm Combined With AdaBERT

Posted on: 2022-03-30  Degree: Master  Type: Thesis
Country: China  Candidate: R A Sun  Full Text: PDF
GTID: 2518306548961039  Subject: Engineering
Abstract/Summary:
With the rapid development of the Internet, more and more forms of entertainment have come into view. The "barrage", a new form of expression, is popular among young people because of its novelty. A barrage is a comment that is displayed over a video while the video plays. As barrages have grown in use, junk barrages have become a problem because they degrade the viewing experience. So far, junk barrages have received relatively little research attention.

(1) To reduce the impact of junk barrages on the viewing experience, this thesis constructs an improved TextCNN model combined with AdaBERT to improve junk barrage recognition.

(2) TextCNN is used to capture information from barrage text at different dimensions, and the word vectors produced by the AdaBERT model serve as the input word vectors of TextCNN. Batch normalization is then added to TextCNN, and the Mish activation function is used to reduce the likelihood of vanishing gradients, so that the resulting improved TextCNN model combined with AdaBERT generalizes better.

(3) The proposed model can be used to identify and filter junk barrages, which cleans up the barrage environment, improves the viewing experience, and supports the healthy development of video websites.

Barrages from "Five Centimeters per Second", "The Sky of the Young Winddog", and the Bilibili New Year's Eve live broadcast of 2020 are used as experimental data sets to verify the effectiveness of the proposed model. The proposed model is compared on these barrage data sets with TextCNN, BiLSTM, and BERT-TextCNN. In the experiments, the proposed model achieves the highest accuracy, recall, and F1 score of the compared models and can effectively identify and filter junk barrage text. The experiments also show that batch normalization and the Mish activation function speed up convergence and improve both the performance and the efficiency of the model. Compared with the BERT-TextCNN model, the proposed model is more efficient, which shows the importance of task adaptation. The model obtained from AdaBERT has fewer parameters and represents the characteristics of barrage text better, which improves the accuracy of junk barrage recognition.
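
The two components added to TextCNN can be illustrated with a minimal sketch in plain Python. It is not the thesis implementation: the function names, the default `eps`, `gamma`, and `beta` values, and the ordering (batch normalization first, then Mish) are assumptions made for illustration. Mish is computed as x * tanh(softplus(x)), and batch normalization rescales one feature across a batch to zero mean and unit variance before applying a scale and shift.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish activation: x * tanh(softplus(x)).
    # Unlike ReLU, it is smooth and lets small negative values pass through.
    return x * math.tanh(softplus(x))


def batch_norm(values, eps=1e-5, gamma=1.0, beta=0.0):
    # Normalize one feature over a batch to zero mean and unit variance,
    # then apply the scale (gamma) and shift (beta).
    # The default eps, gamma, and beta are illustrative assumptions.
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


def bn_then_mish(values):
    # Assumed ordering: batch normalization followed by Mish.
    return [mish(v) for v in batch_norm(values)]


if __name__ == "__main__":
    # Hypothetical convolution outputs for one feature across a batch of four.
    feature = [0.5, -1.2, 3.0, 0.1]
    print(bn_then_mish(feature))
```

Because Mish has a nonzero gradient for negative inputs, it avoids the dead units that ReLU can produce, which is one common explanation for its effect on vanishing gradients.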
Keywords/Search Tags: Text classification, Barrage, TextCNN, AdaBERT