Research On News Text Classification Based On Deep Learning

Posted on: 2022-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: W Wang
GTID: 2518306575969199
Subject: Electronics and Communications Engineering
Abstract/Summary:
The rapid growth of the Internet has led to an explosion of news data that lacks effective management, and it is increasingly difficult for readers to find valuable information quickly. Quickly retrieving valuable information from large volumes of news text is therefore a meaningful text classification task. Traditional machine learning can no longer meet the efficiency demands of news text classification, and deep-learning-based news text classification has attracted the attention of scholars. However, existing methods still have problems: they merge the title and the content directly, which ignores the importance of the title, and they rely on a single classification model, which lowers classification performance. To solve these problems, this thesis carried out the following work.
To address the low accuracy of news text classification, this thesis proposes a bidirectional long short-term memory (BiLSTM) network model based on the attention mechanism that takes the importance of the title into account. In building and training this model, the content and the title are processed separately, and an attention mechanism is introduced to highlight the more important words in the news. Specifically, the dot product of the title word vectors and the content word vectors gives the attention weights, and the word vector representation of the content is then weighted by them so that the more important words in the news are enhanced. Experimental results on the Sogou Laboratory dataset show that the BiLSTM-ATT model achieves higher accuracy than the BiLSTM model.
To address the problem that relying on a single classification model lowers classification performance, a fusion model based on the capsule network is designed. This model combines the strength of the BiLSTM model in representing long text sequences with the strength of the CNN model in extracting local features, and uses them to perform word vector representation and feature extraction on the news text. Finally, the resulting news text information is aggregated by the capsule network to obtain the output capsules and complete the text classification. Experimental results show that the Capsule-CBATT text classification fusion model proposed in this thesis improves significantly on the CNN model and the BiLSTM model in accuracy, recall, and F1 score.
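The title-guided attention step described in the abstract can be illustrated with a small sketch. The abstract states only that title and content word vectors are combined by a dot product to produce attention weights, which then reweight the content vectors. Everything else here is an assumption made for illustration: the title is summarized by mean pooling, the scores are normalized with a softmax, and the function names and toy embeddings are hypothetical. It is not the thesis implementation.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors into a single vector."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def title_attention(title_vecs, content_vecs):
    """Return (attention weights, reweighted content vectors)."""
    # Assumption: the title is reduced to one vector by mean pooling.
    title_summary = mean_vector(title_vecs)
    # Dot product between each content word vector and the title summary.
    scores = [dot(c, title_summary) for c in content_vecs]
    # Assumption: scores are normalized to weights with a softmax.
    weights = softmax(scores)
    # Scale each content word vector by its attention weight.
    weighted = [[w * x for x in c] for w, c in zip(weights, content_vecs)]
    return weights, weighted


# Toy 3-dimensional embeddings (made-up numbers, not from the thesis).
title = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
content = [
    [0.9, 0.1, 0.0],  # close to the title
    [0.0, 1.0, 0.0],  # weakly related to the title
    [0.0, 0.0, 1.0],  # unrelated to the title
]
weights, weighted = title_attention(title, content)
```

In this toy input the content word closest to the title receives the largest weight. In the model the abstract describes, the reweighted vectors would presumably be passed on to the BiLSTM encoder; that step is omitted here.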
Keywords/Search Tags:Text Classification, Attention Mechanism, BiLSTM, CNN, Capsule Network