Research On Chinese News Text Classification Method Based On CNN Mixed Model

Posted on: 2022-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Liu
GTID: 2518306326966109
Subject: Master of Engineering
Abstract/Summary:
Text classification touches almost every area of Natural Language Processing. How to classify complex text and extract information that is useful to society has become a popular and difficult research topic. Neural network methods have been applied to text classification, but they still have shortcomings. Convolutional Neural Networks focus on extracting local features of text but ignore global information. Recurrent Neural Networks capture long-distance information in text, but they take a long time to train and are difficult to train. General Chinese news text is long and rich in content, so a single Convolutional Neural Network cannot extract enough of its features, and the classification precision of Convolutional Neural Networks still needs to be improved. To address these problems, this thesis proposes two mixed neural network models for Chinese news text classification: one based on Convolutional Neural Networks and one based on an optimized Temporal Convolutional Network. The main research contents are as follows:

(1) Because Convolutional Neural Networks tend to ignore the correlation between local features and the whole text, a Chinese text classification model (CCNN-2SE-m Att) based on the Squeeze-and-Excitation block and Convolutional Neural Networks is proposed. Squeeze-and-Excitation blocks are used to model the dynamic, nonlinear dependence between channels in the Convolutional Neural Network, which compensates for the weak correlation between the local and the whole and improves the performance of the model. The Multi-Head Attention mechanism is then introduced to learn information from different subspaces, and the importance distribution of the features in each subspace is calculated to improve classification accuracy. The model was compared experimentally with other model structures on the THUCNews dataset and the Sogou CS dataset, and it outperformed them in Precision, Recall, and F1-Score. Compared with the classical character-level Convolutional Neural Network model, the Precision of this model on the two datasets is improved by 2.29% and 4.75% respectively.

(2) To address insufficient feature extraction and low classification precision for Chinese news text, this thesis further studies the characteristics of Convolutional Neural Networks and proposes a Chinese text classification model (TGNet) that integrates Temporal Convolutional Networks and the Gated Recurrent Unit. A Temporal Convolutional layer, a Batch Normalization layer, and an optimized Activation layer are combined to construct a Temporal Convolutional Network structure that captures the relationships between hidden features at different time scales. At the same time, a Gated Recurrent Unit is used to focus on the contextual features of the text. The extracted text features are then fused and fed into Softmax for classification. The proposed model is compared with five text classification models in Chinese news text classification experiments on the Sogou CS dataset and the Fu Dan news dataset. The experimental results show that the macro-average precision of this model reaches 96.57% and 93.35% on the two datasets respectively, and that its macro-average recall and macro-average F1-Score are better than those of the selected comparison models.

In summary, this thesis constructs two mixed Chinese text classification models based on Convolutional Neural Networks. The first model pays more attention to the relationships between local features in order to better extract global feature information. The second model extracts text features from several aspects in order to improve the classification effect. Both models achieve better classification results.
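The Squeeze-and-Excitation step described in (1) can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation: the function name `se_block`, the reduction ratio `r`, the random weights, and the omission of biases are all assumptions made for illustration. It only shows the general squeeze (global average pooling per channel), excitation (bottleneck, ReLU, sigmoid), and scale pattern on a 1-D feature map.

```python
import math
import random

def se_block(features, w1, w2):
    """Apply a Squeeze-and-Excitation gate to a 1-D feature map.

    features: list of C channels, each a list of L activations.
    w1: (C // r) x C reduction weights; w2: C x (C // r) expansion weights.
    Biases are omitted for brevity (an assumption of this sketch).
    """
    # Squeeze: global average pooling over the sequence axis, one value per channel.
    z = [sum(ch) / len(ch) for ch in features]
    # Excitation, step 1: bottleneck projection followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expand back to C channels, squash to (0, 1) with a sigmoid.
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: reweight every channel by its gate.
    scaled = [[gate * x for x in ch] for gate, ch in zip(s, features)]
    return scaled, s

# Hypothetical toy sizes: 4 channels, sequence length 6, reduction ratio 2.
rng = random.Random(0)
C, L, r = 4, 6, 2
features = [[rng.uniform(-1, 1) for _ in range(L)] for _ in range(C)]
w1 = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[rng.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
scaled, gates = se_block(features, w1, w2)
```

Each gate depends on the pooled statistics of all channels, which is how the block lets global context modulate locally extracted convolutional features.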
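The macro-average precision, recall, and F1-Score reported above are per-class scores averaged with equal weight per class. The sketch below shows one common way to compute them. The function name `macro_scores` and the toy labels are hypothetical, and the thesis does not state its exact formula, so taking macro F1 as the mean of per-class F1 values is an assumption.

```python
def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Define a score as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    # Macro averaging: every class counts equally, regardless of its size.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical toy example with three news categories.
y_true = ["sports", "sports", "finance", "finance", "tech", "tech"]
y_pred = ["sports", "finance", "finance", "finance", "tech", "sports"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

Because every class contributes equally, macro averages are sensitive to performance on small categories, which matters for news corpora with uneven class sizes.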
Keywords/Search Tags: Text Classification, Convolutional Neural Networks, Squeeze and Excitation block, Attentional Mechanism, Temporal Convolutional Networks