
Research On Chinese Text Classification Algorithm Based On Deep Learning

Posted on: 2020-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: D Zhang
Full Text: PDF
GTID: 2428330590459391
Subject: Computer software and theory
Abstract/Summary:
Text classification is a typical and fundamental research field. Traditional methods rely on manual means to obtain feature information before texts are classified. Deep learning has replaced these methods by integrating feature extraction and classification, so that feature information is obtained automatically, and it is widely used in data mining, natural language processing and computer vision. News text classification is an indispensable technology in natural language processing and an important means for people to get the latest news and understand current events, so realizing it with effective methods is of great significance and value. This thesis analyses the disadvantages of traditional text classification methods in depth and focuses on applying two deep learning methods, the convolutional neural network (CNN) and the recurrent neural network (RNN), to news text classification. The main contents of the thesis are as follows:

(1) The TCNN model cannot fully obtain the local features and keyword information of a text, and the RCNN model cannot fully extract the contextual feature information of a text. To address these problems, two text classification algorithms are proposed, based on the TC-AM model and the GCNN model. The TC-AM model uses three layers of convolution and pooling to obtain the local features of the text, and introduces a dual-channel attention mechanism (DAM) that makes the text features more representative and assigns corresponding weights to the text information in order to obtain keyword information. The GCNN model adopts dual-channel forward and backward bidirectional gated recurrent units (DFB-GRU) to fully obtain the contextual feature information of the text. Tests on the Sogou news corpus showed that the two improved models classify better, improving the accuracy, precision, recall and F1 value of classification.

(2) The TC-AM model can only obtain local feature information and keyword information of a text, and the GCNN model can only obtain contextual feature information. To address this, a text classification algorithm based on a deep learning fusion model (TGC-AM) is proposed. It combines the advantages of TC-AM and GCNN, so that it fully obtains both the local feature information and the contextual information of the text. It also represents the text features better and extracts the keyword information. Tests on the Sogou news corpus showed that the fusion model achieves an outstanding classification effect and effectively improves the accuracy, precision, recall and F1 value of classification.
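The four evaluation measures named in the abstract (accuracy, precision, recall and F1 value) can be illustrated with a minimal sketch. The abstract does not say how the per-class scores are averaged, so macro-averaging over classes is an assumption here, and the function name `classification_scores` and the example labels are hypothetical, not taken from the thesis.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1.

    Macro-averaging is an assumption of this sketch; the thesis abstract
    does not state which averaging scheme it uses.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical news categories, for illustration only.
    truth = ["sports", "sports", "finance", "finance"]
    preds = ["sports", "finance", "finance", "finance"]
    print(classification_scores(truth, preds))
```

Macro-averaging gives every news category the same weight regardless of how many articles it contains. A micro-averaged or support-weighted variant would be an equally plausible reading of the abstract.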
Keywords/Search Tags: Text Classification, Deep Learning, Convolutional Neural Network, Gated Recurrent Unit, Attention Mechanism