
Research On Text Classification Problem Based On Deep Learning

Posted on: 2020-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: C L Pan
Full Text: PDF
GTID: 2428330599958569
Subject: Computer technology
Abstract/Summary:
In recent years, the explosion of data volume, the continuous improvement of algorithms, and continuous innovation in hardware have driven rapid development in both the theory and the application of deep learning. On many problems in natural language processing, deep learning methods outperform traditional methods. Text classification is one of the important applications of natural language processing and has been studied extensively for many years. Text representation and the choice of neural network are key steps in applying deep learning to text classification, and they have a decisive influence on classification performance. At present, the mainstream text representation methods are One-Hot and word embedding, and the commonly used neural networks are standard neural networks, convolutional neural networks, and recurrent neural networks. To explore the influence of these factors, a number of experiments were designed for two real-world scenarios, English SMS filtering and Chinese news headline classification, and the effectiveness and efficiency of combinations of these mainstream techniques were compared. The experimental results show that word embedding is a more suitable text representation for text classification than One-Hot. For example, in the English SMS filtering scenario, the accuracy with word embedding is up to 6.84% higher than with One-Hot. In text classification, recurrent neural networks outperform convolutional neural networks, and convolutional neural networks outperform standard neural networks. For example, in the Chinese news headline classification scenario with the word embedding representation, the accuracy of the recurrent neural network is 2% higher than that of the convolutional neural network, which is in turn 2% higher than that of the standard neural network. In addition, compared with One-Hot, word embedding has a smaller memory overhead and is better suited to text classification on large data sets when resources are limited. When a standard neural network is used, simple summation outperforms averaging in both accuracy and computational overhead.
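The memory and pooling claims above can be illustrated with a minimal NumPy sketch. This is not the thesis's code: the vocabulary size, embedding dimension, token ids, and the helper names `one_hot_matrix` and `pool` are hypothetical, and it assumes that "summation" and "averaging" refer to pooling the word vectors of a text into one fixed-size input for a standard neural network.

```python
import numpy as np

def one_hot_matrix(token_ids, vocab_size):
    """One row per token, one column per vocabulary entry (hypothetical helper)."""
    out = np.zeros((len(token_ids), vocab_size), dtype=np.float32)
    out[np.arange(len(token_ids)), token_ids] = 1.0
    return out

def pool(vectors, mode="sum"):
    """Collapse a (tokens, dim) matrix into a single (dim,) vector."""
    if mode == "sum":
        return vectors.sum(axis=0)
    if mode == "mean":
        return vectors.mean(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")

# Assumed sizes, chosen only for illustration.
vocab_size, embed_dim = 10_000, 50
rng = np.random.default_rng(0)
# Randomly initialised table; in practice it would be learned or pre-trained.
table = rng.normal(size=(vocab_size, embed_dim)).astype(np.float32)

token_ids = np.array([12, 7, 256, 7, 4031])  # a made-up five-token text

one_hot = one_hot_matrix(token_ids, vocab_size)  # shape (5, 10000)
dense = table[token_ids]                         # shape (5, 50)

# Per-text footprint of each representation, in bytes.
print("one-hot bytes:", one_hot.nbytes)
print("embedding bytes:", dense.nbytes)

# The embedding lookup equals the one-hot matrix times the table.
print("lookup == one_hot @ table:", np.allclose(one_hot @ table, dense))

# Sum pooling and mean pooling differ only by the token count.
text_sum = pool(dense, "sum")
text_mean = pool(dense, "mean")
print("sum == mean * n:",
      np.allclose(text_sum, text_mean * len(token_ids), atol=1e-5))
```

In this sketch the per-text one-hot matrix grows with the vocabulary size while the embedded text grows only with the embedding dimension; the embedding table itself is a one-time cost shared by all texts. Sum pooling skips the division that mean pooling performs, which is one plausible reading of the lower computational overhead reported above.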
Keywords/Search Tags:Deep Learning, Text Classification, Text Representation, Standard Neural Network, Convolutional Neural Network, Recurrent Neural Network