Research On Text Classification Method Based On Bidirectional LSTM

Posted on: 2020-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: J B Guo
Full Text: PDF
GTID: 2428330596974943
Subject: Computer technology
Abstract/Summary:
With the continuous development of the Internet and multimedia technology, large amounts of text data are constantly being produced and updated. Text classification has grown rapidly as a key technology for processing and analyzing such data. It mainly covers topic classification, question classification, and sentiment analysis. Each category has its own classification criteria and characteristics, so it is difficult to find a single method that handles all types of text classification problems. Many traditional text classification methods ignore the associations between words and do not adequately extract the semantic information hidden in the context of the text. Deep learning has achieved remarkable results in many areas of natural language processing, and deep neural network models have particular advantages for sequence data such as natural language. After summarizing traditional text feature extraction and classification methods, this thesis studies the use of deep neural network models to solve text classification problems. The proposed architecture is called attention-based bidirectional long short-term memory with a convolution layer. The main work and contributions of this thesis are as follows:

(1) Research on word embedding techniques and convolution operations. Word embedding maps words in natural language to low-dimensional real-valued vectors through a neural network, which avoids the lack of semantic information in traditional word vectors. The convolutional layer added after the embedding can extract semantic features in parallel, reduce the dimensionality of the data vectors, and reduce the size of the input to the subsequent layers.

(2) Building on the LSTM-based model for encoding and decoding sequence information, a strategy combining an attention mechanism with a bidirectional LSTM is proposed to address text feature extraction and further improve the performance of the classification model. The bidirectional LSTM with attention assigns a different weight to the hidden state at each time step. This avoids redundant information while preserving the useful information, and improves classification by optimizing the text feature vector.

(3) To verify the validity of the deep learning model proposed in this thesis, comparative experiments were conducted on seven common standard data sets. The experimental results show that the model built on the improvements above performs better than the original model.
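
The weighting described in item (2) can be illustrated with a minimal sketch in plain Python. It is not the thesis's implementation: the abstract does not give the scoring function, so the form used here (a dot product between tanh of each hidden state and a context vector, followed by a softmax) is an assumption, and the names `attention_pool`, `hidden_states`, and `context` are hypothetical. In a trained model the context vector would be learned and the hidden states would come from the bidirectional LSTM; here both are supplied by hand.

```python
import math


def attention_pool(hidden_states, context):
    """Combine per-step hidden states into one vector using attention weights.

    hidden_states: list of T vectors (lists of floats), standing in for the
        bidirectional LSTM outputs at each time step.
    context: vector of the same length; a hand-set stand-in for a learned
        parameter (assumption, not taken from the thesis).
    """
    # One scalar score per time step: dot(tanh(h_t), context).
    scores = [
        sum(math.tanh(h_i) * c_i for h_i, c_i in zip(h, context))
        for h in hidden_states
    ]
    # Softmax over time steps, shifted by the max score for stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum over time steps gives one fixed-size text vector.
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


# Three time steps with two-dimensional states (toy values).
states = [[0.1, 0.0], [2.0, 1.0], [0.2, -0.1]]
weights, text_vector = attention_pool(states, context=[1.0, 1.0])
```

In this toy input the second time step has the largest score, so it receives the largest weight and dominates `text_vector`. A classifier would then take the pooled vector as its input feature.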
Keywords/Search Tags: Feature Extraction, Text Classification, LSTM model, Attention Mechanism, Word Embedding, Word2Vec