Research On Text Classification Method Based On Bidirectional LSTM

Posted on:2020-06-28

Degree:Master

Type:Thesis

Country:China

Candidate:J B Guo

Full Text:PDF

GTID:2428330596974943

Subject:Computer technology

Abstract/Summary:

With the continuous development of the Internet and multimedia technology, large volumes of text data are constantly being produced and updated. Text classification has grown rapidly as a key technology for processing and analyzing such data. It mainly covers topic classification, question classification, and sentiment analysis. Each task has its own classification criteria and characteristics, so it is difficult to find a single method that handles all types of text classification problems. Many traditional text classification methods ignore the associations between words and do not adequately extract the semantic information hidden in the context of the text. Deep learning has achieved remarkable results in many areas of natural language processing, and deep neural network models have distinct advantages for sequence data such as natural language. After summarizing traditional text feature extraction and classification methods, this thesis studies the use of deep neural network models to solve text classification problems. The proposed architecture is called attention-based bidirectional long short-term memory (LSTM) with a convolution layer. The main work and contributions of this thesis are as follows:

(1) Research on word embedding techniques and convolution operations. Word embedding maps words in natural language to low-dimensional real-valued vectors through a neural network, which avoids the lack of semantic information in traditional word vectors. The convolutional layer that follows extracts semantic features in parallel, reduces the dimensionality of the data vectors, and reduces the number of input parameters passed to the subsequent structures.

(2) Building on the LSTM-based model for encoding and decoding sequence information, a strategy combining the attention mechanism with a bidirectional LSTM is proposed to address feature extraction from text and to further improve the performance of the classification model. The bidirectional LSTM with integrated attention assigns a different weight to the state at each time step. It avoids redundant information while preserving the useful information, and improves classification by optimizing the text feature vector.

(3) To verify the validity of the deep learning model proposed in this thesis, comparative experiments were conducted on seven commonly used standard data sets. The experimental results show that the model built with the improvements above achieves some improvement over the original model.
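
The attention weighting described in point (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the scoring function (a dot product between a query vector and the tanh of each state), the names `softmax` and `attention_pool`, and all numeric values are assumptions chosen for illustration. The forward and backward LSTM states are simply hard-coded and concatenated per time step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states, query):
    """Score each time step, normalize the scores into weights,
    and return (weights, weighted sum of the states).

    Assumed scoring: score_t = query . tanh(h_t). Other scoring
    functions are possible; the abstract does not specify one.
    """
    scores = [
        sum(q * math.tanh(x) for q, x in zip(query, h))
        for h in states
    ]
    weights = softmax(scores)
    dim = len(states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, states))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical outputs of the forward and backward LSTM passes
# for a three-token input (two hidden units per direction).
forward = [[0.1, 0.3], [0.5, 0.2], [0.9, 0.4]]
backward = [[0.2, 0.1], [0.4, 0.6], [0.3, 0.8]]

# Bidirectional state at each time step: forward and backward halves joined.
states = [f + b for f, b in zip(forward, backward)]

# Hypothetical attention vector; in a trained model this would be learned.
query = [1.0, 0.0, 0.5, 0.5]

weights, context = attention_pool(states, query)
print("weights:", weights)
print("context:", context)
```

The weights sum to one, so the resulting context vector is a convex combination of the per-step states. That single fixed-size vector is what a classifier layer would consume in place of the full state sequence.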

Keywords/Search Tags:

Feature Extraction, Text Classification, LSTM model, Attention Mechanism, Word Embedding, Word2Vec

Related items

1	Research On Text Classification Based On Attention Bi-LSTM
2	Research Of Text Classification Based On Word2vec And Self-attention
3	Research On The Method Of Text Feature Extraction
4	Text Classification Research Based On Deep Neural Network And Attention Mechanism
5	Text Classification Based On Attention-Based LSTM Model
6	Improvement And Application Of Text Classification Based On RNN
7	Research On Short Text Aspect Extraction Base On Topic Model And Word Embedding Mechanism
8	Research And Improvement On Text Classfication Based On Word Embedding
9	Research On Text Similarity Recognition Based On LSTM
10	Research On Short Text Similarity Algorithm Based On BiLSTM And Attention Mechanism