Research On Short Text Classification Based On Deep Neural Network

Posted on: 2022-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: L Kang
Full Text: PDF
GTID: 2518306491477104
Subject: Applied Statistics
Abstract/Summary:
With the continuing development of cloud computing and big data technology, the Internet has brought convenience to people's lives and also produces a large amount of text data every day, much of it short text such as shopping reviews, takeout reviews, microblog posts, and online chat messages. How to classify these short texts automatically and mine useful information from them has become a hot issue in many fields. However, the sparsity of short text data and its lack of semantic features make short text classification very difficult. This thesis therefore uses word embedding methods to alleviate the feature shortage of short text data and builds a hybrid neural network model to extract semantic features from text, which relieves the bottleneck of short text classification to a certain extent and improves classification performance. In addition, the thesis studies the classification performance of word2vec, GloVe, and one-hot representations on short text. The specific work of this thesis is as follows:

1. The thesis studies text preprocessing methods, including feature extraction and word vector methods. It focuses on word embedding methods, namely the CBOW and skip-gram models in word2vec and the GloVe model. With these methods, the quality of the text representation can be improved, and the text can be transformed into word vectors that a computer can process directly.

2. Based on the characteristics of short text, a neural network model built on the attention mechanism is proposed. The AB-LSTM and AB-GRU models are constructed by combining the attention mechanism with BiLSTM and BiGRU, respectively, and the algorithm flow is designed. A series of experiments then evaluates word2vec, GloVe, and one-hot representations in these models, which achieve good classification results.

3. Building on the AB-LSTM model, a CNN is introduced into short text classification to form the CAB-LSTM model, whose effectiveness is verified on the short text dataset. The thesis also compares word2vec, GloVe, and one-hot representations in this model and finds that the CAB-LSTM model performs better when combined with one-hot.
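To make the attention step in the AB-LSTM and AB-GRU models more concrete, the following is a minimal sketch of attention pooling over per-token hidden states, written in plain Python. The abstract does not specify the scoring function, so the form used here (a dot product between a context vector and tanh of each hidden state, followed by a softmax) is an assumption based on one common variant. The names `attention_pool`, `hidden_states`, and `context` are hypothetical, and the toy numbers stand in for BiLSTM outputs rather than coming from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Collapse per-token hidden states into one sentence vector.

    hidden_states: list of T vectors of length d (stand-ins for BiLSTM outputs)
    context: hypothetical learned query vector of length d (assumed, not from the thesis)
    """
    # Score each time step: u_t = context . tanh(h_t)
    scores = [
        sum(c * math.tanh(x) for c, x in zip(context, h))
        for h in hidden_states
    ]
    # Normalise the scores into attention weights that sum to 1
    weights = softmax(scores)
    # Sentence vector = attention-weighted sum of the hidden states
    d = len(hidden_states[0])
    pooled = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(d)
    ]
    return weights, pooled


# Toy input: three tokens with two-dimensional hidden states
hidden_states = [[0.1, 0.0], [2.0, 1.5], [0.2, -0.1]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden_states, context)
print(weights, pooled)
```

In a full model, the pooled vector would feed a classification layer, and `context` would be learned together with the recurrent weights. The design point is that tokens whose hidden states align with the context vector contribute more to the sentence representation, which can help when a short text carries only a few informative words.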
Keywords/Search Tags: short text classification, LSTM, attention mechanism, CNN, word vector