Research On Sentence Classification Based On Deep Learning And Feature Embedding

Posted on: 2019-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: S J Yang
Full Text: PDF
GTID: 2428330563493359
Subject: Computer technology
Abstract/Summary:
The rapid development of the Internet has brought a dramatic increase in online text data. Deep neural network models represent data features well and are therefore widely used in text classification tasks; the most successful are the Convolutional Neural Network (CNN) and the Long Short-Term Memory (LSTM) network. However, their performance is degraded by polysemy and homonymy, so it is important to study how to reduce word ambiguity in order to improve the classification performance of deep learning models. Through a series of experiments, this thesis analyzes the influence of short-text data sparsity on classification performance and measures the semantic coherence of topic features. For short-text datasets, it fuses continuous semantic information with discrete topic information, using the global information provided by topic features to modify features at two granularities, "word" and "sentence". On this basis it proposes two sentence classification methods based on topic embeddings, one with a Convolutional Neural Network and one with a Long Short-Term Memory network. To address the sequential nature of short text and the logical dependencies between preceding and following context, a bidirectional LSTM is used to capture the full contextual information: word embeddings and character embeddings are fused at the input layer, and features are embedded between the bidirectional LSTM layers. On this basis the thesis proposes a sentence classification method based on a multi-connection bidirectional LSTM. Classification performance comparisons and feature fusion analysis experiments are conducted on public real-world short-text datasets. The experimental results show that the proposed models outperform the existing baseline methods in accuracy, which indicates that integrating topic features into deep learning models can effectively alleviate the polysemy problem of words, thereby better representing the semantics of short text and improving classification performance.
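The word-level fusion idea described above can be illustrated with a minimal sketch. The abstract does not say how topic and word features are combined, so the concatenation used here is an assumption, as are all function names, dimensions, and the random placeholder vectors; this is not the thesis's implementation. The sketch only shows per-token word and topic vectors being joined and passed through one convolution layer with max-over-time pooling, as in a CNN sentence classifier.

```python
import numpy as np

def fuse_embeddings(word_vecs, topic_vecs):
    """Concatenate per-token word and topic vectors (fusion by concatenation is an assumption)."""
    return np.concatenate([word_vecs, topic_vecs], axis=1)

def conv_max_pool(x, filters):
    """1-D convolution over token windows, ReLU, then max-over-time pooling.

    x:       (seq_len, dim) fused token features
    filters: (n_filters, window, dim) convolution kernels
    returns: (n_filters,) fixed-size sentence vector
    """
    n_filters, window, dim = filters.shape
    seq_len = x.shape[0]
    # Every window of consecutive tokens: (seq_len - window + 1, window, dim)
    windows = np.stack([x[i:i + window] for i in range(seq_len - window + 1)])
    # Feature map with one column per filter: (n_windows, n_filters)
    feature_map = np.einsum("nwd,fwd->nf", windows, filters)
    return np.maximum(feature_map, 0.0).max(axis=0)

# Hypothetical sizes; random vectors stand in for trained embeddings.
rng = np.random.default_rng(0)
seq_len, word_dim, topic_dim, n_filters, window = 6, 8, 4, 5, 3
word_vecs = rng.normal(size=(seq_len, word_dim))
topic_vecs = rng.normal(size=(seq_len, topic_dim))

fused = fuse_embeddings(word_vecs, topic_vecs)
filters = rng.normal(size=(n_filters, window, word_dim + topic_dim))
sentence_vec = conv_max_pool(fused, filters)
```

In a full classifier the pooled sentence vector would feed a softmax layer; that step and any training procedure are omitted because the abstract gives no details about them.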
Keywords/Search Tags:Natural Language Processing, Deep Learning, Sentence Classification, Topical Word Embeddings, Biterm Topic Model