Research On Text Classification Model Based On Deep Neural Network

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: T L Chen
GTID: 2428330602479026
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology in China, Internet products have spread throughout the country and caused an explosion of information. Because this mass of information is disordered and cluttered, organizing and classifying it has become increasingly important. Text classification technology emerged to classify text quickly and accurately, and it is one of the most basic tasks in natural language processing. Text classification has now entered the era of deep neural networks. Deep neural networks can overcome the matrix sparsity, dimensional explosion, and difficulty in expressing semantics found in traditional text representations, and they can also make up for the weaknesses of classical text classification methods in feature extraction. This thesis studies techniques for the subject classification of Chinese documents, explores in depth the application of various deep neural networks to text classification, and on that basis designs and implements a text classification model suited to document subject classification.

The main research contents are as follows.

Because text semantics differ between corpora from different fields, the word vectors are trained on corpus data from the same domain as the experimental data set, to ensure semantic consistency as far as possible. The popular neural language model Word2Vec is used to obtain continuous, dense word vectors as the basis for subsequent tasks.

To help the recurrent neural network model phrases better, a convolution operation first extracts the corresponding n-gram features. A bidirectional recurrent neural network, which overcomes the semantic bias of a unidirectional recurrent neural network, then extracts features that contain complete contextual semantic information. Finally, an attention mechanism, which can capture the internal correlations of the text, integrates the features.

Because the long short-term memory (LSTM) and gated recurrent unit (GRU) variants of the recurrent neural network incur a high time cost, this thesis uses a newer recurrent structure, the simple recurrent unit (SRU), which maintains excellent results while greatly reducing the time cost.

On this basis, the thesis proposes a text classification model based on a hybrid deep neural network (Conv-BSA) and verifies it experimentally. Analysis of the experimental results shows that Conv-BSA stands out among the models compared, which confirms its effectiveness.
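The pipeline the abstract describes (word vectors, convolutional n-gram features, a bidirectional simple recurrent unit, attention pooling, then classification) can be illustrated with a small forward pass. This is only a sketch under stated assumptions, not the thesis's implementation: the abstract gives no layer sizes, kernel width, or equations, so all dimensions, the left zero-padding, the simplified bias-free SRU gates, the attention scoring vector `u`, and every function name below are hypothetical. Random vectors stand in for trained Word2Vec embeddings, and the weights are untrained.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def conv_ngram(embeddings, W, n):
    """1-D convolution with ReLU over windows of n consecutive word vectors.
    Left zero-padding (an assumption) keeps one feature vector per token."""
    dim = len(embeddings[0])
    padded = [[0.0] * dim for _ in range(n - 1)] + embeddings
    out = []
    for t in range(len(embeddings)):
        window = [v for vec in padded[t:t + n] for v in vec]  # concatenate n vectors
        out.append([max(0.0, z) for z in matvec(W, window)])
    return out

def sru(xs, W, Wf, Wr):
    """Simplified simple recurrent unit (biases omitted).
    Both gates depend only on x_t, so in a real implementation the matrix
    products can be batched over all time steps; only the element-wise
    update of the cell state c is sequential."""
    c = [0.0] * len(xs[0])
    hs = []
    for x in xs:
        x_tilde = matvec(W, x)
        f = [sigmoid(z) for z in matvec(Wf, x)]  # forget gate
        r = [sigmoid(z) for z in matvec(Wr, x)]  # highway (reset) gate
        c = [fi * ci + (1 - fi) * xt for fi, ci, xt in zip(f, c, x_tilde)]
        hs.append([ri * math.tanh(ci) + (1 - ri) * xi
                   for ri, ci, xi in zip(r, c, x)])
    return hs

def bi_sru(xs, fwd, bwd):
    """Run one SRU left-to-right and one right-to-left, then concatenate."""
    h_f = sru(xs, *fwd)
    h_b = sru(xs[::-1], *bwd)[::-1]
    return [a + b for a, b in zip(h_f, h_b)]

def attention_pool(hs, u):
    """Score each time step against a context vector u (hypothetical),
    then return the attention-weighted sum of the hidden states."""
    scores = [sum(ui * math.tanh(hi) for ui, hi in zip(u, h)) for h in hs]
    weights = softmax(scores)
    pooled = [sum(w * h[j] for w, h in zip(weights, hs))
              for j in range(len(hs[0]))]
    return pooled, weights

# Illustrative sizes only; none of these values come from the thesis.
rng = random.Random(0)
seq_len, emb_dim, hidden, n_classes, n = 6, 4, 5, 3, 3

embeddings = rand_matrix(seq_len, emb_dim, rng)  # stand-in for Word2Vec vectors
W_conv = rand_matrix(hidden, n * emb_dim, rng)
fwd = tuple(rand_matrix(hidden, hidden, rng) for _ in range(3))
bwd = tuple(rand_matrix(hidden, hidden, rng) for _ in range(3))
u = [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)]
W_out = rand_matrix(n_classes, 2 * hidden, rng)

feats = conv_ngram(embeddings, W_conv, n)   # n-gram features, one per token
states = bi_sru(feats, fwd, bwd)            # contextual features, both directions
doc_vec, attn = attention_pool(states, u)   # single document vector
probs = softmax(matvec(W_out, doc_vec))     # class probabilities
print(probs)
```

Unlike LSTM and GRU, the gates in this SRU variant do not multiply the previous hidden state by a weight matrix, which is the property usually credited for its lower time cost.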
Keywords/Search Tags:text classification, word vector, convolutional neural network, recurrent neural network, attention mechanism