Font Size: a A A

Research On Sentence Classification Model Based On Deep Feature Extraction

Posted on:2020-04-28Degree:MasterType:Thesis
Country:ChinaCandidate:B Z GuoFull Text:PDF
GTID:2428330575977784Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the vigorous development of Internet,massive text information is emerging,which is of great value in news information,electronic commerce,public opinion monitoring and other scenarios.Classification is an important technical means for effectively utilizing text information and solving the requirements of the practical scenarios mentioned above.As the objects of classification in the above scenarios,the text information often appears in the form of a single sentence or short sentences composed of several words.Therefore,it is particularly important to construct a good sentence classification model.The research on sentence classification has important application value.Because sentences generally have the characteristics of short length,more new words and less repetitive components,sentence classification methods based on statistical learning usually require a lot of energy to extract and select features according to the characteristics of sentences in specific classification tasks.Meanwhile,features need to be reconstructed for new scenarios,so the universality of the traditional methods is poor.The above shortcomings limit the applications of the traditional methods in sentence classification to some extent.The applications of deep learning in natural language processing have promoted the research on sentence classification.Recent research has shown that convolutional neural networks can be effectively applied to sentence classification through word embeddings.Although convolutional neural networks used for sentence classification can extract local features in sentences,it neglects that different words have different importance to the results of classification and correlative information usually exists between words of different parts in a sentence of a specific classification task.In addition,the word embedding of each word is restricted by a single training method.All the above aspects affect the final extracted features for sentence classification.This thesis conducts in-depth research on the problems mentioned above.The main contents are as follows:1)A novel sentence classification model is proposed based on convolutional and recurrent neural networks with enhanced semantic feature extraction.Firstly,the proposed model constructs convolution kernels with semantic features by selecting important word sequences in each class of a training set,so as to enhance the semantic feature extraction of the word sequences which are important to classification results.Next,the model extracts local features of a sentence by convolution and local pooling against a matrix of word embeddings and the sequentiality of the sentence is retained.Then the model takes the local features as the input of recurrent neural networks to obtain long-distance dependent information in the sentence so as to get global features of the sentence.Finally,the result of classification is obtained through full connection layer and Softmax function.The model strengthens the ability of extracting semantic features,and combines the advantages of convolutional neural networks and recurrent neural networks.2)Novel sentence classification models are proposed based on double neural networks with enhanced semantic feature extraction.In view of the point that the word embedding of each word is restricted by a single training method,on the basis of enhanced semantic feature extraction,a novel sentence classification model based on double convolutional and recurrent neural networks is constructed in which the word embeddings obtained by different training methods are used as input together.Meanwhile,a novel sentence classification model based on double convolutional neural networks is proposed for comparison.The proposed models get more abundant sentence features with effective use of different kinds of word embeddings.In this thesis,the proposed models are tested on several open datasets and compared with several existing models for sentence classification.The experimental results demonstrate that the proposed models achieve competitive performance in different classification tasks such as sentence-level sentiment classification and problem classification at sentence level.
Keywords/Search Tags:word embedding, feature extraction, convolutional neural networks, recurrent neural networks, sentence classification
PDF Full Text Request
Related items