The Research Of Text Sequences Analysis Methods And Its Sentiment Deduction Using Deep Learning

Posted on: 2021-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Zhu
GTID: 2428330626455685
Subject: Communication and Information System
Abstract/Summary:
Text sequences are abstract descriptions of semantics and text meanings. Any expression of emotion in natural language (such as movie reviews and product reviews) can be abstracted into a random or random-like text sequence. Therefore, in natural language processing, the processing of text sequences, the extraction of sentiment features, and the inference of sentiment types from text sequences have become important and challenging topics in artificial intelligence research. This thesis studies text sequence analysis and emotional semantic inference from two aspects: how to build a sequence model of random text sequences, and how to analyze the emotional characteristics of random text sequences and design new methods based on convolutional neural networks. The thesis first presents a comparative analysis of several basic models of random text sequences (such as n-gram, word2vec, and CBOW), and then points out that the naive Bayes model, the support vector machine model, and the maximum entropy model are suitable working models for sentiment-specific analysis and inference on text sequences. Next, the thesis explores in more depth the convolutional neural network (Text-CNN) method for sentiment analysis of text sequences. The advantage of Text-CNN in processing text sequences is that it performs feature extraction and dimensionality reduction on the input samples; however, the pooling operation of the pooling layer causes a loss of information from the input samples, and the length of the feature output cannot be determined. To address these two problems, this research uses spatial pyramid pooling (SPP). In addition, the fully connected layer in the Text-CNN method is replaced with a Long Short-Term Memory (LSTM) layer, based on the effectiveness of the LSTM model in time sequence processing. With these changes, the thesis improves the Text-CNN method and obtains the SPP-CNN-LSTM method. Using this method, simulations on the IMDB and SST datasets are compared against four benchmark models (CNN, LSTM, SPP-CNN, and CNN-LSTM). The results show that the SPP-CNN-LSTM method improves the accuracy of emotion classification by 3% to 7% compared with the other four methods. Furthermore, the thesis finds that the CNN, LSTM, SPP-CNN, CNN-LSTM, and SPP-CNN-LSTM models perform better with the dynamic skip-gram (DSG) word vector training mode than with other word vector training modes. In the DSG mode, the text sentiment classification performance of these models increases in the order listed, with average accuracy rates on the IMDB and SST datasets of about 73.2%, 75.1%, 78.3%, 79.6%, and 83.0%, respectively.
Keywords/Search Tags: text sequence, sentiment analysis, word vectors, convolutional neural networks (CNN), long short-term memory (LSTM) model
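
The abstract notes that spatial pyramid pooling is used so that the pooled feature has a known length regardless of the input length. The sketch below illustrates that general idea on a one-dimensional feature sequence. It is not the thesis's implementation: the function name `spp_max_1d`, the pyramid levels `(1, 2, 4)`, and the choice of max pooling are all assumptions made for illustration.

```python
import math


def spp_max_1d(features, levels=(1, 2, 4)):
    """Pool a variable-length sequence into a fixed-length vector.

    Hypothetical helper: for each pyramid level with n bins, the sequence
    is split into n roughly equal (possibly overlapping) windows and the
    maximum of each window is kept. The output length is sum(levels),
    independent of len(features).
    """
    length = len(features)
    if length == 0:
        raise ValueError("features must be non-empty")

    pooled = []
    for n_bins in levels:
        for i in range(n_bins):
            # floor/ceil bounds guarantee every window has at least one item,
            # even when the sequence is shorter than the number of bins.
            start = math.floor(i * length / n_bins)
            end = math.ceil((i + 1) * length / n_bins)
            pooled.append(max(features[start:end]))
    return pooled


# Two feature sequences of different lengths (made-up values).
short_seq = [0.1, 0.9, 0.3]
long_seq = [0.2, 0.5, 0.1, 0.8, 0.4, 0.7, 0.3, 0.6, 0.9, 0.0]

print(len(spp_max_1d(short_seq)))  # 7
print(len(spp_max_1d(long_seq)))   # 7
```

Because both inputs map to a vector of length 7, a downstream layer (an LSTM layer in the SPP-CNN-LSTM method described above) can receive inputs of a consistent size. The coarse level keeps a global summary while the finer levels keep more local detail, which is one way to read the abstract's claim that SPP reduces the information lost by ordinary pooling.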