Research On Text Sentiment Analysis Based On Deep Learning

Posted on: 2020-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Chen
GTID: 2438330620455599
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and social media, users have generated a large volume of comments with complex sentiment tendencies. Companies, institutions, and individuals wish to aggregate these subjective comments to analyze and track public opinion toward a given object or event. The rapid development of digital media has driven the study of machine-assisted text analysis, within which sentiment analysis is a popular and widely studied topic. Following the traditional bag-of-words model and canonical machine learning methods, word embedding has become the first choice for text representation, and various convolutional, recurrent, and recursive neural networks hold the dominant position in this field.

This paper builds on the classical convolutional neural network Kim-CNN, which consists of word embedding, two-dimensional convolution, and max pooling. We carried out a large number of experiments on each of its components, including the convolution kernel, pooling method, recurrent layer, and attention mechanism, and analyzed the advantages, disadvantages, and combination principles of these structures. Two important improvements were tested: (1) using a one-dimensional convolution kernel instead of the two-dimensional convolution structure reduces information loss in the feature extraction process; (2) adding recurrent layers to the convolutional network alleviates the problem that convolution cannot effectively identify and express complete temporal information. Some of our experiments show that the revised 1D-2D-LSTM model achieves a maximum accuracy improvement of 3.1% on the SST-5 dataset and is state of the art among similar neural network architectures. We also analyzed the effects of network structure and parameter selection, such as pooling strategy, embedding fine-tuning, and attention mechanism.

Since Kim-CNN was proposed in 2014, a large number of studies have used convolution-recurrent-pooling neural network frameworks. As network complexity increases, some combinations show moderate improvements in classification accuracy, but no sign of a major breakthrough has been observed. Is there still room for major improvement beyond the current network framework? Motivated by this question, this paper extends its scope and analyzes the effects of the following revisions: (1) interpretable feature combination methods; (2) additional linguistic features; (3) dynamic word embedding (e.g., BERT) instead of static word embedding (e.g., Word2Vec). Preliminary experiments show that these revisions have great potential to break the current performance bottleneck.

Based on the above experiments and analysis, we claim that a major breakthrough in the sentiment analysis task requires improvements in two areas. (1) Word representation and feature extraction. Classical neural network models use word embedding as the word representation and convolutional, recurrent, or recursive networks as the feature extractor, but research along these lines has reached a bottleneck in recent years. Recently, the BERT model has achieved large gains on many benchmarks. It changes the classical pre-training paradigm by transferring both the word representation and the feature extraction network to downstream tasks at the same time, and it improves the expressive power of the features to a certain extent. (2) Interpretable downsampling. Neural networks have a very large number of parameters. To improve computation speed, features must be downsampled to reduce the number of network parameters. To alleviate the loss of text information during downsampling, the field of sentiment analysis urgently needs a pooling method that can be interpreted at the physical level.
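
The word embedding, convolution, and max pooling pipeline named in the abstract, and the information loss it attributes to downsampling, can be illustrated with a minimal sketch. This is not the thesis's implementation: it uses plain Python instead of a deep learning framework, a single filter with made-up toy values, and hypothetical names (`conv_over_time`, `max_over_time`). The kernel here spans the full embedding width and slides only along the word axis; the abstract does not say whether the thesis would label this one- or two-dimensional.

```python
from typing import List

Vector = List[float]


def conv_over_time(embeddings: List[Vector], kernel: List[Vector],
                   bias: float = 0.0) -> List[float]:
    """Slide one kernel over windows of len(kernel) consecutive words.

    Each window yields one scalar: the sum of element-wise products between
    the kernel rows and the word vectors in the window, plus the bias,
    passed through ReLU.
    """
    width = len(kernel)
    feature_map = []
    for start in range(len(embeddings) - width + 1):
        total = bias
        for kernel_row, word_vec in zip(kernel, embeddings[start:start + width]):
            total += sum(k * x for k, x in zip(kernel_row, word_vec))
        feature_map.append(max(0.0, total))  # ReLU
    return feature_map


def max_over_time(feature_map: List[float]) -> float:
    """Keep only the strongest response; its position is discarded."""
    return max(feature_map)


# Toy 2-dimensional "embeddings" (assumed values, for illustration only).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# The same words shifted one position to the right.
shifted = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# One filter covering two consecutive words.
kernel = [[1.0, 0.0], [0.0, 1.0]]

print(conv_over_time(sentence, kernel))  # [2.0, 1.0, 1.0]
print(conv_over_time(shifted, kernel))   # [0.0, 2.0, 1.0]
print(max_over_time(conv_over_time(sentence, kernel)))  # 2.0
print(max_over_time(conv_over_time(shifted, kernel)))   # 2.0
```

The two feature maps differ, but max pooling reduces both to the same value, so where the strongest match occurred is no longer available to later layers. This is one concrete form of the downsampling information loss the abstract refers to, and it is consistent with the motivation given for adding recurrent layers.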
Keywords/Search Tags: sentiment analysis, convolutional neural network (CNN), recurrent neural network (RNN), word embedding, deep learning