
Research On Sentiment Analysis Based On Text Data Augmentation And Hybrid Model

Posted on: 2019-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: J J He
Full Text: PDF
GTID: 2428330548491209
Subject: Computer application technology
Abstract/Summary:
Chinese sentiment analysis is a challenging task in natural language processing (NLP) and text data mining. Because Chinese syntactic structure is complex, it is difficult to obtain a general model or feature set that handles all sentiment analysis tasks. In addition, research on Chinese sentiment analysis started relatively late compared with work in other countries, and a complete, high-quality experimental corpus is still lacking. The purpose of this thesis is to build a hybrid neural network model that effectively learns multiple features and to improve its generalization ability on sentiment analysis tasks. Because the volume of well-labeled Chinese data is currently limited, the thesis also examines the problem of overfitting in deep neural network models. It studies sentiment analysis based on a text-oriented data augmentation method and a hybrid neural network model. The main work of the thesis is as follows:

A text-oriented multi-granularity data augmentation mechanism is designed, starting from the characteristics of subjective evaluation text in Chinese. The thesis explores text data augmentation at multiple granularities (word level, phrase level, sentence level) and compares it with the popular generative model, the Generative Adversarial Network (GAN). Experiments show that the data augmentation method can effectively generate a larger text dataset from the original dataset, from which a sentiment analysis model can learn distributed representations.

To analyze the effect of deep neural network models under this data augmentation method, the thesis compares the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), and on this basis proposes a feature fusion model based on CNN and LSTM. The model combines the local feature extraction ability of the CNN with the LSTM's strength in processing sequence data, concatenating their hidden-layer features as the high-level features of the hybrid neural network model. The effectiveness of the model is verified by its performance on an actual task.

The thesis uses a public hotel review corpus as its experimental dataset and studies the sentiment analysis task with the proposed text-oriented data augmentation method and feature-level fusion model. Experimental results show that the proposed method and model perform better than the baseline method and model on the original dataset. The method also achieves good performance on cross-domain prediction, which validates the gain that the data augmentation mechanism brings to the generalization performance of deep neural network models.
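
The feature-level fusion described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the embedding size, filter width, hidden size, single-layer LSTM, sigmoid output, and random untrained weights are all assumptions chosen only to show how max-pooled CNN features and the final LSTM hidden state might be concatenated into one feature vector.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conv_max_features(seq, filters):
    """1-D convolution over time, ReLU, then max-over-time pooling (one value per filter)."""
    feats = []
    for filt in filters:                      # filt: K weight vectors, one per window position
        k = len(filt)
        best = 0.0                            # starting at 0 applies the ReLU floor
        for t in range(len(seq) - k + 1):
            act = sum(w * x for j in range(k) for w, x in zip(filt[j], seq[t + j]))
            best = max(best, act)
        feats.append(best)
    return feats

def lstm_last_hidden(seq, W, hidden):
    """Run a single-layer LSTM over seq and return its final hidden state.
    W[gate] holds `hidden` rows of length dim + hidden + 1 (bias weight last)."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        z = x + h + [1.0]                     # input, previous hidden state, bias term
        pre = {gate: [sum(w * v for w, v in zip(row, z)) for row in W[gate]]
               for gate in "ifog"}
        i = [sigmoid(p) for p in pre["i"]]    # input gate
        f = [sigmoid(p) for p in pre["f"]]    # forget gate
        o = [sigmoid(p) for p in pre["o"]]    # output gate
        g = [math.tanh(p) for p in pre["g"]]  # candidate cell values
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h

def fused_features(seq, filters, W, hidden):
    """Feature-level fusion: concatenate CNN features with the LSTM's last hidden state."""
    return conv_max_features(seq, filters) + lstm_last_hidden(seq, W, hidden)

def predict(seq, filters, W, hidden, out_w, out_b):
    """Sigmoid classifier on the fused features (positive-sentiment probability)."""
    feats = fused_features(seq, filters, W, hidden)
    return sigmoid(sum(w * f for w, f in zip(out_w, feats)) + out_b)

# Hypothetical sizes and random weights, for illustration only.
rng = random.Random(0)
DIM, HIDDEN, N_FILTERS, K = 4, 3, 2, 2

def rand_vec(n):
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]

filters = [[rand_vec(DIM) for _ in range(K)] for _ in range(N_FILTERS)]
W = {gate: [rand_vec(DIM + HIDDEN + 1) for _ in range(HIDDEN)] for gate in "ifog"}
out_w, out_b = rand_vec(N_FILTERS + HIDDEN), 0.0

sentence = [rand_vec(DIM) for _ in range(6)]  # stand-in for six word embeddings
feats = fused_features(sentence, filters, W, HIDDEN)
prob = predict(sentence, filters, W, HIDDEN, out_w, out_b)
print(len(feats), prob)
```

With untrained weights the printed probability carries no meaning; the sketch only shows the data flow, in which the fused vector has one entry per convolution filter followed by one entry per LSTM hidden unit.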
Keywords/Search Tags: Text Sentiment Analysis, Text Data Augmentation, Hybrid Neural Network Model, Model Generalization