Research On Short Text Sentiment Classification Technology

Posted on: 2020-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2428330623962148
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology and high-speed mobile networks, a huge variety of network data has been generated. Among the data encountered most often in daily life is the large volume of short text on the Internet. Such data usually carries people's emotional preferences and subjective opinions about things, goods, and events. Individuals, businesses, organizations, and government departments are increasingly using this short text data to support decision-making. Sentiment analysis of short texts on the network therefore has considerable academic significance and application value. This thesis carries out a series of studies on short text sentiment classification. The main work and contributions are as follows:

1) To address the difficulty of extracting key sentiment features from short texts, a text feature extraction method based on Word2vec combined with PCA (Principal Component Analysis) is proposed. First, the Word2vec word embedding tool is used to train general word vectors on a large general corpus. A word vector matrix is then built for each text, and each column of the matrix is summed and averaged to obtain the feature vector of the text. Next, the text feature vector is reduced in dimension by PCA, and the reduced vector is taken as the final feature of the text. Finally, this feature is fed to an SVM (Support Vector Machine) classifier to analyze the sentiment orientation of the text. Experiments verify the effectiveness of the method.

2) In deep learning, the text representation usually contains only the general features of words and not their domain-related features. To address this, a BiLSTM-based text sentiment classification method with feature extension and integration is proposed. First, a domain-related corpus is obtained by adding an external domain-related corpus of a certain size to the training corpus. The Word2vec tool is then used to obtain domain-related word vectors and domain-related text features. Next, the general features of the text are combined with the domain-related features to obtain the integrated features of the text. Finally, a two-layer BiLSTM network captures the deep semantics of the integrated features and judges the sentiment tendency of the text. The experimental results show that the method based on feature extension and integration can further improve classification performance.

3) To combine the advantages of Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), a hybrid-network text sentiment classification method based on CNN-RNN is proposed. The method obtains the local features and context features of the text through a CNN and a BiLSTM respectively, and then combines the two into the final text features. On this basis, an LSTM (Long Short-Term Memory) neural network captures the deep sentiment semantics represented by the text features, and finally the sentiment tendency of the text is judged. Experiments show that building a hybrid neural network model and combining the text features of the two networks effectively improves text sentiment classification.
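
The feature extraction steps of contribution 1) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the toy 4-dimensional embeddings stand in for trained Word2vec vectors, the function names `text_feature` and `pca_reduce` are hypothetical, and PCA is computed here through a singular value decomposition of the centered feature matrix.

```python
import numpy as np

def text_feature(tokens, embeddings, dim):
    """Average the word vectors of a text (column-wise mean of its word-vector matrix)."""
    rows = [embeddings[t] for t in tokens if t in embeddings]
    if not rows:
        # Assumption: a text with no known words maps to the zero vector.
        return np.zeros(dim)
    return np.vstack(rows).mean(axis=0)

def pca_reduce(features, k):
    """Project row-wise text features onto their top-k principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Toy embeddings standing in for Word2vec vectors (hypothetical values).
rng = np.random.default_rng(0)
vocab = ["good", "bad", "phone", "battery", "great", "poor"]
embeddings = {word: rng.normal(size=4) for word in vocab}

texts = [["good", "phone"], ["bad", "battery"],
         ["great", "battery"], ["poor", "phone"]]

# One averaged feature vector per text, then PCA down to 2 dimensions.
features = np.vstack([text_feature(t, embeddings, 4) for t in texts])
reduced = pca_reduce(features, 2)
```

In a full pipeline, the rows of `reduced` would then be passed, together with sentiment labels, to an SVM classifier such as scikit-learn's `SVC`; that step is omitted here to keep the sketch dependent on NumPy only.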
Keywords/Search Tags: Short Text, Sentiment Classification, Deep Learning, Word Embedding, Feature Integration