| Text sentiment analysis is an essential task in natural language processing. With the continuous development of the Internet, a large amount of text data has appeared online. This data carries subjective information such as people's opinions and emotions, and sentiment analysis of it can support decision-making by enterprises and governments. Taking text data such as online comments as the research object, this thesis finds that sarcasm used as a rhetorical device in comment texts affects the results of sentiment analysis. After analyzing the shortcomings of existing methods for sarcastic text recognition and sentiment analysis, the thesis improves on the existing models. The main research work is as follows: 1. For sarcastic text recognition, existing methods pay little attention to the topic of the text and ignore the fact that words of opposite sentiment polarity tend to co-occur in sarcastic sentences. To address this, a sarcastic text recognition method based on a topic model and an inter-word attention score is studied. First, the text is embedded with pre-trained word2vec to obtain vector representations of its feature terms. A multi-channel vector processing module is then constructed by combining the topic model and the inter-word attention score with a bidirectional long short-term memory (BiLSTM) network; this compensates for the neglect of text topics in existing sarcasm recognition methods and attends to words whose sentiment polarity is opposite to that of other words in the sentence. Next, an attention mechanism is introduced to highlight the features relevant to sarcasm recognition in the output of the vector processing module. Finally, experiments are conducted on two Chinese datasets and two English sarcastic text datasets (Riloff and Ptacek), and the results show that the proposed method outperforms existing sarcastic text recognition methods. 2. For text sentiment analysis, an improved CapsNet sentiment analysis method based on position features is proposed. It targets three shortcomings of existing methods: insufficient feature extraction, neglect of text position features, and incomplete use of the n-gram features of the text. The model consists of four main components: a word embedding layer, a multi-scale feature fusion layer, an improved k-means capsule layer, and a sentiment classification layer. The word embedding layer combines text features with text position features to enrich its input; the multi-scale feature fusion layer uses convolution kernels of different sizes to obtain n-gram features of the text, which compensates for the sparse feature vocabulary of short texts; the improved k-means capsule layer uses a statistical formula to find the densest point in the data set and uses it to assign initial values, which reduces the effect of isolated points; and the sentiment classification layer determines the sentiment polarity. In addition, the sarcastic text recognition method from Chapter 3 is used as a discriminant condition for sentiment analysis. Experiments on the Restaurant and Laptop data sets show that the proposed model outperforms other existing models on sentiment analysis tasks. |
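
The abstract says that the improved k-means capsule layer seeds its initial values from the densest point in the data set so that isolated points have less influence, but it does not give the statistical formula used. The sketch below illustrates the general idea under a stated assumption: density is measured as the number of neighbours within a fixed radius. The function names `densest_point` and `density_init` and the `radius` parameter are hypothetical and are not taken from the thesis.

```python
import math


def densest_point(points, radius):
    """Return the index of the point with the most neighbours within `radius`.

    Assumption: density is a simple neighbour count (the point itself included).
    The thesis's actual statistical formula is not given in the abstract.
    """
    best_idx, best_count = 0, -1
    for i, p in enumerate(points):
        count = sum(1 for q in points if math.dist(p, q) <= radius)
        if count > best_count:  # strict '>' keeps the first point on ties
            best_idx, best_count = i, count
    return best_idx


def density_init(points, k, radius):
    """Pick up to k initial centroids from dense regions.

    Repeatedly take the densest remaining point as a centroid, then drop
    every point within `radius` of it so the next centroid comes from a
    different region. Isolated points have low density and are chosen last.
    """
    remaining = list(points)
    centroids = []
    while remaining and len(centroids) < k:
        center = remaining[densest_point(remaining, radius)]
        centroids.append(center)
        remaining = [q for q in remaining if math.dist(center, q) > radius]
    return centroids


# Two tight groups plus one isolated point at (20, 20).
data = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (20.0, 20.0),
]
initial_centroids = density_init(data, k=2, radius=0.5)
```

With this toy data, both initial centroids come from the two dense groups and the isolated point is never selected, which is the effect the abstract attributes to the improved initialization. Random initialization, by contrast, could pick the isolated point as a starting centroid.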