Research On Sarcasm Recognition In Chinese Text

Posted on: 2021-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: X C Gong
GTID: 2428330611499753
Subject: Computer technology
Abstract/Summary:
In recent years, people have become increasingly used to sharing their personal sentiments and opinions on social media, so accurately identifying and mining sentiment from text has become a hot research topic. Sarcasm is a special linguistic phenomenon: when users express their opinions sarcastically, their real feelings are often the opposite of the literal meaning. Because sarcasm is widespread on social media, recognizing it accurately helps improve the performance of sentiment analysis systems.

Existing sarcasm recognition methods fall mainly into three camps: rule-based, statistical machine learning based, and deep neural network based approaches. Rule-based approaches require a great deal of time and manpower to compile the rules. Statistical machine learning approaches require extensive manual feature selection. Deep neural network approaches reduce the dependence on manual feature extraction, but they depend strongly on the scale and quality of the training dataset.

A sarcasm text dataset is the basis of sarcasm recognition. In view of the lack of high-quality annotated Chinese sarcasm data, in this study we collate and annotate user comments on news websites and construct the largest Chinese sarcasm text dataset. This dataset contains 2486 sarcastic texts and 89296 non-sarcastic texts. To facilitate subsequent research on sarcasm recognition, 2486 non-sarcastic texts are sampled and combined with the 2486 sarcastic texts to construct a class-balanced Chinese sarcasm recognition dataset.

To address the problem that the performance of traditional neural network models relies heavily on large-scale, high-quality annotated data, we investigate an adversarial learning framework based on adversarial samples to improve deep neural network based sarcasm recognition. The experimental results show that this framework improves the accuracy and F1 score of sarcasm recognition methods based on convolutional neural networks and long short-term memory networks by nearly 2%. This result indicates that the proposed adversarial training framework effectively enhances the generalization ability and robustness of sarcasm recognition.

Pre-trained language models provide a new paradigm for natural language processing tasks: a model is first trained without supervision on a large-scale corpus and then fine-tuned on downstream tasks. This alleviates the dependence of traditional neural network models on large-scale annotated data. We therefore investigate a sarcasm recognition method based on pre-trained language models. Experimental results show that sarcasm recognition based on a pre-trained language model clearly outperforms the methods based on convolutional neural networks and long short-term memory networks, which indicates that a pre-trained language model with large-scale learned parameters enhances deep semantic representation learning. Finally, the sarcasm recognition method that combines the adversarial learning framework with the RoBERTa (robustly optimized BERT approach) model achieves the highest performance, with an accuracy of 0.7843 and an F1 score of 0.7866 on our sarcasm recognition dataset.
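
The balanced dataset described in the abstract can be illustrated with a short sketch. It assumes uniform random sampling without replacement from the non-sarcastic texts; the abstract does not state the actual sampling procedure. The function name `build_balanced_dataset`, the seed, and the placeholder strings are hypothetical and stand in for the real annotated comments.

```python
import random


def build_balanced_dataset(sarcastic, non_sarcastic, seed=0):
    """Undersample the majority class so both classes have equal size.

    Assumption: uniform random sampling without replacement. The
    thesis abstract only says the non-sarcastic texts "are sampled".
    """
    if len(non_sarcastic) < len(sarcastic):
        raise ValueError("expected non-sarcastic texts to be the majority class")
    rng = random.Random(seed)
    sampled = rng.sample(non_sarcastic, k=len(sarcastic))
    # Label 1 = sarcastic, label 0 = non-sarcastic.
    data = [(text, 1) for text in sarcastic] + [(text, 0) for text in sampled]
    rng.shuffle(data)
    return data


# Placeholder strings with the class sizes reported in the abstract.
sarcastic = [f"sarcastic comment {i}" for i in range(2486)]
non_sarcastic = [f"plain comment {i}" for i in range(89296)]

balanced = build_balanced_dataset(sarcastic, non_sarcastic)
print(len(balanced))  # 4972 examples, 2486 per class
```

Undersampling discards most of the non-sarcastic texts, but it keeps accuracy and F1 on the resulting dataset from being dominated by the majority class.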
Keywords/Search Tags: sarcasm recognition, sarcasm text dataset, adversarial example, pre-trained language model