
A Pattern-based Approach For Chinese Irony And Sarcasm Detection

Posted on: 2021-03-04  Degree: Master  Type: Thesis
Country: China  Candidate: L J Yan  Full Text: PDF
GTID: 2428330626459488  Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
This dissertation reports on a study in which a pattern-based approach is proposed for the automated detection of irony and sarcasm. The study has two main aims: to verify whether a set of features reflecting the distinguishing characteristics of irony and sarcasm can be designed from linguistic theories of irony and sarcasm, and to measure the effect of these linguistic features on the performance of trained models when they are applied to automated irony and sarcasm detection.

To achieve these aims, the study first designed a set of linguistic features based on linguistic theories of irony: a pattern-based feature, a sentiment feature and a degree-word feature. Second, a set of statistical features was chosen as a contrast to the linguistic features. Third, values of both kinds of features were extracted from two corpora, one balanced and one imbalanced. The feature values from each corpus were then fed to an SVM learning algorithm in three configurations: linguistic features only, statistical features only, and all features combined. This produced six trained models (three feature configurations for each of the two corpora), each of which is essentially a tool for detecting irony and sarcasm. Finally, conclusions were drawn from the performance of these models.

The major findings of this study are as follows:
1) The linguistic features designed in this study significantly improve model performance.
2) Models trained with all features perform better than the first kind of baseline, namely the results reported in previous work.
3) Models trained with linguistic features alone and with statistical features alone each perform better than the "choose the most" baseline. This shows that the statistical features frequently used for irony and sarcasm detection in foreign languages are also useful for Chinese irony and sarcasm detection, though not as useful as the linguistic features.
4) The precision, recall and F1 values indicate that the models perform better on the balanced corpus than on the imbalanced corpus, which accords with findings in previous work.

The conclusions of this study are as follows:
1) Linguistic theories of irony can be successfully transformed into linguistic features, and these features are useful for training models that automatically distinguish ironic and sarcastic texts from non-ironic and non-sarcastic ones.
2) Linguistic features have a greater effect than statistical features. This indicates that linguistic features are promising at a time when statistical features dominate model training.
3) To make models perform better on an imbalanced corpus, one needs to adjust the threshold of the SVM algorithm.
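The abstract does not say how the SVM threshold was adjusted, so the following is only a minimal sketch of the general idea, not the author's procedure. It assumes each text has already received a signed SVM decision value (positive meaning "ironic or sarcastic"), and it shows how precision, recall and F1 change when the cut-off moves away from the default of 0 on an imbalanced sample. The names `precision_recall_f1`, `best_threshold`, `TOY_LABELS` and `TOY_SCORES` are hypothetical, and the scores are invented toy values rather than results from the thesis.

```python
def precision_recall_f1(labels, scores, threshold=0.0):
    """Label a text ironic (1) when its score exceeds `threshold`,
    then return (precision, recall, F1) for the ironic class."""
    tp = fp = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score > threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_threshold(labels, scores, candidates):
    """Return the candidate threshold with the highest F1."""
    return max(candidates,
               key=lambda t: precision_recall_f1(labels, scores, t)[2])


# Hypothetical imbalanced sample: 2 ironic texts out of 10.
TOY_LABELS = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Invented decision values; one ironic text falls just below 0.
TOY_SCORES = [0.8, -0.3, -0.2, -0.9, -1.2, -1.5, -0.7, -1.1, -2.0, -0.6]

if __name__ == "__main__":
    for t in (0.0, -0.35, -1.0):
        p, r, f1 = precision_recall_f1(TOY_LABELS, TOY_SCORES, t)
        print(f"threshold={t:+.2f} precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
    print("best:", best_threshold(TOY_LABELS, TOY_SCORES, [0.0, -0.35, -1.0]))
```

In this toy sample the default cut-off misses one of the two ironic texts, while a slightly lower cut-off recovers it at the cost of one false positive. In practice the threshold would be chosen on held-out data, not on the test set.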
Keywords/Search Tags:pattern-based approach, linguistic features, machine learning, irony and sarcasm detection