Research On The Construction Of Chinese Multimodal Argumentative Corpus

Posted on: 2022-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
Full Text: PDF
GTID: 2518306743487294
Subject: Computer application technology
Abstract/Summary:
Argument mining is an important task in natural language processing and a branch of affective computing. Its goal is to automatically extract arguments from unstructured text documents in order to provide structured data for machine learning and deep learning models. It has recently become a hot topic because of its potential to process information from the web in innovative ways, especially information from social media. Annotated corpora for training supervised learning algorithms are severely lacking in the field of argument mining, so this thesis creates a reliable annotated corpus of Chinese argument structures. Because affective computing and argument mining are closely connected, the dataset is designed so that it can also be used for sentiment classification while remaining oriented to argument mining. The main contents of this thesis are as follows.

(1) To address the current scarcity of Chinese argument corpora, an annotation system covering argument elements, sentiment polarity, and sentiment categories is proposed. The debate program "QI PA SHUO" is selected as the source text because of its strong logical structure, rich argumentative information, and long-term research value. The annotation system is built on argument components, sentiment polarity, sentiment categories, and rhetorical techniques. A pilot annotation is carried out with the resulting scheme, and a consistency test on the results shows that the scheme is feasible.

(2) To meet the demand for multimodal data in Chinese argument mining, and given the scarcity of Chinese multimodal corpora, a construction method for a Chinese multimodal argument corpus is proposed. When a corpus and its research questions involve several modalities, such as video and audio signals, the data are treated as multimodal signals. Because audio and video signals are received together with the text information, the audio and video are cut, spliced, and transcribed with audio and video processing tools. The quality evaluation of the resulting text supports the effectiveness of this method.

(3) A quality assessment of the Chinese multimodal argument corpus is conducted. Annotation quality is evaluated with four consistency testing algorithms, and the results show a high level of consistency among the annotators. Agreement is above 0.7 for all tags except the rhetorical tags, which are above 0.6, indicating that the annotation quality is trustworthy.

The significance of this project is that it builds the first Chinese multimodal argumentative corpus and ensures the quality of the corpus through a quality assessment that shows high annotation consistency. It also adds multimodal data to the text corpus, including the audio and video corresponding to the Chinese text, to support progress in multimodal research on Chinese corpora.
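The consistency check described in (3) can be illustrated with a small sketch. The abstract does not name the four consistency testing algorithms, so the example below uses Cohen's kappa, a common chance-corrected agreement measure for two annotators, purely as an assumption. The function name `cohen_kappa`, the tag names, and the two label sequences are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: this is one common agreement measure; the thesis does
    not state which four consistency algorithms it actually used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    all_labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in all_labels) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical argument-component tags from two annotators (illustrative only).
annotator_1 = ["claim", "premise", "claim", "other", "premise",
               "claim", "premise", "other", "claim", "premise"]
annotator_2 = ["claim", "premise", "claim", "other", "claim",
               "claim", "premise", "other", "claim", "other"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

In this sketch the score is computed separately per tag type (argument component, sentiment polarity, sentiment category, rhetorical technique), which is how thresholds such as 0.7 and 0.6 would be compared across tag types. Measures for more than two annotators, such as Fleiss' kappa, follow the same observed-versus-expected pattern.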
Keywords/Search Tags:argumentation mining, multimodality, emotion polarity, emotion classification