| With the rapid development of Internet technology, online forums have become the main medium for people to argue and debate. Users hold different positions on a topic or event and debate them online, which has produced a large number of argumentative texts such as opinion comments. Automatically extracting argumentation structure from unstructured text is an urgent problem: making good use of these argumentative texts has great commercial value but also poses major challenges. This thesis studies claim identification and fine-grained argumentation component recognition in the field of argument mining. The main contents are as follows: 1. Claim identification in argument mining. Most current claim identification methods construct features for a specific domain and apply machine learning methods for recognition. A claim identification model built for one domain cannot be directly applied to other domains. Moreover, for most people the concept of an argument is difficult to capture with a compact set of definitions and clear rules, so manually constructing features to identify claims is complex and time-consuming. This thesis proposes a BERT-BiLSTM-Attention claim identification method. It uses the BERT pre-trained language model, then a BiLSTM to extract higher-level features that better incorporate context, and an attention mechanism to highlight important features. The method achieves good results in claim identification, and a further analysis examines which features matter most for claim identification. 2. Fine-grained argument component identification in argument mining. Most current argument component identification methods use the entire sentence as the argument unit, but this breaks down when one argument spans two sentences or one sentence contains multiple arguments. To resolve these problems, we treat fine-grained argumentation component recognition as both a sequence labeling task and a classification task. We propose a fine-grained argument component recognition method based on BERT-BiLSTM-CRF: sentences are input into the neural network, which not only labels the token sequence of each sentence to automatically identify the boundaries of argumentation components, but also uses multi-class classification to divide sentences into claims, evidence, and irrelevant sentences. 3. Implementation of a prototype system for debate argument component identification. To show argumentation component identification in an argument mining system more clearly, this thesis implements a prototype system for demonstration based on the two preceding pieces of work. The prototype system consists of a data collection module, a claim identification module, a fine-grained argument component identification module, and a user display interface. Users can choose between two modes: one directly displays the argumentation components identified in crawled data, and the other automatically identifies the claims and evidence in an article that the user inputs into the system, helping users better understand the core point of view expressed in the article. |
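
To make the sequence-labeling view of fine-grained component recognition more concrete, the following minimal sketch shows how per-token labels could be turned into component boundaries. It assumes a BIO tagging scheme with the label names `Claim` and `Evidence`; the abstract does not state the actual tag set, so the scheme, the label names, and the `decode_bio` helper are illustrative assumptions rather than the thesis's implementation. The tags would come from a tagger such as the BERT-BiLSTM-CRF model, which is not reproduced here.

```python
from typing import List, Optional, Tuple

# (component type, start token index, end token index, end exclusive)
Span = Tuple[str, int, int]


def decode_bio(tags: List[str]) -> List[Span]:
    """Convert per-token BIO tags (assumed scheme) into component spans."""
    spans: List[Span] = []
    current_type: Optional[str] = None
    start = 0
    for i, tag in enumerate(tags):
        # The open span continues only on an "I-" tag of the same type.
        continues = tag.startswith("I-") and tag[2:] == current_type
        if current_type is not None and not continues:
            spans.append((current_type, start, i))
            current_type = None
        if tag.startswith("B-") or (tag.startswith("I-") and current_type is None):
            # A stray "I-" is treated leniently as the start of a new span.
            current_type, start = tag[2:], i
    if current_type is not None:
        spans.append((current_type, start, len(tags)))
    return spans


# Hypothetical example: one sentence containing two components.
tokens = ["Smoking", "should", "be", "banned", "because",
          "it", "harms", "bystanders"]
tags = ["B-Claim", "I-Claim", "I-Claim", "I-Claim", "O",
        "B-Evidence", "I-Evidence", "I-Evidence"]

for kind, begin, end in decode_bio(tags):
    print(kind, " ".join(tokens[begin:end]))
```

Because boundaries are recovered at the token level, a single sentence can yield several components, which is the case that sentence-level argument units cannot represent.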