| Event extraction is widely used to pull structured information out of unstructured text, so that the key content people care about can be extracted efficiently from massive text collections. However, most event extraction research focuses on sentence-level extraction, and the more complex and difficult problem of document-level event extraction has received less in-depth study. Document-level event argument extraction is an important subtask of document-level event extraction; it aims to extract all arguments involved in an event. Previous work on document-level event argument extraction makes insufficient use of known information and neglects the correlation between span boundaries. To address these problems, this paper proposes a new model structure and verifies its effectiveness through comparative and ablation experiments. The main contributions are the following three points. (1) This paper proposes a document-level event argument extraction model guided by role information. To address the insufficient use of known information such as the event type, event trigger word, and event roles, this paper proposes a new method. In the data preprocessing stage, the event type and the event roles belonging to the event are concatenated with the original text as additional information, so that this information can be modeled jointly with the original text. In the middle layer of the model, the event type and trigger word vectors are fused into the event role representations, increasing the amount of usable information the roles carry. In the prediction layer, the fused event roles are used to identify argument boundaries in the original text. In comparison experiments with previous work, the model is second only to the current state-of-the-art (SOTA) results. In ablation experiments, this paper demonstrates 
the rationality and effectiveness of the above methods. (2) This paper proposes a document-level event argument extraction model based on bidirectional span detection. To model argument span boundaries effectively, this paper designs a new model structure with a forward decoder and a backward decoder, so that each argument span is detected from both its start and its end boundary. To establish the correlation between the two boundaries, the forward decoder first identifies the start boundary of an argument according to the event role and then uses that start boundary to detect the end boundary; the backward decoder does the reverse, identifying the end boundary first and then using it to detect the start boundary. Finally, the identified argument information is passed along to help detect the remaining arguments. In comparison experiments with previous work, the model with BERT-base as the encoder achieves an F1 score 0.7% higher than the current SOTA results. Ablation experiments demonstrate the rationality and effectiveness of this model structure. (3) This paper uses event detection as an auxiliary task for document-level event argument extraction. In an event, the trigger word is the key term that reflects the relationships among the arguments, and it is the core that ties the whole event together. Event detection, another subtask of event extraction, requires identifying the trigger words in the text and determining the event type each one triggers, so it can help the model better understand the complete composition of an event. Inspired by multi-task learning, this paper adds event detection as an auxiliary task to the two models above. Experiments show that the auxiliary task helps the model better understand the target task and improves document-level event argument extraction. In ablation experiments, the auxiliary task brings F1 improvements of 1.7% and 1.9% for the two models, respectively. |
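
The preprocessing step described in contribution (1), where the event type and event roles are concatenated with the original text, could look roughly like the sketch below. It is illustrative only: the function name `build_role_guided_input`, the `[CLS]`/`[SEP]` markers, the ordering of the segments, and the example event are all assumptions, since the abstract does not specify the exact input layout.

```python
from typing import Dict, List, Tuple

# Hypothetical marker tokens; the source does not say which separators are used.
CLS, SEP = "[CLS]", "[SEP]"


def build_role_guided_input(
    event_type: str,
    roles: List[str],
    doc_tokens: List[str],
) -> Tuple[List[str], Dict[str, int], int]:
    """Concatenate the event type, role names, and document tokens.

    Returns the combined token sequence, the index of each role token in
    that sequence, and the index at which the document tokens begin.
    """
    tokens = [CLS, event_type, SEP]
    role_positions: Dict[str, int] = {}
    for role in roles:
        role_positions[role] = len(tokens)  # index where this role token lands
        tokens.extend([role, SEP])
    doc_offset = len(tokens)  # first document token sits here
    tokens.extend(doc_tokens)
    tokens.append(SEP)
    return tokens, role_positions, doc_offset


# Made-up example event, for illustration only.
tokens, role_positions, doc_offset = build_role_guided_input(
    "Attack", ["Attacker", "Target"], ["Rebels", "shelled", "the", "town"]
)
```

Recording `role_positions` and `doc_offset` is a design choice of this sketch: it would let a later layer look up the encoded role vectors and map predicted boundaries back to positions in the original text.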