| The outbreak of the COVID-19 pandemic has had a tremendous impact on global society, the economy, and health. Because of its high infectivity and mortality rates, COVID-19 has been a focal point for global media, governments, medical institutions, academia, and the public since 2020. It is therefore crucial to extract and understand COVID-19-related information quickly and accurately. Applying event extraction technology to COVID-19 news enables relevant information to be extracted automatically from a large volume of news reports, which assists in real-time monitoring of pandemic trends. Event extraction also captures information such as the time, location, and number of confirmed cases from COVID-19 news reports, which facilitates analysis of the spread patterns of the pandemic, prediction of its progression, and scientific decision-making support for governments and the public. This paper studies document-level event extraction in the field of COVID-19 news, and its main contributions are as follows. First, to address the limited prior research and the difficulty of dataset construction in this field, this paper constructs a COVID-19 news dataset and proposes a lightweight document-level event extraction model. The model uses co-reference information to obtain richer document-level entity representations and uses core roles to reduce the impact of error propagation. Experimental results demonstrate that this model outperforms the baseline models on both the COVID-19 news dataset and the Chinese financial domain dataset (ChiFinAnn). Second, this paper proposes a Graph-based Heterogeneous Interaction Event Extraction (GPAIT) model that combines positional embeddings with attention matrices. The model strengthens the contextual semantic relationships between entities by constructing an entity attention relation matrix and applying joint filtering with a heterogeneous graph. It also introduces a Graph Convolutional Network (GCN) with fused positional embeddings to better capture sequential semantic relationships. Experimental results show that the GPAIT model achieves better precision in argument extraction tasks on both the Chinese financial dataset and the COVID-19 news dataset. Finally, this paper designs and builds an event argument annotation system with modules for project creation, event operations, file integration, article operations, and annotation. The system lets users access article information and annotate datasets from different domains. In conclusion, the findings of this paper have significant implications for document-level event extraction in COVID-19 news. By constructing a dataset and proposing a lightweight model and a graph-based interaction model for event extraction, this paper provides methods for automatically extracting relevant information from a large volume of COVID-19 news reports, analyzing the spread patterns of the pandemic, and supporting decision-making. |