
Research on Document-Level Event Extraction Method Based on Prompt Learning

Posted on: 2024-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: H M Wang
GTID: 2568306941984539
Subject: Cyberspace security

Abstract/Summary:
Event extraction has significant application value in natural language processing, and research on it continues to drive progress and innovation in the field. Its main task is to automatically extract the entities, event types, and semantic relations associated with specific events from unstructured text and to convert these elements into a structured form for subsequent processing and analysis. Event extraction can greatly improve the efficiency with which people obtain useful information from massive amounts of text, supporting practical applications such as information retrieval, market intelligence, and public opinion monitoring.

With the rise of pre-trained models, the mainstream deep-learning approach is to fine-tune a pre-trained language model for the downstream task. This approach suffers from a mismatch between the pre-training and downstream objectives, so the knowledge acquired during pre-training cannot be fully exploited, and the model adapts poorly when only a small number of training samples is available. In addition, most existing studies focus on event extraction within a single sentence, which leads to incomplete event information and missing arguments. Furthermore, most extraction methods determine event types by recognizing and classifying trigger words, an approach that is not universally applicable and that increases the difficulty of dataset annotation.

To address these problems, this thesis proposes a document-level event extraction method based on prompt learning, comprising the following two lines of research.

(1) A trigger-word-free document-level event extraction method based on prompt learning. Widely studied sentence-level methods struggle to extract complete event arguments that are scattered across different sentences of a document. This study therefore proposes a joint document-level event extraction model that raises the granularity of extraction from the sentence level to the document level, captures event information in the document more comprehensively, and produces more accurate and complete extraction results, giving it high practicality and application value. The model first identifies entity types in the document with a named entity recognition module based on multi-head attention, improving recognition accuracy and efficiency. Then, without relying on trigger words, it locates the central event sentence of the document by measuring the importance of different argument roles to the event and uses it to determine the event type. Finally, following the idea of prompt learning, an event template is defined for each event type and argument extraction is reformulated as conditional text generation. This allows multiple arguments to be extracted simultaneously, helps the model learn the interactions and constraints among argument roles, introduces additional prior knowledge, and fully exploits the pre-trained language model, so the model generalizes well even when training data are scarce.
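To make the template-filling formulation concrete, the following is a minimal sketch of casting argument extraction as conditional text generation with an encoder-decoder language model. The backbone checkpoint, the event type, and the template wording are illustrative assumptions, not the thesis's actual configuration.

```python
# Sketch: event argument extraction as conditional text generation.
# Assumes a BART-style encoder-decoder model; names below are placeholders.
import torch
from transformers import BartForConditionalGeneration, BartTokenizerFast

MODEL_NAME = "facebook/bart-base"  # hypothetical backbone choice
tokenizer = BartTokenizerFast.from_pretrained(MODEL_NAME)
model = BartForConditionalGeneration.from_pretrained(MODEL_NAME)

# One template per event type; "<arg>" marks the slots to be filled.
EVENT_TEMPLATES = {
    "EquityPledge": "<arg> pledged <arg> shares to <arg> on <arg>.",
}

def extract_arguments(document: str, event_type: str) -> str:
    """Condition generation on the event template plus the document and
    return the filled-in template, from which arguments can be parsed."""
    template = EVENT_TEMPLATES[event_type]
    inputs = tokenizer(template + " context: " + document,
                       return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_length=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

During training, the target sequence would be the template with the "<arg>" slots replaced by the gold arguments, so the decoder fills all argument roles jointly and can exploit the constraints among them.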
(2) An automatic event-template generation strategy based on deep learning. The prompt template is crucial to prompt learning, and its quality strongly affects model performance. Although manually constructed templates are relatively fluent and understandable, constructing and validating them by hand is time-consuming and labor-intensive, and even domain experts may not arrive at the optimal template. To address this, an event prompt encoder is introduced so that the model automatically learns deeper knowledge representations and produces event templates composed of continuous embedding vectors rather than natural language; through the learning ability of the neural network, this strategy optimizes the prompt representation over the course of training. Experiments show that replacing manually designed natural-language templates with continuous embedding vectors greatly reduces the cost of template construction while maintaining the same or even better performance.
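The following is a minimal sketch of such a continuous-prompt encoder in the style of soft prompting. The prompt length, hidden sizes, and the reparameterizing MLP are assumptions for illustration, not the thesis's exact architecture.

```python
# Sketch: an event prompt encoder that replaces natural-language templates
# with learnable continuous embeddings (soft-prompt style). Sizes and the
# reparameterizing MLP are illustrative assumptions.
import torch
import torch.nn as nn

class EventPromptEncoder(nn.Module):
    def __init__(self, prompt_length: int, embed_dim: int, hidden_dim: int = 512):
        super().__init__()
        # Trainable pseudo-token embeddings, one row per prompt position.
        self.prompt_embeddings = nn.Parameter(torch.randn(prompt_length, embed_dim))
        # Small MLP that reparameterizes the prompt to stabilize optimization.
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        """Prepend the learned continuous prompt to a batch of token
        embeddings of shape (batch, seq_len, embed_dim) before the LM."""
        batch_size = input_embeds.size(0)
        prompt = self.mlp(self.prompt_embeddings)                # (P, D)
        prompt = prompt.unsqueeze(0).expand(batch_size, -1, -1)  # (B, P, D)
        return torch.cat([prompt, input_embeds], dim=1)          # (B, P+S, D)
```

The prompt parameters are updated by back-propagation together with the rest of the model, so the "template" is optimized directly for the extraction objective rather than written and tuned by hand.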
Keywords/Search Tags:document-level event extraction, prompt learning, pre-trained language model, no trigger word, joint learning