Research On Event Extraction Technology For Large-scale Unstructured Text

Posted on: 2021-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Kan
GTID: 2518306548495854
Subject: Computer Science and Technology
Abstract/Summary:
Event extraction is an important subfield of information extraction. Its task is to extract structured event information from natural language text. Since the end of the last century, researchers have proposed many methods and models for the event extraction task and achieved good results for their time. However, there is still considerable room for improvement in the accuracy of event extraction. In addition, as event extraction models grow more complex and large-scale texts need to be processed, industry hopes to improve the efficiency of event extraction. This thesis studies the key technologies of event extraction for large-scale English text, proposes two event extraction methods, one based on a structured self-attention mechanism and one based on a dilated gated convolutional neural network, and implements a parallel event extraction system. The main work and innovations of this thesis are as follows.

First, to alleviate the lack of semantic information and of latent connections between words in existing work, an event extraction method based on a structured self-attention mechanism is proposed. Following the definitions of trigger and event argument, the method merges several kinds of information into the distributed representation of words. By constructing a structured self-attention mechanism, the method encodes prior relationships between words and then automatically learns the latent associations between words in the samples of the data set. This enriches the context information in the word features and improves the performance of event element extraction. Comparative experiments on the ACE2005 event corpus show that the accuracy of the method is higher than that of the other methods compared.

Second, for application scenarios in which event extraction runs on small computing devices, a lightweight event extraction method with fewer parameters is proposed. The method builds a multi-class classifier over high-dimensional word features based on a dilated gated convolutional neural network. It uses data augmentation and label weighting to alleviate the uneven sample distribution in the training corpus and to strengthen learning on minority-class labels. Experiments on the ACE2005 corpus show that the accuracy of event argument extraction with this method is higher than that of similar methods.

Finally, to improve the efficiency of event extraction, this thesis designs and implements a parallel event extraction system. To address memory overflow and process waiting, the model is partitioned appropriately and a dynamic subtask scheduling strategy is proposed. Experiments show that the parallel event extraction system effectively improves the efficiency of event extraction.
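The label weighting mentioned for the lightweight method can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the inverse-frequency rule used here (weight = N / (K * count)) is an assumption, and the function name and toy labels are hypothetical rather than taken from the thesis.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {label: weight} with weight = N / (K * count).

    N is the number of samples, K the number of distinct labels.
    Assumed rule for illustration; not the thesis's stated formula.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {
        label: total / (num_classes * count)
        for label, count in counts.items()
    }


# Hypothetical toy data: most tokens carry no argument role ("O").
labels = ["O"] * 8 + ["Attacker"] + ["Place"]
weights = inverse_frequency_weights(labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")
```

Under this rule, rare labels receive larger weights than the dominant "O" label, so their errors contribute more to a weighted loss, while the weights summed over all samples still equal the sample count.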
Keywords/Search Tags:Event extraction, Structured self-attention mechanism, Dilated Convolutional Neural Network, Parallel processing