
Research On BERT-Based Open-Domain Event Extraction Method

Posted on: 2024-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
Full Text: PDF
GTID: 2568307106968019
Subject: Computer technology
Abstract/Summary:
Open-domain event extraction refers to extracting event information without predefined event types. The task is usually approached with pre-training-based or neural-topic-model-based methods, but existing approaches have several problems. First, existing pre-trained models extract feature vectors inadequately, produce high-dimensional embeddings, and suffer from parameter redundancy. Second, existing methods lack semantic richness and syntactic structure information, which results in poorly readable output and limited extraction accuracy. To address these challenges, this thesis first improves a BERT-based neural topic model for open-domain event extraction, and then dynamically integrates semantic information with dependency-syntax information to obtain rich semantic and syntactic features that further improve model performance. The main research contents are as follows:

(1) An improved BERT-based neural topic model is proposed. First, BERT is used at the encoding layer to obtain contextual representations of the feature sequences. Second, UMAP dimensionality reduction is applied to retain more local and global information, and the reduced features are combined with a deep latent-variable probabilistic graphical model to obtain the joint distribution of the variables after further optimizing the parametric inference learning process. Finally, to mitigate the effect of noisy data, a self-attention mechanism assigns weights to different node information, making the model focus on the most informative features and further improving open-domain event extraction performance. (A simplified sketch of this pipeline appears below.)

(2) An improved open-domain event extraction method based on the dynamic integration of semantics and dependency syntax is proposed. On the one hand, BERT's final-layer features are fed into a bidirectional long short-term memory network (Bi-LSTM) to obtain rich semantic information. On the other hand, to avoid mutual interference between syntactic and semantic information, BERT's intermediate-layer features are passed through a Bi-LSTM to obtain semantic features, and dependency-syntax information is introduced: the Stanford CoreNLP tool parses sentences into dependency graphs, and a dependency-attention-aware graph convolutional network (GCN) increases each node's attention to the relevant features in the graph. Finally, a gating mechanism dynamically fuses the semantic information and the semantically enhanced dependency-syntactic information obtained from the two channels, weighting them to produce a rich and accurate fused feature representation that improves the model's extraction capability. (Sketches of the dependency-graph construction and the gated fusion appear below.)

Finally, to verify the effectiveness of these methods on the open-domain event extraction task, experiments were conducted on the GNBusiness dataset. The results demonstrate that both methods effectively improve open-domain event extraction performance.
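For a concrete picture of contribution (1), the following is a minimal sketch assuming the HuggingFace transformers and umap-learn packages; the model checkpoint, the 32-dimension target, and the FeatureSelfAttention module are illustrative assumptions, not the thesis's published implementation.

```python
import torch
import torch.nn as nn
import umap
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

def encode(sentences):
    """Contextual sentence embeddings: BERT final-layer [CLS] vectors."""
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0]            # (batch, 768)

def reduce_dim(embeddings, dim=32):
    """UMAP keeps local and global neighborhood structure while
    shrinking the high-dimensional BERT embedding."""
    reducer = umap.UMAP(n_components=dim, metric="cosine")
    return torch.tensor(reducer.fit_transform(embeddings.numpy()),
                        dtype=torch.float32)

class FeatureSelfAttention(nn.Module):
    """Assumed module: self-attention that re-weights node features so
    the topic model focuses on the most informative ones despite noise."""
    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                          # x: (batch, nodes, dim)
        weighted, _ = self.attn(x, x, x)
        return weighted
```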
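Contribution (2) builds a graph from dependency parses. The thesis names the Stanford CoreNLP tool; the sketch below substitutes Stanza, the Stanford NLP Group's Python library, purely to illustrate how a parse can be turned into the adjacency matrix a GCN consumes. The function name and the symmetric-with-self-loops format are assumptions.

```python
import stanza
import torch

# stanza.download("en")  # fetch the English models on first run
nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,lemma,depparse")

def dependency_adjacency(text):
    """Parse one sentence and return its dependency arcs as a symmetric
    adjacency matrix with self-loops, a common input format for GCNs."""
    sent = nlp(text).sentences[0]
    n = len(sent.words)
    adj = torch.eye(n)                      # self-loops
    for word in sent.words:
        if word.head > 0:                   # head index 0 marks the root
            i, j = word.id - 1, word.head - 1
            adj[i, j] = adj[j, i] = 1.0     # treat the arc as undirected
    return adj

adj = dependency_adjacency("The company announced a merger on Monday.")
```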
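The gated fusion of the semantic (Bi-LSTM) channel and the syntax-enhanced (GCN) channel might look like the following sketch; the plain GCN layer here stands in for the thesis's dependency-attention-aware GCN, and all dimensions and names are assumed.

```python
import torch
import torch.nn as nn

class DepGCNLayer(nn.Module):
    """Plain graph convolution over the dependency adjacency matrix
    (a simplified stand-in for the dependency-attention-aware GCN)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (batch, seq, dim); adj: (batch, seq, seq) dependency arcs
        deg = adj.sum(-1, keepdim=True).clamp(min=1)  # degree normalization
        return torch.relu(self.proj(adj @ h) / deg)

class SemSynFusion(nn.Module):
    """Bi-LSTM semantic channel plus GCN syntactic channel, merged by a
    sigmoid gate that weights the two sources token by token."""
    def __init__(self, dim=768):
        super().__init__()
        self.bilstm = nn.LSTM(dim, dim // 2, bidirectional=True,
                              batch_first=True)
        self.gcn = DepGCNLayer(dim)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, bert_hidden, adj):
        h_sem, _ = self.bilstm(bert_hidden)           # semantic features
        h_syn = self.gcn(h_sem, adj)                  # syntax-enhanced features
        g = torch.sigmoid(self.gate(torch.cat([h_sem, h_syn], dim=-1)))
        return g * h_sem + (1 - g) * h_syn            # dynamic gated fusion
```

The gate lets the model lean on syntactic features where the parse is informative and fall back to purely semantic features elsewhere, which matches the abstract's goal of avoiding mutual interference between the two information sources.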
Keywords/Search Tags: Event Extraction, Open-domain Event Extraction, Neural Topic Model, Semantic Dependency Syntax, Self-attention