
Research On Listed Company Announcement Document-level Event Extraction

Posted on: 2022-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: W C Wang
Full Text: PDF
GTID: 2518306569997449
Subject: Computer technology
Abstract/Summary:
In recent years, China's securities market and the Internet have continued to grow, and the number of announcements that listed companies disclose online has increased year by year. Announcements serve as a basis for investment decisions in the securities market and as a source of information for government departments conducting compliance checks on listed companies. Because announcements are time-sensitive and manual reading is inefficient, it is important to study event extraction models for announcement texts. Most research on event extraction focuses on sentence-level models that use trigger words as the ontology of an event instance. With growing demand for event extraction in the financial announcement domain, several announcement datasets and methods have been proposed. Announcement data provides a complex document-level semantic environment, so it is important to study how to make full use of this information for event extraction.

Announcement event extraction can be divided into two parts: named entity recognition for event elements, and event category classification and event role classification for the event instance ontology.

Entity recognition for announcement event elements is characterized by its document-level context: the same entity can appear in multiple sentences. It is therefore reasonable to maximize the context range of the model input in order to fully mine this information. The BERT network is strong at modeling sequences, and the LSTM network can process sequences regardless of their length. This thesis therefore proposes a BERT-LSTM network that uses BERT to model features within a sentence and LSTM to model features across sentences. To further improve performance, this thesis uses a multi-task learning method to strengthen the model. The experimental results show that the multi-task BERT-LSTM network performs better than the other models compared.

The event extraction task includes event category classification and event role classification. For event category classification, since different event categories need to focus on different sentences, this thesis models multi-label classification with an attention mechanism from each event category to the sentences. For event role classification, since the announcement data is not labeled with trigger words, this thesis adopts a sequence-generation-style method that generates the event structure from entities. This method requires the model to mine the interactions between entities, so this thesis proposes directly modeling a correlation score between entities to boost performance. To further improve the event role classification model, this thesis proposes integrating the correlation score into the model's memory embedding. The experimental results show that modeling and integrating the correlation between entities improves the performance of event role classification.
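The category-to-sentence attention described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `classify_event_categories`, the category names, the toy two-dimensional sentence vectors, and all weights are assumptions chosen for illustration. In practice the sentence vectors would come from a trained encoder. The sketch only shows the general idea that each event category attends to the sentences with its own query and then receives an independent sigmoid score, so a document can receive several labels.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def classify_event_categories(sentence_vecs, queries, weights, biases, threshold=0.5):
    """Multi-label classification with one attention distribution per category.

    Returns (labels, details), where details[name] = (probability, attention).
    All parameters are hypothetical stand-ins for learned values.
    """
    dim = len(sentence_vecs[0])
    details = {}
    for name, query in queries.items():
        # Category-specific attention over the sentences of the document.
        attention = softmax([dot(query, s) for s in sentence_vecs])
        # Attention-weighted document representation for this category.
        doc_vec = [
            sum(a * s[i] for a, s in zip(attention, sentence_vecs))
            for i in range(dim)
        ]
        # Independent sigmoid per category, so labels are not mutually exclusive.
        prob = sigmoid(dot(weights[name], doc_vec) + biases[name])
        details[name] = (prob, attention)
    labels = [name for name, (prob, _) in details.items() if prob >= threshold]
    return labels, details


# Toy inputs (assumed values, not data from the thesis).
sentences = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
queries = {"EquityPledge": [4.0, 0.0], "EquityFreeze": [0.0, 4.0]}
weights = {"EquityPledge": [3.0, -3.0], "EquityFreeze": [-3.0, 3.0]}
biases = {"EquityPledge": 0.0, "EquityFreeze": 0.0}

labels, details = classify_event_categories(sentences, queries, weights, biases)
print(labels)
for name, (prob, attention) in details.items():
    print(name, round(prob, 3), [round(a, 3) for a in attention])
```

With these toy inputs, each category places most of its attention on a different sentence, which mirrors the motivation stated in the abstract that different event categories need to focus on different sentences.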
Keywords/Search Tags: document-level event extraction, named entity recognition, multi-task learning, correlation of entities