
Research On Information Extraction Algorithm For Marine Economic Industry Text Data

Posted on: 2022-12-06, Degree: Master, Type: Thesis
Country: China, Candidate: J W Ma, Full Text: PDF
GTID: 2518306779495844, Subject: Economic Reform
Abstract/Summary:
Named entity recognition and relation extraction, the basic tasks of information extraction, can supply knowledge bases with accurate data at scale. A knowledge base in turn lays a foundation for the development of related industries and makes it possible to analyze and summarize those industries, supporting their sustainable and rapid development. Research on information extraction, however, still has open problems. In named entity recognition, two questions remain: how to make effective use of the structural features of individual characters and words as input, and how to assign different weights to words according to their importance. In relation extraction, traditional methods rely on dependency parsing tools to obtain the structural dependency features of sentences, and errors made by those tools propagate into the extraction results. To address these problems, this paper constructs two new marine economic industry datasets and studies two models, which are verified in the marine economic industry domain. The research consists of two main parts.

1. To address the named entity recognition problems, this paper proposes a Global Attention Grid Transformer Entity Recognition Model (GALT-NER). The model first produces word vectors through a word embedding layer and obtains context feature vectors through a Bi-GRU layer. These are then fed to a global attention layer and a Transformer encoder layer to extract feature vectors, and finally the two vectors are concatenated and passed to a CRF layer for entity label classification.

2. To address the error propagation problem in relation extraction, this paper proposes an Ordered Long Short-Term Memory-Multi-Head Attention Mechanism Relation Extraction Model (OL-MAM-RE). The model first produces word vectors through a word embedding layer, then passes them to a Bi-LSTM network, which outputs vectors containing contextual features. These are fed to an On-LSTM layer, which outputs feature vectors containing dependency structure relations, and then to a multi-head attention mechanism to obtain the word feature vectors. Finally, the feature vectors are passed to a fully connected layer for dimensionality reduction, yielding the probability of each possible relation label between entities.

Both models are tested on the marine economic industry data and on public data. The experimental results show that both models are effective: GALT-NER can effectively use word structure to enrich the diversity of vectors, and OL-MAM-RE, based on On-LSTM, can effectively obtain feature vectors of the dependency structure between words in a sentence. Future research could study joint entity and relation extraction in the marine economic industry domain.
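
As an illustration of how an On-LSTM layer can encode dependency-like structure without an external parser, the sketch below shows the cumulative-softmax ("cumax") gating used in the general ordered-neurons LSTM design. It is a minimal sketch in plain Python, not the thesis's implementation: the function names and the logit values are hypothetical, and how OL-MAM-RE configures its On-LSTM layer is not stated in the abstract.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cumax(xs):
    """Cumulative sum of softmax: a non-decreasing vector that ends at 1."""
    out, running = [], 0.0
    for p in softmax(xs):
        running += p
        out.append(running)
    return out

def master_gates(forget_logits, input_logits):
    """Return the master forget gate (rising toward 1) and the
    master input gate (falling toward 0) for one time step."""
    master_forget = cumax(forget_logits)
    master_input = [1.0 - v for v in cumax(input_logits)]
    return master_forget, master_input

# Hypothetical logits for a 4-neuron hidden state (illustrative values only).
f_gate, i_gate = master_gates([0.0, 2.0, 0.0, -1.0], [1.0, 0.0, 0.0, 3.0])

# Neurons where both master gates are active; in the ordered-neurons design
# only this overlap region is updated by the ordinary LSTM gates.
overlap = [f * i for f, i in zip(f_gate, i_gate)]
```

Because the two master gates are monotonic in opposite directions, the neurons are implicitly ordered: one end of the state vector tends to hold long-lived, higher-level information and the other end short-lived, local information. This ordering is what lets such a layer approximate hierarchical sentence structure from the text alone.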
Keywords/Search Tags: Knowledge Bases, Deep Learning, Named Entity Recognition, Relation Extraction