Named Entity Recognition In Cross Language And Cross Domain Situations

Posted on: 2021-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: T T Huang
GTID: 2518306245481894
Subject: Applied Statistics
Abstract/Summary:
Named entity recognition is a basic task in natural language processing. The quality of entity extraction directly limits the performance of downstream tasks such as relation extraction and event extraction. Traditional entity extraction mostly relies on dictionary rules and statistical machine learning methods, which cannot extract entities effectively in low-resource target languages and target domains. The emergence of cross-language word vectors, transfer learning and deep neural network models offers a new way to solve these problems.

The purpose of this thesis is to explore entity extraction in cross-language and cross-domain situations, that is, how to extract entities effectively in the target language and target domain when entity-labeled corpora are scarce and the text is noisy. In current research, transfer learning and deep learning have not been systematically applied to cross-language and cross-domain named entity recognition. In view of this, two frameworks are proposed: CL-NER (Cross Language-Named Entity Recognition), a cross-language named entity recognition framework based on tag transfer learning and deep learning, and CD-NER (Cross Domain-Named Entity Recognition), a cross-domain named entity recognition framework based on parameter transfer learning and deep learning. Specifically, the main work of this thesis includes the following two points:

1) CL-NER, a cross-language named entity recognition framework that integrates transfer learning and deep learning, is proposed. Cross-language entity extraction is divided into two sub-modules: cross-language label mapping and named entity recognition. Based on the idea of tag transfer learning, the labeled data of the source language is transferred to the target language through three methods: Cheap Translation, Lexicon Induction and Self-Learning. On this basis, three different named entity recognition models are built for the target language. In the cross-language label mapping module, the Self-Learning method achieves the best experimental results; in the named entity recognition module, the GRU-LSTM-CRF deep learning model achieves the best experimental results. When corpus resources for the target language are limited, the abundant annotated corpora of the source language, combined with cross-language transfer learning, can be used to improve named entity recognition in the target language.

2) CD-NER, a cross-domain named entity recognition framework that integrates transfer learning and deep learning, is proposed. Cross-domain entity extraction is divided into two sub-modules: named entity recognition and cross-domain parameter transfer. In the named entity recognition module, the LSTM-CRF deep learning model is used to extract entities in both the source domain and the target domain. Based on the idea of parameter transfer learning, parameters are transferred from the source domain to the target domain through multi-task learning and pre-training. In the cross-domain parameter transfer module, the BERT-FineTune method achieves the best experimental results. When the target domain contains a lot of text noise, cross-domain transfer learning, with the help of pre-trained models and labeled corpora from the source domain, can be used to improve named entity recognition in the target domain.
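
The abstract names Cheap Translation as one of the label mapping methods but does not describe how it is implemented. As a rough illustration of the general idea only (word-by-word dictionary translation in which each translated word inherits the tag of its source word), a minimal sketch might look like the following. The function name `cheap_translate`, the BIO tag scheme and the toy English-to-Spanish lexicon are all assumptions made for this example and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

# A tagged token is a (word, BIO tag) pair; the BIO scheme is an assumption.
TaggedToken = Tuple[str, str]


def cheap_translate(
    sentence: List[TaggedToken], lexicon: Dict[str, str]
) -> List[TaggedToken]:
    """Project source-language tags onto a word-by-word translation.

    Each source word is looked up in a bilingual lexicon (hypothetical here).
    Words missing from the lexicon are copied unchanged, which is a common
    fallback for names. If one source word maps to several target words,
    a B- tag is kept on the first word and turned into I- for the rest.
    """
    projected: List[TaggedToken] = []
    for word, tag in sentence:
        translation = lexicon.get(word.lower(), word)
        for i, piece in enumerate(translation.split()):
            if i > 0 and tag.startswith("B-"):
                projected.append((piece, "I-" + tag[2:]))
            else:
                projected.append((piece, tag))
    return projected


# Toy lexicon, invented for illustration only.
toy_lexicon = {"lives": "vive", "in": "en", "the": "el", "uk": "reino unido"}

source = [
    ("Maria", "B-PER"),
    ("lives", "O"),
    ("in", "O"),
    ("the", "O"),
    ("UK", "B-LOC"),
]

target = cheap_translate(source, toy_lexicon)
for word, tag in target:
    print(word, tag)
```

The projected sentences could then serve as training data for a target-language tagger. Word order differences and lexicon coverage are ignored in this sketch, which is one reason such projected data tends to be noisy.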
Keywords/Search Tags: Knowledge Acquisition, Entity Extraction, Cross Language, Cross Domain, Deep Learning