
Research On Cross-lingual Named Entity Recognition

Posted on: 2024-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: L K Zhou
Full Text: PDF
GTID: 2558307106968639
Subject: Computer Science and Technology
Abstract/Summary:
Named entity recognition (NER) is a fundamental technology in natural language processing; its purpose is to identify entities with specific meaning in text in order to support downstream research. In multilingual settings, deep learning models require abundant, high-quality labeled data to achieve good accuracy, but such resources are difficult to obtain for low-resource languages. In this context, cross-lingual named entity recognition, which transfers information from a high-resource language to a low-resource language to compensate for the lack of data, has become an important research direction. This thesis surveys existing methods for cross-lingual NER and summarizes current multilingual pre-trained models, then focuses on the following two contributions:

1. A cross-lingual NER method based on adversarial training and an attention mechanism. Although an mBERT-based cross-lingual NER model yields rich, language-general word vectors, its modeling of the dependencies between words still needs strengthening. The method exploits bilingual features through word-level adversarial training, reducing the distance between the two languages' vector spaces to alleviate the scarcity of corpora. At the same time, a dual attention mechanism is introduced so that the model attends to both structural and semantic information when producing the final word representations.

2. A cross-lingual NER method combining adversarial training with knowledge distillation, designed to overcome the catastrophic forgetting that pre-trained models such as BERT suffer when target-language labels are unavailable. A teacher model guides a student model to learn the cross-lingual NER task without supervision. Adversarial training is realized by adding noise during source-language training, and a temperature parameter adjusts how much information is passed from teacher to student. In this way the student model learns from both languages, reducing the risk of forgetting or overfitting.

Finally, extensive experiments show that the proposed models outperform the baseline models on the People's Daily 2004 and WikiAnn datasets, learn more cross-lingual information between different languages, and achieve better recognition performance.
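The two core mechanisms of the second method can be illustrated with a minimal, framework-free sketch. This is not the thesis's implementation; the function names and the FGM-style normalized perturbation are assumptions, and the temperature-scaled distillation loss follows the standard formulation in which a higher temperature softens the teacher's distribution so more inter-class information reaches the student:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution, exposing the teacher's
    # "dark knowledge" about non-argmax classes to the student.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) between temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl

def adversarial_perturb(embedding, grad, epsilon=1.0):
    # FGM-style adversarial noise (an assumed variant): move the word
    # embedding a step of size epsilon along the normalized gradient of
    # the loss, simulating the noise added during source-language training.
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(embedding)
    return [e + epsilon * g / norm for e, g in zip(embedding, grad)]
```

For example, with logits `[2.0, 1.0, 0.1]`, raising the temperature from 1 to 5 visibly flattens the softmax output, and the distillation loss is zero when teacher and student agree exactly.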
Keywords/Search Tags:Cross-lingual Named Entity Recognition, Knowledge Distillation, Adversarial Learning