Named entity recognition (NER) is an important task in natural language processing (NLP). It aims to identify named entities with specific meanings in a given text. Most early NER studies relied on manually formulated rules, which perform poorly in practice. Deep learning is the most recent approach to NER: it models the problem as a sequence labeling task and automatically learns features from the data. However, labeled NER data is very expensive to obtain, so the lack of high-quality labeled data remains the main bottleneck, restricting the development of NER in different domains and limiting the performance of trained models. To address these problems, this paper proposes two methods for domain named entity recognition that require neither a labeled corpus nor expert knowledge. The main contributions are as follows:

1. This paper proposes CPL, an iterative template-based semi-supervised method for the domain NER task. Given a small number of initial seed entities, the method iteratively extracts more entities from a large-scale corpus, learning new entities and extraction templates that are filtered through multiple constraints. To make full use of the information in the corpus and improve the efficiency of named entity recognition, the method also extracts parallel (coordinated) terms based on Hearst patterns. In line with the characteristics of domain entities, domain part-of-speech templates are used to improve the recognition rate of domain entities.

2. This paper proposes an NER model for noisy data, RLNER. The model has two modules: a label modifier and a label predictor. The label modifier corrects wrong labels through reinforcement learning and feeds the corrected labels to the label predictor. The label predictor makes sentence-level judgments and provides rewards to the label modifier. The two modules are trained jointly to optimize both label correction and label prediction.

3. Experimental results show that CPL can extract general-domain and domain-specific entities from large-scale unlabeled corpora, and that RLNER can effectively handle noise in the original data using a small amount of correctly labeled data. Compared with existing methods, RLNER achieves better performance on the NER task with noisy data.
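
The seed-based iterative extraction described for CPL can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `bootstrap` and `coordinated_terms`, the (left word, right word) template form, the `min_template_hits` threshold, and the single "such as" pattern are all assumptions made for illustration, and the multi-constraint filtering and part-of-speech templates are omitted.

```python
import re
from collections import Counter

# Hypothetical Hearst-style pattern: "such as A, B and C".
HEARST = re.compile(r"such as ((?:\w+, )*\w+(?: and \w+)?)")

def coordinated_terms(sentence):
    """Return the terms listed after 'such as', or [] if the pattern is absent."""
    match = HEARST.search(sentence)
    return re.split(r", | and ", match.group(1)) if match else []

def bootstrap(sentences, seeds, rounds=3, min_template_hits=2):
    """Grow an entity set from seeds (illustrative sketch, not CPL itself)."""
    entities = set(seeds)
    tokenized = [s.split() for s in sentences]
    for _ in range(rounds):
        # Step 1: learn (left word, right word) templates around known entities.
        hits = Counter()
        for toks in tokenized:
            for i in range(1, len(toks) - 1):
                if toks[i] in entities:
                    hits[(toks[i - 1], toks[i + 1])] += 1
        # Keep only templates seen often enough (a stand-in for constraint filtering).
        templates = {t for t, n in hits.items() if n >= min_template_hits}

        # Step 2: apply the templates to collect candidate entities.
        found = set()
        for toks in tokenized:
            for i in range(1, len(toks) - 1):
                if (toks[i - 1], toks[i + 1]) in templates:
                    found.add(toks[i])

        # Step 3: add terms coordinated with a known entity in a "such as" list.
        for sentence in sentences:
            terms = coordinated_terms(sentence)
            if entities & set(terms):
                found.update(terms)

        new_entities = found - entities
        if not new_entities:
            break  # nothing new was learned, so stop iterating
        entities |= new_entities
    return entities

sentences = [
    "patients took aspirin daily",
    "patients took ibuprofen daily",
    "patients took naproxen daily",
    "drugs such as aspirin, codeine and morphine",
]
result = bootstrap(sentences, {"aspirin", "ibuprofen"})
print(sorted(result))
```

In this toy corpus, the two seeds yield one context template that picks up a third entity, and the "such as" list contributes two more through coordination with a known seed. A realistic system would need stricter filtering to limit semantic drift across iterations.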