Researches On Domain-Specific Named Entity Recognition

Posted on: 2019-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: N Zhang
Full Text: PDF
GTID: 2428330548979805
Subject: Computer Science and Technology
Abstract/Summary:
The emergence of knowledge bases (KBs) allows the vast amount of data on the Internet to be used efficiently in many systems, such as search systems, intelligent question-answering systems, and reading comprehension systems. To build a knowledge base, a large amount of unstructured data must be converted into structured data and stored in a database. Named Entity Recognition (NER), which aims to identify and classify all the proper nouns in given unstructured data, is the most fundamental and significant step in building a knowledge base. Deep learning based NER models have achieved great success in general domains such as newswire and forums in recent years; in domain-specific NER, however, deep models usually perform poorly because labeled training data are insufficient.

In this thesis we propose two deep learning based NER methods for specific domains:

(1) Neural Inductive TEaching framework (NITE), which transfers knowledge from existing domain-specific NER models into an arbitrary deep neural network in a teacher-student manner. NITE is a general framework built upon transfer learning and multiple instance learning, which together transfer knowledge to a deep student network while reducing the noise from the teacher. NITE helps deep learning methods make effective use of existing resources (i.e., models, labeled data, and unlabeled data) in a small domain. Experimental results on disease NER show that, without using any labeled data, NITE significantly boosts the performance of a CNN-bidirectional LSTM-CRF NER network, by roughly 30% in terms of F1-score.

(2) Adversarial Multi-task Learning NER model. A specific domain usually contains many sub-domains whose information is related but different. With the help of multi-task learning and adversarial training, all of this information can be used to improve the NER accuracy of multiple sub-domains simultaneously. Experimental results on biomedical NER show that the recognition accuracy of multiple sub-tasks can be improved together.

Both methods address the problem of insufficient and expensive labeled data. This research has been applied in the 973 national project, the China Knowledge Center for Engineering Sciences and Technology project, and the NITE paper has been accepted by EMNLP 2017.
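The teacher-student idea behind NITE can be illustrated with a minimal sketch. It is not the thesis's actual formulation: the function names are hypothetical, the tag set is reduced to two classes, and mean pooling over mentions that share a surface string is only an assumed stand-in for the multiple instance learning step that reduces teacher noise. The student is then trained with a cross-entropy loss against the pooled soft labels, so no gold annotations are needed.

```python
import math
from collections import defaultdict


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_probs, student_logits):
    """Cross-entropy of the student's distribution against the teacher's soft labels."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s)
                for t, s in zip(teacher_probs, student_probs) if t > 0)


def pool_teacher_labels(mentions):
    """Group mentions into bags by lower-cased surface string and average the
    teacher's distributions inside each bag (an assumed, simplified
    aggregation that smooths out noisy individual teacher predictions)."""
    bags = defaultdict(list)
    for surface, probs in mentions:
        bags[surface.lower()].append(probs)
    return {surface: [sum(col) / len(col) for col in zip(*dists)]
            for surface, dists in bags.items()}


# Hypothetical teacher output on unlabeled text: (mention, [P(Disease), P(Other)]).
teacher_output = [
    ("Asthma", [0.9, 0.1]),
    ("asthma", [0.5, 0.5]),   # a noisy teacher prediction for the same entity
    ("table", [0.1, 0.9]),
]
soft_labels = pool_teacher_labels(teacher_output)

# A student that scores "asthma" as Disease incurs a lower loss than one that does not.
loss_agree = distillation_loss(soft_labels["asthma"], [2.0, 0.0])
loss_disagree = distillation_loss(soft_labels["asthma"], [0.0, 2.0])
```

In a real system the student logits would come from a neural tagger such as the CNN-bidirectional LSTM-CRF network mentioned above, and the loss would be minimized by gradient descent over unlabeled sentences.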
Keywords/Search Tags:NER, Inductive Learning, Multiple Instance Learning, Multi-task Learning, Adversarial Training