
Semi-Supervised Disentangled Transfer Algorithm On Named Entity Recognition

Posted on: 2022-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: D Lv
Full Text: PDF
GTID: 2518306539462594
Subject: Computer technology
Abstract/Summary:
Named entity recognition (NER) is one of the most important and fundamental tasks in natural language processing. However, despite the widespread use of NER models in a variety of applications, both traditional machine learning and deep learning algorithms still rely heavily on large-scale annotated data drawn from the same distribution, so the resulting models generalize poorly. In practice, NER datasets are small and domain-dependent, and obtaining a large-scale labeled dataset is costly. In addition, the data usually do not satisfy the independent and identically distributed (i.i.d.) assumption, so a model trained on the source domain usually performs worse on the target domain.

Transfer learning trains a model on the source domain and migrates it to the target domain through a transfer algorithm. With transfer learning, an NER model can obtain good results on the target domain, because the rich labeled data from the source domain improve the generalizability of the model trained on the target domain. Fortunately, in the NER task the grammatical substructures of entities are similar across different domains; they are part of the domain-independent (domain-invariant) information and can be transferred between domains. The type of an entity, by contrast, is determined by the subject information of the domain, which is part of the domain-specific information and can also help improve model performance. However, mainstream cross-domain NER models still face two problems: (1) how to extract domain-invariant information, such as the syntactic structure information of entities, shared by the source domain and the target domain; (2) how to integrate domain-specific information, such as the subject information that determines entity types, into the model to improve NER performance.

In view of the above problems, the main work of this paper includes: 1) To effectively extract the domain-invariant information shared by the source domain and the target domain, a semi-supervised transfer learning algorithm for deep-learning-based NER models is proposed. In the proposed algorithm, domain-specific information is captured by a domain predictor, and three mutual information regularization terms are used to decouple the domain-invariant and domain-specific information. The decoupled information is then used to predict entity labels, which improves model performance on the target domain. 2) The problem of extracting domain-independent information when no labels are available for the corresponding entity substructures is solved: the model can extract domain-invariant and domain-specific information from a limited supervision signal.

The results of cross-domain, cross-lingual, and low-resource experiments show that the model performs well on the corresponding datasets. Disentangling domain-invariant and domain-specific information on the basis of mutual information improves model performance.
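The abstract does not state the form of the three mutual information regularization terms or the estimator behind them. As a minimal illustration of the quantity such terms would push toward zero, the sketch below computes a plug-in estimate of I(X;Y) for two discrete variables. The function name and the toy codes `z_inv` and `z_spec` are hypothetical and are not taken from the thesis; a deep model would more likely need a neural estimator over continuous representations.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in nats from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)                   # counts of each (x, y) pair
    count_x = Counter(x for x, _ in pairs)   # marginal counts of x
    count_y = Counter(y for _, y in pairs)   # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical discretized codes, one (z_inv, z_spec) pair per token.
# Entangled: the domain-specific code is fully determined by the invariant one.
entangled = [(0, 0), (1, 1), (0, 0), (1, 1)]
# Disentangled: the two codes are statistically independent.
disentangled = [(0, 0), (0, 1), (1, 0), (1, 1)]

print(mutual_information(entangled))
print(mutual_information(disentangled))
```

For the entangled pairs the estimate equals log 2, since each code determines the other. For the independent pairs it is zero, which is the regime a disentangling penalty aims for.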
Keywords/Search Tags:Named Entity Recognition, Transfer Learning, Domain-invariant information, Domain-specific information, Disentanglement