Complex Chinese Named Entity Recognition In Finance

Posted on:2021-04-14

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Zhu

Full Text:PDF

GTID:2428330614970087

Subject:Software engineering

Abstract/Summary:

Named entity recognition is a basic task in natural language processing and plays a pivotal role in tasks such as information extraction, machine translation, and knowledge graph construction. It has also received widespread attention in the financial, biological, and pharmaceutical industries. Generally, a large amount of text must be manually labeled before model training to ensure a rich sample set, and the tagger is then trained on the labeled data. At present, most research on named entity recognition focuses on short entities. On annotated corpora, named entity recognition based on fully supervised learning has achieved high performance. Because labeling data is time-consuming and labor-intensive, in most cases only partially labeled data exists. When labeled data is insufficient, weakly supervised iterative learning is usually used to train the model gradually.

The research in this paper mainly addresses two problems in financial text: complex entities and insufficient labeled data. Commonly used named entity recognition schemes cannot effectively identify complex entities under these conditions. This paper proposes a weakly supervised learning method to recognize the complex named entities (hereinafter referred to as CNEs) in the corpus. CNEs are commonly composed of sequences of multiple small entities, which makes their boundaries difficult to determine. To improve recognition accuracy, the proposed method separates the determination of contextual semantic relationships from the confirmation of entity boundaries. The specific work is as follows:

1) We propose a semantic model based on CNE masking. Before training, the CNEs in the corpus are masked, and the masked corpus is then used to train the semantic model with BiLSTM-CRF.

2) We also propose a weakly supervised CNE boundary confirmation model based on sequential patterns. On the small-sample data set, a candidate set of target CNEs is found by a sliding window combined with sequence pattern matching, and the candidates are then screened and judged by the semantic model obtained in 1).

3) In addition, the complex entities in the text also affect the effectiveness of weakly supervised training to a certain extent. In this regard, this paper proposes an Optimized-Bootstrapping algorithm based on a sample similarity scoring mechanism. It effectively improves the reliability of the incremental samples selected in weakly supervised iterative learning.

In this paper, data from the financial field is used as the experimental data set to compare currently popular named entity recognition models with the proposed scheme. The results show that, compared with directly applying the BiLSTM-CRF-based named entity recognition method, the proposed method greatly improves performance on small training samples, and that it has a certain generalization ability.
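The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. Every name here (`MASK`, `mask_cnes`, `candidate_spans`, `jaccard`, `select_reliable`) is hypothetical, the tag patterns are made up for illustration, and Jaccard overlap is only a stand-in for the sample similarity scoring mechanism, whose details the abstract does not give. The BiLSTM-CRF model itself is omitted.

```python
from typing import List, Sequence, Tuple

MASK = "[CNE]"  # hypothetical placeholder token, not taken from the thesis


def mask_cnes(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Replace each labeled CNE span [start, end) with a single mask token.

    Assumes the spans do not overlap.
    """
    out: List[str] = []
    i = 0
    for start, end in sorted(spans):
        out.extend(tokens[i:start])
        out.append(MASK)
        i = end
    out.extend(tokens[i:])
    return out


def candidate_spans(
    tags: List[str], patterns: List[Tuple[str, ...]], max_len: int
) -> List[Tuple[int, int]]:
    """Slide windows of length 2..max_len over per-token small-entity tags
    and keep every window whose tag sequence equals a known pattern."""
    known = set(patterns)
    found: List[Tuple[int, int]] = []
    for size in range(2, max_len + 1):
        for start in range(len(tags) - size + 1):
            if tuple(tags[start:start + size]) in known:
                found.append((start, start + size))
    return found


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Token-set overlap, used here as an assumed similarity score."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def select_reliable(
    candidates: List[List[str]], seeds: List[List[str]], threshold: float
) -> List[List[str]]:
    """Keep candidates whose best similarity to any seed sample reaches the
    threshold, so only those join the next bootstrapping round."""
    return [
        c for c in candidates
        if max((jaccard(c, s) for s in seeds), default=0.0) >= threshold
    ]


tokens = ["ABC", "Bank", "Shanghai", "Branch", "issued", "bonds"]
tags = ["ORG", "ORG", "LOC", "ORG", "O", "O"]
spans = candidate_spans(tags, [("ORG", "ORG", "LOC", "ORG")], max_len=4)
masked = mask_cnes(tokens, spans)
kept = select_reliable(
    [tokens[0:4], tokens[4:6]],
    seeds=[["ABC", "Bank", "Beijing", "Branch"]],
    threshold=0.5,
)
```

In this toy run the only window matching the pattern covers the first four tokens, so the masked sentence that a semantic model would see is `[CNE] issued bonds`. The candidate that shares three of five distinct tokens with the seed passes the 0.5 threshold, and the unrelated one is dropped.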

Keywords/Search Tags:

named entity recognition, weakly supervised learning, deep learning, pattern matching, high dimensional index

Related items

1	Research On Tibetan Named Entity Recognition Based On Weakly Supervised Learning
2	Weakly Supervised Named Entity Recognition Based On Online Encyclopedia
3	A Research On Weakly Supervised Relation Extraction
4	Image Data Annotation And Recognition Based On Weakly Supervised Deep Learning
5	Research On Named Entity Recognition Based On Deep Learning
6	Research On Nested Named Entity Recognition Algorithm Based On Deep Learning
7	Research On Weakly Supervised Human Action Analysis Based On Deep Learning
8	The Research Of Chinese Named Entity Recognition Based On Deep Learning
9	Research On Named Entity Recognition With Deep Learning
10	A Research On Entity Relation Extraction Model And Performance Improvement