
Research Of Chinese Named Entity Recognition Based On Recurrent Neural Networks

Posted on: 2019-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: R D Zhang
Full Text: PDF
GTID: 2428330593450210
Subject: Computer Science and Technology

Abstract/Summary:
Named entity recognition refers to identifying referring expressions and proper nouns in text. Chinese named entity recognition is the foundation of Chinese information processing and currently faces two major challenges. On the one hand, traditional methods often rely on external knowledge and manually selected features, which incur high labor and time costs. On the other hand, the need to identify new types of entities keeps growing, which poses a great challenge to existing recognition methods. Recurrent neural networks are well suited to processing sequence data and are a popular approach in the field of Natural Language Processing. This thesis explores methods for recognizing Chinese named entities with recurrent neural networks, which is of great significance to Chinese information processing.

A Recurrent Neural Network (RNN) is a type of neural network commonly used in deep learning. Its input consists of the current element of the sequence and the hidden state at the previous time step, so its output is determined not only by the current input but also by the preceding hidden state. In this way, a recurrent neural network can learn sequence information and is well suited to sequence data. Because Chinese sentences lack explicit word boundaries, this thesis applies recurrent neural networks to Chinese named entity recognition at the character level. The work of this thesis is as follows:

(1) Traditional named entity recognition methods lack the ability to learn long-distance dependencies, and their feature extraction requires external knowledge and a great deal of human effort. To address these limitations, this thesis designs a named entity recognition method based on recurrent neural networks. The method uses a bidirectional LSTM to process the input sentence and assign an appropriate label to each character. Because there are strong dependencies between named entity tags, a CRF layer is added on top of the network's output layer; it learns the dependencies between tags and produces the globally optimal tag sequence at the sentence level. Experimental results on the People's Daily corpus show that the Bi-LSTM-CRF method designed in this thesis can effectively recognize Chinese named entities without feature engineering, making it an end-to-end Chinese named entity recognition method.

(2) As Natural Language Processing technology is applied in more and more fields, the targets of named entity recognition are no longer limited to traditional entity types such as person, location, and organization names, and the need to identify new types of entities keeps growing. When recognizing entities in a specific domain, there are often only a few labeled corpora available, or even none. Driven by the need to identify conference names in the field of intelligence analysis, this thesis builds a corpus for conference name recognition and uses a bidirectional GRU combined with a CRF to recognize conference names. This method requires no additional domain knowledge and avoids the tedious work of designing features for a specific domain. To further improve recognition performance, a language model based on a recurrent neural network is designed. It is trained on an existing large-scale corpus (hereinafter referred to as the secondary corpus), and the trained language model is then used to generate word vectors that enrich the original word vectors and thereby improve the recognition model. Experimental results show that this approach effectively improves the performance of the recognition model.
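To make the Bi-LSTM-CRF architecture in (1) concrete, the following is a minimal illustrative sketch of a character-level tagger. It is not the thesis code: the choice of PyTorch, the third-party pytorch-crf package for the CRF layer, and all dimensions and names are assumptions.

import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf (assumed CRF implementation)

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=200):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Bidirectional LSTM reads the character sequence in both directions.
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True,
                            bidirectional=True)
        # Projects LSTM states to per-character tag scores (CRF emissions).
        self.hidden2tag = nn.Linear(hidden_dim, num_tags)
        # CRF layer models dependencies between adjacent tags (e.g. B-PER -> I-PER).
        self.crf = CRF(num_tags, batch_first=True)

    def _emissions(self, char_ids):
        lstm_out, _ = self.lstm(self.embedding(char_ids))
        return self.hidden2tag(lstm_out)

    def loss(self, char_ids, tags, mask):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(self._emissions(char_ids), tags, mask=mask)

    def decode(self, char_ids, mask):
        # Viterbi decoding gives the globally optimal tag sequence per sentence.
        return self.crf.decode(self._emissions(char_ids), mask=mask)

Training would minimize loss() over labeled sentences; no hand-crafted features are required, which matches the end-to-end claim above.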
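For the method in (2), the sketch below shows one plausible way to enrich character embeddings with the hidden states of a recurrent language model pre-trained on the secondary corpus, and to feed the enriched vectors to a Bi-GRU-CRF tagger. The frozen language model, the concatenation scheme, and all names are illustrative assumptions, not the thesis implementation.

import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf (assumed CRF implementation)

class CharLanguageModel(nn.Module):
    # Forward GRU language model, assumed to be pre-trained on the secondary
    # corpus with a next-character prediction objective.
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=200):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def hidden_states(self, char_ids):
        states, _ = self.gru(self.embedding(char_ids))
        return states  # (batch, seq_len, hidden_dim), used as extra features

class BiGRUCRFWithLM(nn.Module):
    def __init__(self, vocab_size, num_tags, lm, embed_dim=100, hidden_dim=200):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lm = lm
        for p in self.lm.parameters():  # keep the pre-trained language model frozen
            p.requires_grad = False
        # Concatenate the ordinary embedding with the language-model features.
        self.gru = nn.GRU(embed_dim + lm.gru.hidden_size, hidden_dim // 2,
                          batch_first=True, bidirectional=True)
        self.hidden2tag = nn.Linear(hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, char_ids, tags, mask):
        feats = torch.cat([self.embedding(char_ids),
                           self.lm.hidden_states(char_ids)], dim=-1)
        emissions = self.hidden2tag(self.gru(feats)[0])
        return -self.crf(emissions, tags, mask=mask)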
Keywords/Search Tags: Chinese named entity recognition, conference name recognition, recurrent neural networks, language model, word embedding