Research On Chinese Named Entity Recognition Based On Temporal Convolutional Network

Posted on:2021-04-23

Degree:Master

Type:Thesis

Country:China

Candidate:W Jiang

GTID:2428330611464264

Subject:Computer system architecture

Abstract/Summary:

With the advent of the information age, extracting useful information from massive volumes of natural language data has become an important problem. Natural language data differs from other types of data: it carries the characteristics of human language and requires specialized techniques to process. Named entity recognition (NER) is a basic task of natural language processing. It alleviates information overload by extracting the key spans with special meaning in a sentence, known as named entities, and it has long been a research hotspot both in China and abroad.

Chinese differs from other languages, so processing it requires taking its particular characteristics into account. Most current Chinese NER models fall into two categories according to the basic unit of language processing: character-based models and word-based models. A character-based model splits a Chinese sentence directly into a sequence of characters and extracts named entities from that sequence. A word-based model first segments the sentence into a word sequence with a word segmentation model and then extracts named entities from the word sequence. However, character-based models cannot exploit the rich internal information of words, and word-based models cannot eliminate the ambiguity in the word sequence, because different segmentation standards yield different segmentations. To address the problems of both types of model, existing research integrates a predefined dictionary that automatically matches the words in a Chinese sentence and feeds the matched word information to the NER model. The NER model then processes the sentence character by character while incorporating information about the words it contains. However, this approach requires building a dictionary in advance and cannot guarantee that the dictionary is unbiased.

Instead of integrating a dictionary, this thesis studies Chinese NER by extracting the location features and class features of named entities. Starting from research on Chinese word segmentation, a word segmentation model based on the temporal convolutional network (TCN) is proposed so that information from distant characters in long sentences can be captured effectively. On this basis, two temporal convolutional networks are constructed: one extracts the location features of named entities and the other extracts their class features. The final named entities are obtained by fusing the two types of features. The main research work is as follows:

(1) Most current Chinese word segmentation models are based on bidirectional long short-term memory networks (Bi-LSTMs). Bi-LSTM models suffer from vanishing gradients and cannot process long sentences effectively. To address this, this thesis proposes a Chinese word segmentation model based on the temporal convolutional network. By increasing the number of convolutional layers, the model can capture information from characters farther away in long sentences. The model uses a multi-layer temporal convolutional network as the encoder and a fully connected neural network as the decoding layer, applies a conditional random field (CRF) to model the correlation between adjacent characters, and uses the Viterbi algorithm to obtain the final sequence of word segmentation labels. Experimental results on multiple word segmentation data sets show that the model achieves good segmentation performance and effectively captures information from distant characters in long sentences.

(2) Building on the Chinese word segmentation model, a symmetric dual temporal convolutional network is proposed. The BERT pre-trained model is used to generate pre-trained embedding vectors for the characters. One temporal convolutional network extracts the location features of named entities and the other extracts their class features. A fusion algorithm is designed to combine the location and class features and produce the final named entities. Experimental results on multiple Chinese NER data sets show that the model achieves a higher F1 score than existing Chinese NER models, indicating better recognition performance. Experiments on the Boson Chinese NER data set show that the model can process long sentences effectively. Experiments on specific input sentences that contain word segmentation ambiguity show that the model can also handle such sentences effectively.
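
The decoding step described in item (1), where a CRF scores the correlation between adjacent characters and the Viterbi algorithm selects the final label sequence, can be illustrated with a minimal sketch. It is not the thesis implementation. The B/M/E/S label scheme, the function names `viterbi` and `tags_to_words`, and every score below are assumptions chosen for illustration; in a real model the emission scores would come from the decoding layer and the transition scores would be learned CRF parameters.

```python
# Illustrative Viterbi decoding for character-level word segmentation.
# Assumption: a B/M/E/S label scheme (begin, middle, end, single-character word).
# All scores are made-up numbers, not values from the thesis.

TAGS = ["B", "M", "E", "S"]
NEG = -1e4  # large penalty used to forbid a start tag or a transition


def viterbi(emissions, transitions, start):
    """Return the highest-scoring path as a list of tag indices.

    emissions[t][j]   score of tag j at position t
    transitions[i][j] score of moving from tag i to tag j
    start[j]          score of starting the sentence with tag j
    """
    n_tags = len(start)
    score = [start[j] + emissions[0][j] for j in range(n_tags)]
    backptr = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            # Best previous tag for reaching tag j at position t.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def tags_to_words(chars, tags):
    """Group characters into words; a word closes on an E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # flush an unfinished word, if any
        words.append(current)
    return words


# A sentence may only start with B or S.
start = [0.0, NEG, NEG, 0.0]
# Rows: previous tag, columns: next tag, in the order B, M, E, S.
transitions = [
    [NEG, 0.0, 0.0, NEG],  # B -> M or E
    [NEG, 0.0, 0.0, NEG],  # M -> M or E
    [0.0, NEG, NEG, 0.0],  # E -> B or S
    [0.0, NEG, NEG, 0.0],  # S -> B or S
]
chars = list("我爱北京")
emissions = [
    [0.5, 0.0, 0.0, 2.0],
    [0.5, 0.0, 0.3, 2.0],
    [2.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.4, 1.5],  # taken alone, this position slightly prefers S
]

tags = [TAGS[i] for i in viterbi(emissions, transitions, start)]
words = tags_to_words(chars, tags)
print(tags)   # ['S', 'S', 'B', 'E']
print(words)  # ['我', '爱', '北京']
```

In this toy input, picking the best label at each position independently would give B followed by S for the last two characters, which is not a valid segmentation. The transition scores rule that pair out, so the decoded path ends in B, E instead. This is the kind of adjacent-label dependency that a CRF layer is meant to capture.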

Keywords/Search Tags:

Artificial intelligence, Natural language processing, Chinese named entity recognition, Temporal convolutional network, BERT

Related items

1	Research On Nested Named Entity Recognition Algorithm Based On Deep Learning
2	The Research On Chinese Named Entity Recognition Model Based On Cascade Neural Network
3	Research On Chinese Named Entity Recognition Based On Deep Learning
4	Zhuang Named Entity Recognition Based On Deep Learning
5	A Study On Chinese Named Entity Recognition
6	Domain Adaptation Research And Application Of Named Entity Recognition
7	Research On Chinese Named Entity Recognition Based On Convolutional Neural Network
8	Research On Chinese Named Entity Recognition Based On Deep Learning
9	Study On Recognition Of Chinese Agricultural Named Entity With CRF
10	Research On Identification Of The Chinese Named Entity Based On Deep Learning