
Research On Named Entity Recognition With Deep Learning

Posted on: 2021-05-27 | Degree: Master | Type: Thesis
Country: China | Candidate: P Li | Full Text: PDF
GTID: 2428330623468526 | Subject: Engineering
Abstract/Summary:
Named entity recognition (NER) is a fundamental research task in natural language processing (NLP). It is an essential component of many higher-level NLP applications, such as question answering systems and search engines, and the accuracy of entity recognition directly affects the performance of these downstream tasks. Most early NER methods were based on rules or statistics. These methods require manually constructed rule templates or hand-designed statistical features for a specific language and domain, so NER systems built on them suffer from high implementation cost and poor portability.

Deep learning methods can learn feature representations from data automatically, and over the past decade they have achieved remarkable performance across many artificial intelligence tasks. Compared with traditional methods, deep learning-based entity recognition models not only label more accurately but also cost less to implement. However, these methods still have shortcomings: (1) NER relies heavily on contextual information to make decisions, so most recognition models use BiLSTMs as encoders to capture dependencies in the input; however, LSTM networks are slow to train, and in a traditional LSTM the input and the hidden state are independent of each other. (2) NER models generally need information from several aspects of the text sequence to make labeling decisions, and each aspect is represented by its own vector. Annotation models usually concatenate these vectors directly and feed the result to the model as input, which leads to information redundancy.

To address these issues, this thesis makes the following efforts:

1. To reduce the long training time of LSTMs, the author proposes encoding the input sequence with a Dilated Convolutional Neural Network (dilated-CNN). A dilated-CNN expands the receptive field by injecting holes into the convolution kernels of a traditional convolutional neural network (CNN). Experimental results show that the dilated-CNN's ability to encode semantic dependencies is comparable to that of a bidirectional LSTM, and combining the dilated convolutional network with an attention mechanism further improves labeling performance.

2. In a traditional LSTM, the input x_t at each time step and the previous hidden state h_{t-1} are completely independent of each other: they interact only inside the LSTM's gates and not before, which may lose contextual information about the input sequence. The author therefore proposes modeling the input sequence with a Mogrifier LSTM. The Mogrifier LSTM does not change the structure of the traditional LSTM; it simply lets x_t and h_{t-1} interact with each other before they enter the LSTM. Comparative experiments show that the Mogrifier LSTM models context dependencies better than the traditional LSTM, and integrating an attention mechanism on top of it allows the model to capture even more contextual semantic information.

3. The author proposes a new attention-based vector connection method to combine the word embedding and the character embedding of each input word. Experimental results show that, compared with directly concatenating the vectors, a model using this connection method achieves better labeling results.
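To illustrate the dilated-CNN idea in contribution 1, the sketch below shows a minimal NumPy 1-D convolution whose taps are spaced `dilation` steps apart, plus a helper computing how the receptive field grows when such layers are stacked. This is an illustrative sketch, not the thesis's implementation; the function names and the example dilation schedule (1, 2, 4) are assumptions for demonstration.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Valid 1-D convolution with holes: kernel taps are `dilation` steps apart."""
    k = len(kernel)
    span = (k - 1) * dilation + 1          # input width one output position covers
    out_len = len(x) - span + 1
    return np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(out_len)
    ])

def receptive_field(kernel_size, dilations):
    """Input positions seen by one output unit after stacking dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With kernel size 3, stacking three layers with dilations 1, 2, 4 yields a receptive field of 15 positions, versus 7 for three ordinary CNN layers (dilations 1, 1, 1) — the exponential growth that lets a dilated-CNN capture long-range dependencies comparably to a BiLSTM while remaining fully parallel over the sequence.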
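The x_t / h_{t-1} interaction in contribution 2 can be sketched as the "mogrify" step from the Mogrifier LSTM paper (Melis et al.): before the LSTM cell runs, the input and the previous hidden state alternately gate each other for a few rounds. The NumPy sketch below shows only this pre-LSTM interaction under assumed shapes (the `Q_mats`/`R_mats` parameter lists and the default of 5 rounds are illustrative, not the thesis's exact configuration).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mogrify(x, h, Q_mats, R_mats, rounds=5):
    """Alternately gate x with h and h with x before the LSTM cell.

    Odd rounds (1-indexed, as in the paper) modulate x using h;
    even rounds modulate h using x. Each gate is 2*sigmoid(W @ other),
    so a zero-weight matrix leaves the gated vector unchanged in scale.
    """
    for i in range(1, rounds + 1):
        if i % 2 == 1:                              # rounds 1, 3, 5: update x
            x = 2 * sigmoid(Q_mats[i // 2] @ h) * x
        else:                                       # rounds 2, 4: update h
            h = 2 * sigmoid(R_mats[i // 2 - 1] @ x) * h
    return x, h
```

The mogrified `x` and `h` are then fed to an unmodified LSTM cell, which is why the thesis can claim the traditional LSTM structure is unchanged: all extra capacity lives in this pre-gate exchange.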
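Contribution 3 replaces direct concatenation with an attention-weighted combination of the word-level and character-level vectors. A minimal NumPy sketch of one such scheme follows; it assumes both embeddings share the same dimension and uses a single hypothetical scoring vector `w` (in the thesis these attention parameters would be learned during training, and the exact scoring function may differ).

```python
import numpy as np

def softmax(z):
    z = z - z.max()                    # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def attentive_combine(word_vec, char_vec, w):
    """Fuse two views of a token by attention-weighted sum, not concatenation.

    `w` is a hypothetical learned scoring vector; each view gets one scalar
    score, the scores are softmax-normalized, and the views are summed with
    those weights, keeping the output dimension equal to the input dimension.
    """
    views = np.stack([word_vec, char_vec])     # (2, d)
    scores = views @ w                         # one relevance score per view
    alpha = softmax(scores)                    # attention weights, sum to 1
    return alpha @ views                       # (d,) fused representation
```

Unlike concatenation, which doubles the input width and can carry redundant dimensions, this fusion keeps the vector size fixed and lets the model learn how much to trust each view per token.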
Keywords/Search Tags:Named Entity Recognition, Deep Learning, Dilated Convolutional Network, Mogrifier LSTM, Multi-Head Attention Mechanism