| Named entity recognition (NER) is a fundamental problem in information extraction and serves many downstream applications. Span-based methods are now widely used for NER because they can handle both flat and nested named entities. However, most existing span-based methods detect entity boundaries by sequence labeling. When the detected boundaries are then combined to generate candidate entity spans, many of the spans are invalid and the number of candidates explodes. In addition, entity labels carry a wealth of prior information that can guide the model toward the corresponding entities, but existing methods cannot fully integrate this label knowledge into the text. To address these problems, this thesis carries out the following research. (1) This thesis proposes a named entity recognition model based on a bidirectional pointer network. To avoid generating too many invalid entity spans, the model uses the pointer network mechanism to detect entity boundaries. First, the word embedding vectors of the text are obtained. Second, the word embedding vectors are fed into the entity boundary detection module to identify entity boundaries. Existing pointer-network-based boundary detection methods can miss entities when handling nested named entities; to avoid this, the entity boundary detection module uses two decoders, a right decoder and a left decoder. The right decoder takes the word currently fed into the decoder as the start position of an entity and then uses the pointer network mechanism to find the corresponding end position. The left decoder takes the word currently fed into the decoder as the end position of an entity and then uses the pointer network
mechanism to find the corresponding start position. Finally, entity spans are generated from the combined entity boundaries and classified into the corresponding entity categories. Experiments on three datasets (MSRA, OntoNotes 4.0, and ACE2004) show that the proposed model outperforms most other named entity recognition models. (2) This thesis proposes a named entity recognition model based on label knowledge enhancement. To make full use of label knowledge, the model encodes the text and the label knowledge separately and then integrates the label knowledge into the text. First, the label knowledge corresponding to each label is constructed. Second, the text and the label knowledge are encoded independently by a shared encoder. Then, a semantic fusion module integrates the label knowledge into the text, one label at a time. Finally, the text fused with the label knowledge is fed into the entity boundary detection module to identify the entities corresponding to that label. The entity boundary detection module again uses the pointer network mechanism to identify entity boundaries, which avoids the problems that can arise with traditional sequence-labeling-based methods. Extensive experiments on the ACE2004 dataset show that the proposed model outperforms most other named entity recognition models. |
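
The boundary-combination step of the first model can be illustrated with a small sketch. It is not the thesis implementation: the decoder outputs below are hand-written, hypothetical values standing in for the neural right and left decoders, the use of `None` to mean "no entity at this position" is an assumption, and the function names `spans_from_pointers` and `spans_from_tags` are invented for illustration. The sketch shows two things: pairing every tagged start with every tagged end (as sequence labeling does) can produce invalid spans, and a single right decoder can keep only one of two nested entities that share a start, which the left decoder recovers.

```python
from itertools import product


def spans_from_pointers(right_ptr, left_ptr):
    """Merge boundary predictions from the two decoders into candidate spans.

    right_ptr[i] is the end index predicted when token i is treated as a start.
    left_ptr[j] is the start index predicted when token j is treated as an end.
    None means "no entity here" (an assumption of this sketch).
    """
    spans = set()
    for start, end in enumerate(right_ptr):
        if end is not None and end >= start:
            spans.add((start, end))
    for end, start in enumerate(left_ptr):
        if start is not None and start <= end:
            spans.add((start, end))
    return sorted(spans)


def spans_from_tags(starts, ends):
    """Baseline: pair every tagged start with every tagged end at or after it."""
    return sorted((s, e) for s, e in product(starts, ends) if s <= e)


tokens = ["the", "University", "of", "California", "campus", "in", "Berkeley"]

# Hypothetical decoder outputs. Two nested entities share start index 1:
# "University of California" (1, 3) and "University of California campus" (1, 4).
# The right decoder can point from index 1 to only one end, so it keeps (1, 4).
right_ptr = [None, 4, None, None, None, None, 6]
# The left decoder points from each end back to its start and recovers (1, 3).
left_ptr = [None, None, None, 1, 1, None, 6]

pointer_spans = spans_from_pointers(right_ptr, left_ptr)
right_only_spans = spans_from_pointers(right_ptr, [None] * len(tokens))
tag_spans = spans_from_tags(starts=[1, 6], ends=[3, 4, 6])

print(pointer_spans)     # [(1, 3), (1, 4), (6, 6)]
print(right_only_spans)  # [(1, 4), (6, 6)]
print(tag_spans)         # [(1, 3), (1, 4), (1, 6), (6, 6)]
```

In this toy input the start/end cross product adds the invalid span (1, 6), while the union of the two pointer directions yields exactly the three intended spans. With more tagged boundaries the cross product grows roughly with the product of the number of starts and ends, whereas each pointer decoder contributes at most one span per token.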