| Against the backdrop of accelerating digital and intelligent transformation,information extraction technology is increasingly applied in various fields such as judiciary,medical,and finance.Nested entity recognition is an important task in information extraction,and its application areas involve relation extraction,knowledge graph construction,intelligent question answering,etc.Traditional named entity recognition(NER)is considered a sequence labeling task,which identifies entities by assigning corresponding labels to each character.However,this method cannot directly handle nested entities.Currently,researchers mainly focus on the method based on the span model for nested entity recognition,which identifies entities by generating different potential entity spans for classification.Span-based methods are prone to lose semantic dependencies and granularity information.To address this issue,a planarized span representation method that elevates one-dimensional sequence information to two-dimensional matrix information is proposed.Planarization analyzes sentence structure to obtain sentence representation,further improving entity classification performance.A planarization mapping method is proposed,and the research work of this paper is as follows:(1)A planarized nested entity recognition method based on self-cross encoding is proposed.To address the issue of semantic dependency loss during planarization,we introduce a two-dimensional recurrent neural network that learns the semantic dependencies between spans.First,the text is represented by vectorization.Then,the character vectors are generated into a two-dimensional matrix form of text features through self-cross encoding.Next,the text feature matrix is learned by a two-dimensional recurrent neural network to learn the semantic dependencies between spans.Finally,each span is classified and recognized.Nested named entity recognition is evaluated on five public datasets.Experimental results show that the proposed method can effectively solve nested named entities and learn their semantic dependencies.(2)A multi-granularity planarizing nested entity recognition method is proposed.To address the issue of lacking single-granularity information during planarizing,a multigranularity semantic extractor is proposed.The granularity information in the sequence is obtained through the multi-granularity semantic extractor.First,text is represented as vectors through a pre-trained model.Then,the text vectors are fed into a self-cross encoder to generate a text feature matrix.Meanwhile,the text vectors are also fed into the multi-granularity semantic extractor,which extracts the multi-granularity information in the text through a convolutional neural network.Finally,multiple dilated convolutions are used to extract longrange relationships in the text matrix,which are concatenated with the text feature matrix and then output for classification.The proposed method was evaluated on three public datasets,and the results show that the multi-granularity semantic extractor complements the granularity information and provides good support for nested entity recognition. |