Transforming rich data into structured information that is easy to understand and use has become a pressing problem in text mining. Entity relation extraction, a core task of text mining and information extraction, models text in order to automatically extract the semantic relations between entity pairs and thereby obtain useful semantic knowledge, which makes it one of the most promising approaches to this problem. Research on entity relation extraction is significant for text summarization, automatic question answering, machine translation, Semantic Web annotation, knowledge graphs, and other fields.

This work focuses on entity relation extraction from Chinese text and addresses three difficulties in current relation extraction tasks: error propagation, entity nesting, and relation overlap. Because entity extraction is a prerequisite for entity relation extraction, correct entity information helps improve the effectiveness of relation extraction models. This work therefore starts with entity extraction, designing a text annotation method that handles complex entity nesting and building an entity extraction model on top of it. Then, based on the entity extraction model, two joint entity relation extraction models are proposed, each built as a unified framework that shares the vector space and model parameters. The main contributions are as follows.

First, a text annotation method is designed, and an entity extraction model is proposed based on it: the Span-Extraction Based Entity Extraction Model (SEM). SEM consists of two parts, a span extraction layer and an entity classification layer. The span extraction layer traverses the token vectors of the text with a dynamic sliding window, converting the semantic vectors in the model from token vectors to span vectors. Three mapping methods, using three different feature filtering mechanisms, then group the spans and map them into a one-dimensional list, so that spans sharing the same features can be extracted from the text together. The entity classification layer determines the type of each extracted entity by constructing a two-dimensional "entity span-entity type" matrix.

Second, a joint entity relation extraction model based on SEM is designed: the Span-Tagging Based Joint Extraction Model of Entities Relationships (STM). STM adopts the span extraction layer of SEM for text annotation. To solve the problem of relation overlap, STM derives a new set of relation type labels from the original relation types and uses these new labels to construct a two-dimensional "entity span-relation label" matrix. Finally, the model unifies the relation extraction task and the entity extraction task in one framework by constructing a single joint two-dimensional matrix. Relation extraction and entity extraction are performed simultaneously, and the entity extraction results take part in training as auxiliary information for relation extraction.

Third, a joint entity relation extraction model based on SEM and STM is designed: the Span-Selection Based Joint Extraction Model of Entity Relationship (SSM). SSM improves the span extraction layer of SEM for text annotation, unifies the subject and the object into one dimension, and designs a set of entity type labels. Like STM, SSM models the relation extraction task and the entity extraction task in a unified framework, achieving joint extraction.

Experiments on Chinese datasets for entity recognition and relation extraction demonstrate the effectiveness of the proposed models.
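
The span extraction and "entity span-entity type" matrix described for SEM can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the dynamic sliding window means windows of every width from 1 up to a maximum, that a span vector is the mean of its token vectors, and that each entity type is scored by a dot product with a weight vector. The names `enumerate_spans`, `span_vector`, and `score_matrix` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def enumerate_spans(num_tokens: int, max_width: int) -> List[Span]:
    """Slide windows of width 1..max_width over the token positions.

    Assumption: "dynamic" is read here as "variable width".
    The result is a flat, one-dimensional list of candidate spans.
    """
    spans: List[Span] = []
    for width in range(1, max_width + 1):
        for start in range(num_tokens - width + 1):
            spans.append((start, start + width))
    return spans


def span_vector(token_vectors: List[List[float]], span: Span) -> List[float]:
    """Turn token vectors into one span vector by mean pooling (an assumption)."""
    start, end = span
    dim = len(token_vectors[0])
    return [
        sum(token_vectors[i][d] for i in range(start, end)) / (end - start)
        for d in range(dim)
    ]


def score_matrix(
    span_vectors: List[List[float]], type_weights: List[List[float]]
) -> List[List[float]]:
    """Build a (number of spans) x (number of types) matrix of scores.

    Each cell is the dot product of a span vector and a type weight vector;
    a trained model would learn these weights.
    """
    return [
        [sum(x * w for x, w in zip(vec, weights)) for weights in type_weights]
        for vec in span_vectors
    ]


# Toy input: four characters with made-up 2-dimensional token vectors.
tokens = ["北", "京", "大", "学"]
token_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
type_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three hypothetical types

spans = enumerate_spans(len(tokens), max_width=4)
vectors = [span_vector(token_vectors, s) for s in spans]
matrix = score_matrix(vectors, type_weights)
```

Because every window is kept as its own candidate, the nested spans (0, 2) for "北京" and (0, 4) for "北京大学" both appear in `spans` and receive their own rows in `matrix`, which is why a span-based representation can express nested entities that a single tag per token cannot.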