| The immense popularity and rapid development of the mobile Internet have led to exponential growth in network data. However, most of this data is unstructured and heterogeneous, which makes it difficult to use effectively despite its considerable value. Relation extraction technology automatically extracts entities and the relations between them from unstructured text and presents them as structured triples, capturing the essential information in natural language text and meeting the demand for knowledge from unstructured data. These structured triples are of great value to artificial intelligence applications such as knowledge graphs, recommendation systems, and automatic question answering. With the rapid advancement of deep learning, entity-relation extraction techniques based on it have attracted much attention and achieved remarkable results. These techniques can be divided into pipeline and joint methods. Pipeline methods split entity recognition and relation classification into two tasks that are performed sequentially. Joint methods share an encoding layer and extract triples in a single pass. However, most deep learning-based work ignores the problem of overlapping entity relations and assumes that the relation between a pair of entities is unique, which often does not hold in practice. Work that does address the overlap problem usually suffers from some shortcomings. Most of it predicts entity-pair relations over the full set of relation classes, even though the number of relations in a sentence is much smaller than the number defined in the dataset, which leads to redundant predictions. In addition, most of it proceeds in multiple steps, so an incorrect prediction in one step affects the next, which leads to error accumulation. To address these issues, this thesis studies overlapping entity relation extraction in depth, and the main work is as follows: (1) This thesis proposes a joint extraction model based on entity role recognition to address the redundant prediction problem. The model adopts a strategy of low-level feature separation and high-level concept fusion. First, the entities and relations present in the sentence are recognized independently, and this limited set of entities and relations is used to construct candidate triples. Triple extraction is then completed by identifying the roles of the entities within each candidate triple under different relational constraints, which effectively handles overlapping entities while greatly reducing redundant predictions. Experimental comparisons demonstrate the effectiveness of the model on the overlapping entity relation extraction task. (2) This thesis proposes a joint entity relation extraction model based on span linking to address the problem of error accumulation. Multi-granularity encoding combines BERT representations with part-of-speech (POS) information and character information into a text representation with enhanced features. The model then enumerates the candidate spans that may be entities and maps the text representation into different dimensional spaces to obtain feature representations of the candidate spans, which serve as candidate entities. Finally, the correlation between candidate entities is computed under a specific relation network to predict triples. A comparison with the baseline model demonstrates the effectiveness of the approach. |
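
The two candidate-construction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `enumerate_spans` and `candidate_triples`, the `max_len` cutoff, and the example sentence and relation label are all assumptions made for illustration. The learned components (entity role recognition, span scoring) are omitted.

```python
from itertools import permutations
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end inclusive


def enumerate_spans(tokens: List[str], max_len: int) -> List[Span]:
    """Enumerate every contiguous token span of at most max_len tokens.

    max_len is a hypothetical cutoff; the abstract does not state one.
    """
    spans = []
    for start in range(len(tokens)):
        for end in range(start, min(start + max_len, len(tokens))):
            spans.append((start, end))
    return spans


def candidate_triples(
    entities: List[str], relations: List[str]
) -> List[Tuple[str, str, str]]:
    """Pair each relation detected in the sentence with every ordered entity pair.

    Only relations found in the sentence are used, not the full relation set
    of the dataset, which is what keeps the candidate list small.
    """
    return [(s, r, o) for r in relations for s, o in permutations(entities, 2)]


tokens = ["Paris", "is", "in", "France"]
spans = enumerate_spans(tokens, max_len=2)
triples = candidate_triples(["Paris", "France"], ["located_in"])
```

In this toy case, one detected relation and two entities give two candidate triples. A model would then decide which candidate, if any, assigns the subject and object roles correctly.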