| Relation extraction is a popular research direction in natural language processing. Its main purpose is to identify the relations between entities in natural language text. It has been widely used in applications such as sentiment analysis, knowledge graph construction, and information retrieval. However, relation extraction faces many challenges, and one of the main ones is the long-tail problem: many relations appear only a few times in the dataset. Compared with high-frequency relations, long-tail relations have very few samples, which makes them difficult to extract and model accurately. The long-tail problem stems first from data imbalance. In natural language, common relations have a large number of samples, whereas relations and technical terms from many specific fields appear far less frequently, so rare relations must be extracted from a small amount of data. To address the long-tail problem in relation extraction, this paper carries out research in the following two aspects and implements a system for long-tail relation extraction. (1) Long-tail relation extraction based on multi-granularity semantic enhancement. To improve the ability of a single classifier on long-tail relations, this paper adopts a knowledge representation optimized with a pre-trained model according to the hierarchical structure of relation labels. By introducing a label-to-sentence attention mechanism with multi-granularity constraints between relations at different levels, it strengthens the single classifier's dependence on the label hierarchy and further mines the semantic information of relation labels, alleviating the long-tail problem and improving the performance of long-tail classes without compromising head relation performance. (2) Long-tail relation extraction based on an ensemble method. This paper introduces an ensemble 
mechanism to train multiple classifiers with diverse features and uses a routing module to control the selection of results among these classifiers. This addresses the problem that a single classifier is insufficient to learn long-tail relations and balances performance between the head relation classes and the long-tail relation classes, further alleviating the long-tail problem on the relation extraction dataset. On the large-scale benchmark dataset NYT, the proposed method proves effective and achieves strong performance not only on long-tail relations but also across all relations. (3) Implementation of an end-to-end long-tail relation extraction system. Based on the two research contributions above, this paper adopts a B/S (browser/server) architecture and uses the React front-end framework, Node. The long-tail relation extraction system on the terminal provides functions such as login and registration, relation extraction, and knowledge graph display. |
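
The routing idea in aspect (2) can be illustrated with a minimal sketch. The abstract does not specify how the routing module selects among classifiers, so everything here is an assumption made for illustration: the function names `softmax` and `route`, the use of softmax gates over router scores, and the choice between hard selection (take the top-scoring classifier's output) and soft mixing (weight all classifiers' outputs). It is not the paper's actual implementation.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(
    expert_logits: Sequence[Sequence[float]],
    router_scores: Sequence[float],
    hard: bool = False,
) -> List[float]:
    """Combine per-classifier relation logits using router scores.

    expert_logits: one list of relation logits per classifier (hypothetical).
    router_scores: one score per classifier from a routing module (hypothetical).
    hard=True  -> return the distribution of the single best-scored classifier.
    hard=False -> return the gate-weighted mixture of all distributions.
    """
    expert_probs = [softmax(logits) for logits in expert_logits]
    gates = softmax(router_scores)
    if hard:
        best = max(range(len(gates)), key=gates.__getitem__)
        return expert_probs[best]
    num_relations = len(expert_probs[0])
    return [
        sum(g * probs[i] for g, probs in zip(gates, expert_probs))
        for i in range(num_relations)
    ]


# Two classifiers over three relation classes: the first favours a head
# relation (index 0), the second favours a long-tail relation (index 2).
expert_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
router_scores = [0.0, 3.0]  # the router prefers the second classifier

mixed = route(expert_logits, router_scores)
picked = route(expert_logits, router_scores, hard=True)
predicted = max(range(len(mixed)), key=mixed.__getitem__)
```

With these made-up inputs the router favours the second classifier, so both the soft mixture and the hard selection predict the long-tail class at index 2. Soft mixing keeps a contribution from every classifier, while hard selection commits to one; which of the two (if either) matches the paper's routing module is not stated in the abstract.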