| As a key technology of artificial intelligence, the knowledge graph supports applications such as question answering, semantic search, and precise matching through its powerful knowledge representation. Many industries, including innovation and entrepreneurship, medicine, and finance, are now constructing knowledge graphs for their own domains, and the core of this work is entity and relation extraction. It is therefore of great significance to study how to extract the required entities and relations from massive data in order to construct knowledge graphs. Because of the characteristics of the Chinese language, most advanced English entity and relation extraction methods cannot be transferred directly to Chinese, and applying them as-is is prone to error propagation. Introducing an external lexicon can effectively address this problem. This paper studies lexicon-enhanced entity and relation extraction methods tailored to the characteristics of Chinese. The details are as follows: (1) This paper summarizes and analyzes existing advanced Chinese entity recognition methods. Most current lexicon-enhanced methods focus on how to introduce semantic information from an external lexicon and ignore the influence of that information on the different stages of the entity recognition task. An entity recognition method based on fused representation and decision enhancement is proposed: information about matched words from the lexicon is introduced in both the representation stage and the decision stage, which improves model performance and addresses two problems, namely the insufficient semantic information of characters and the neglected interaction between characters and matched words. Experimental results show that the method outperforms the baselines, and that using the lexicon in two stages is more effective for Chinese entity recognition than using no lexicon or using it in only one stage. (2) Building on this research, the datasets on which the model's performance improved only slightly are analyzed. It is found that, on highly colloquial datasets, lexicon information improves model performance but also introduces a large number of noisy matched words, which may interfere with recognition. This problem is analyzed further, and an entity recognition method based on multi-task learning and effective word selection is proposed. One task is word selection: three learning strategies based on deep semantics are designed to train a word scoring model that filters out noisy words and selects the more helpful matched words. The other task integrates these more helpful matched words into entity recognition. The proposed method is validated on three datasets, and the experimental results show that it outperforms the baselines, with a more pronounced improvement on the highly colloquial datasets. (3) This paper summarizes and analyzes existing advanced Chinese relation extraction methods. To address the lack of local semantic information and the slow computation of existing advanced lexicon-enhanced Chinese relation extraction methods, a relation extraction method based on multi-head attention and lexicon enhancement is proposed: lexicon information is introduced in both the representation stage and the encoding stage, and multi-head attention is used to fuse characters with matched words so that the model automatically selects the appropriate sense of polysemous words. The proposed method is validated on two datasets, and the experimental results show that it achieves slightly higher performance and faster computation than the advanced lexicon-enhanced relation extraction method. |
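
The notion of "matched words" that runs through all three contributions can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `match_lexicon`, the `max_len` parameter, the toy lexicon, and the example sentence are all assumptions made for illustration. For each character, the sketch collects every lexicon word of two or more characters that covers it.

```python
def match_lexicon(sentence, lexicon, max_len=4):
    """Return, for each character position, the lexicon words covering it.

    Hypothetical helper for illustration only; `max_len` caps the length
    of candidate words and is an assumed parameter.
    """
    matches = [[] for _ in sentence]
    for start in range(len(sentence)):
        # Candidate words have at least 2 and at most max_len characters.
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                # Attach the word to every character it spans.
                for i in range(start, end):
                    matches[i].append(word)
    return matches


# Toy lexicon and sentence (assumed, not taken from the paper's datasets).
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
sentence = "南京市长江大桥"
matches = match_lexicon(sentence, lexicon)

for char, words in zip(sentence, matches):
    print(char, words)
```

In this toy sentence the character "市" is matched by both "南京市" and "市长", and only the former fits the intended reading. Spurious matches of this kind are the sort of noisy matched word that the word selection task in (2) is meant to filter out.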