Sememes are defined by linguists as the smallest indivisible semantic units: any meaning of a word can be represented by a combination of elements from a finite, closed set of sememes. Sememe annotations mainly come from the HowNet knowledge base, which contains two million Chinese and English words labelled with sememe information and is widely used in Natural Language Processing research. However, sememe information is maintained manually by linguists, which is costly and makes large-scale application impractical. Moreover, sememes are annotated only for Chinese and English, which limits their use in other languages. Researchers have therefore proposed the sememe prediction and cross-lingual sememe prediction tasks to annotate sememe information automatically in monolingual and multilingual settings. Existing work mainly relies on the embedding of the target word or depends on external information such as wiki or dictionary descriptions. For cross-lingual sememe prediction, existing methods align the two languages and thereby reduce the problem to a single-language prediction task. In this paper, sememe prediction is redefined from the perspective of inter-word relations. By introducing HIT Cilin Extended, a Chinese synonym knowledge base, the word relations in the Cilin Extended knowledge graph and the HowNet knowledge graph are fused into a new knowledge graph, called CH-Graph, that provides information about the relationships between words. For sememe prediction, inspired by graph translation models, the task is transformed into predicting the tail entity that corresponds to the target word in the knowledge graph. On this basis, a knowledge-based sememe prediction model, KGSP, is proposed, which completes the sememe prediction task using the relation information in CH-Graph. For cross-lingual sememe prediction, we follow the existing "alignment" and "prediction" model structure. We propose a knowledge-enhanced cross-lingual word vector alignment method and a series of cross-lingual sememe prediction models, CKSP-S, CKSP-V, and CKSP-D, which incorporate knowledge graph information on both the source language (Chinese) side and the target language (other languages) side. Finally, experimental results demonstrate the feasibility and effectiveness of the knowledge-based prediction models: on both the sememe prediction and cross-lingual sememe prediction tasks, they improve on the prediction precision of existing models and achieve the intended goals of the study. A prototype system was also developed to demonstrate the methods and provide them as an algorithm service.
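To illustrate what treating sememe prediction as tail-entity prediction can look like, here is a minimal sketch that assumes a TransE-style scoring function, one common graph translation model; the abstract does not say which translation model KGSP builds on. The names `transe_score` and `rank_sememes`, the relation name `has_sememe`, and the toy two-dimensional embeddings are all hypothetical and chosen only for illustration; this is not the KGSP model itself.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance ||h + r - t||.

    Higher (closer to zero) means the triple (head, relation, tail)
    is considered more plausible.
    """
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def rank_sememes(word_vec, relation_vec, sememe_vecs, top_k=2):
    """Rank candidate sememes as tail entities for (word, relation, ?)."""
    scored = [
        (name, transe_score(word_vec, relation_vec, vec))
        for name, vec in sememe_vecs.items()
    ]
    # Best-scoring (least distant) candidates first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_k]]


# Toy embeddings (assumed values, not learned from CH-Graph).
word = [1.0, 0.0]          # embedding of the target word (head entity)
has_sememe = [0.0, 1.0]    # embedding of a hypothetical word-to-sememe relation
sememes = {
    "human": [1.0, 1.0],
    "animal": [1.0, 2.0],
    "tool": [-3.0, -3.0],
}

predicted = rank_sememes(word, has_sememe, sememes, top_k=2)
print(predicted)
```

In a trained model the embeddings would be learned from the triples of the knowledge graph, and the top-ranked tails would be taken as the predicted sememes of the target word.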