| With the development of education informatization, building a high-quality educational knowledge graph has become particularly important. Entity alignment is a key step in knowledge graph construction: it integrates knowledge from different graphs into a large-scale, standardized, and unified graph that better serves downstream applications. However, the namespaces of different knowledge graphs differ considerably, and this heterogeneity makes entity alignment very challenging. Most existing entity alignment methods consider only the relational triples of the knowledge graph and train iteratively with knowledge representation techniques to obtain the corresponding vector spaces. To merge the different vector spaces into one, traditional entity alignment methods use pre-aligned entity pairs as seeds for constrained training. This approach requires manual effort, is inefficient, and does not fully exploit the attribute information in the knowledge graph. For educational knowledge graphs, which are rich in attribute information, making use of that information is the key to entity alignment. In addition, the knowledge in educational knowledge graphs comes mostly from the Internet, so its quality is a serious concern, and incorrect knowledge in education can have immeasurable consequences; ensuring the quality of the educational knowledge graph is therefore another major difficulty. To address these two difficulties, this thesis studies entity alignment methods and evaluation methods for educational knowledge graphs in depth. The main work consists of the following three parts: (1) This thesis proposes a new method that jointly uses relational triples and attribute triples to represent entities as vectors, combining a pre-trained language model with knowledge representation techniques for iterative training. For relational triples, this thesis transfers the powerful semantic representation capabilities of 
BERT (Bidirectional Encoder Representations from Transformers) to the initialization stage of the TransE model and then trains TransE iteratively starting from that vector space, improving the efficiency and accuracy of entity vector representation at the structural level. In addition, attribute triples are encoded as vectors with BERT, and the attribute-level entity vector is obtained by following the translation model idea and a TF-IDF weight allocation strategy. Finally, the structure-level and attribute-level vectors are combined to obtain the final entity vector. (2) Based on the quality dimensions of knowledge graphs and the entity alignment process, this thesis selects two indicators, entity redundancy rate and information missing rate, to reflect entity quality, and proposes a BERT-based multi-feature entity evaluation method that makes comprehensive use of the multi-dimensional information of entities. First, a clustering algorithm reduces the number of entity pairs to be compared; then the semantic similarity of entities is computed from their multi-faceted information to identify redundant entity pairs and simplify the structure of the knowledge graph. In addition, with the help of an open knowledge graph platform, information is compared through multi-level filters to obtain the set of missing entity information, which can be used to complete the knowledge effectively. (3) This thesis builds an entity alignment and quality evaluation platform for educational knowledge graphs that integrates the entity alignment method and evaluation method proposed above. To achieve high availability and high throughput, the platform's internal components call one another following a microservice design, and all components are deployed in a cluster. |
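
The representation step in part (1) can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the usual TransE distance, reads the "translation model idea" for attribute triples as estimating the entity from value minus attribute, and uses a smoothed TF-IDF formula and plain concatenation for the joint vector. All function names, the toy vectors (standing in for BERT embeddings), and the toy corpus are hypothetical.

```python
import math

def transe_score(h, r, t):
    """TransE distance ||h + r - t||_2; a lower score means a more plausible triple."""
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(h, r, t)))

def tfidf_weights(names, corpus):
    """One normalized TF-IDF weight per attribute name of a single entity.

    names  : attribute names of the entity's attribute triples
    corpus : attribute-name lists of all entities (assumed layout)
    """
    n_docs = len(corpus)
    raw = []
    for name in names:
        tf = names.count(name) / len(names)
        df = sum(1 for doc in corpus if name in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF (assumption)
        raw.append(tf * idf)
    total = sum(raw)
    return [w / total for w in raw]

def attribute_level_vector(triples, corpus):
    """Weighted sum of per-triple entity estimates.

    Each triple is (attribute_name, attribute_vec, value_vec). Following the
    translation idea e + a ~ v, each triple estimates the entity as v - a
    (an assumed reading of the abstract).
    """
    names = [name for name, _, _ in triples]
    weights = tfidf_weights(names, corpus)
    dim = len(triples[0][1])
    out = [0.0] * dim
    for w, (_, a_vec, v_vec) in zip(weights, triples):
        for i in range(dim):
            out[i] += w * (v_vec[i] - a_vec[i])
    return out

def joint_vector(structure_vec, attr_vec):
    """Joint entity vector by concatenation (assumed combination rule)."""
    return list(structure_vec) + list(attr_vec)

# Toy data: 2-dimensional vectors stand in for BERT embeddings.
corpus = [["name", "grade"], ["name", "subject"], ["name"]]
triples = [
    ("name", [0.1, 0.0], [0.6, 0.2]),
    ("grade", [0.0, 0.3], [0.4, 0.9]),
]
attr_vec = attribute_level_vector(triples, corpus)
entity_vec = joint_vector([0.5, -0.1], attr_vec)
```

In this sketch a rarer attribute name such as `grade` receives more weight than a ubiquitous one such as `name`, which is the usual motivation for TF-IDF weighting; the weighting actually used in the thesis may differ.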