A knowledge graph is a graph-based model for describing the relationships between entities in the world. It provides an efficient way to organize, manage, and understand massive amounts of information, and plays an increasingly important role in semantic search, intelligent question answering, language understanding, and other fields. Existing open-domain and domain-specific knowledge graph extensions rarely treat images as objects of knowledge extraction, so these knowledge graphs remain largely text-based and cannot support visual queries based on image semantics. To extend a multi-modal knowledge graph with multi-modal data, this paper takes computer education as an example domain and designs an image semantic association and extension system based on a multi-modal knowledge graph, which helps users associate image semantics with an open knowledge graph. The main research contents are as follows:

1) An entity linking method that computes multi-modal information correlation is proposed for the computer science domain. First, YOLOv5 identifies the visual entities in an image, and triples are extracted from the image's text description and filtered against those visual entities. Then, Visual Entity Link Rules (VELR) are proposed for the computer field. To complete the link effectively, the visual entities and then the head and tail entities of the filtered triples are used in turn to find the best link point, with different link strategies set for different link points. In addition, the rules make full use of the visual entities and the filtered triples to expand the associated entities.

2) A multi-modal knowledge representation learning entity linking method based on a self-attention gating mechanism is proposed. First, a multi-modal knowledge representation learning model is proposed to integrate multi-modal features: VGG16 extracts the feature vector of the enhanced image, while BERT and a CNN extract word-level and phrase-level features from the text description. A self-attention mechanism and a gated neural network then fuse the vectors of the different modalities into a single multi-modal feature vector. Image reference entities and text reference entities are obtained by YOLOv5 and a named entity recognition model, respectively, and as many candidate entities as possible are retrieved from the multi-modal knowledge graph by similarity calculation to form a candidate entity set. The similarity between the multi-modal feature vectors of a reference entity and each candidate entity is computed, and the candidate with the highest similarity is selected as the link point.

3) An image semantic association and extension system based on a multi-modal knowledge graph is designed using the two multi-modal entity linking methods proposed above. Users can choose either linking method to associate an image with the corresponding nodes of the multi-modal knowledge graph, and then extend the linked nodes by combining the image's text description.
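The rule-based pipeline in 1) can be sketched as follows. This is a simplified illustration, not the actual VELR implementation: the detector output, the triple format, and the "exists in the knowledge graph" rule are all stand-in assumptions, and the toy entity names are hypothetical.

```python
# Hypothetical sketch: filter triples from the image's text description against
# the visual entities detected by YOLOv5, then pick a link point by trying the
# visual entities first, then the heads, then the tails of the filtered triples.

def filter_triples(visual_entities, triples):
    """Keep (head, relation, tail) triples whose head or tail was detected."""
    vis = set(visual_entities)
    return [t for t in triples if t[0] in vis or t[2] in vis]

def find_link_point(visual_entities, filtered_triples, kg_nodes):
    """Try candidate names in priority order; simplistic membership rule."""
    candidates = (list(visual_entities)
                  + [h for h, _, _ in filtered_triples]
                  + [t for _, _, t in filtered_triples])
    for c in candidates:
        if c in kg_nodes:  # stand-in for the thesis's link strategies
            return c
    return None

# Toy example: detections from a computer image, triples from its caption.
detected = ["CPU", "motherboard"]
triples = [("CPU", "is part of", "computer"),
           ("keyboard", "connects to", "computer")]
kg = {"CPU", "computer", "memory"}

kept = filter_triples(detected, triples)
print(kept)
print(find_link_point(detected, kept, kg))
```

Only the first triple survives filtering (its head was detected), and "CPU" becomes the link point because it is the first candidate present in the graph.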
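The fusion step in 2) can be illustrated with a minimal numeric sketch. The three short vectors stand in for VGG16 image features and BERT/CNN word- and phrase-level text features (real features would be high-dimensional and the gate would be learned); this is an assumption-laden toy, not the thesis's model.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Scaled dot-product self-attention; queries = keys = values."""
    d = len(vectors[0])
    out = []
    for q in vectors:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in vectors])
        out.append([sum(w * v[i] for w, v in zip(weights, vectors))
                    for i in range(d)])
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(attended):
    """Gate each attended vector by a scalar, then sum into one vector."""
    d = len(attended[0])
    fused = [0.0] * d
    for v in attended:
        g = sigmoid(sum(v) / d)  # stand-in for a learned gating network
        for i in range(d):
            fused[i] += g * v[i]
    return fused

image_vec  = [0.9, 0.1, 0.0, 0.2]  # VGG16 stand-in
word_vec   = [0.2, 0.8, 0.1, 0.0]  # BERT stand-in
phrase_vec = [0.1, 0.3, 0.7, 0.1]  # CNN stand-in

fused = gated_fusion(self_attention([image_vec, word_vec, phrase_vec]))
print(len(fused))
```

Each modality vector first attends over all three vectors, then a sigmoid gate decides how much of it flows into the single fused multi-modal feature vector.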
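The candidate-selection step in 2) reduces to a nearest-neighbor search over fused vectors. The sketch below uses cosine similarity and toy three-dimensional vectors with hypothetical entity names; the thesis's actual similarity measure and feature dimensions are not specified here.

```python
import math

# Compare a reference entity's fused feature vector against each candidate
# entity's vector and return the most similar candidate as the link point.

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def best_candidate(reference_vec, candidates):
    """candidates: dict mapping entity name -> multi-modal feature vector."""
    return max(candidates,
               key=lambda name: cosine(reference_vec, candidates[name]))

ref = [0.9, 0.1, 0.3]                      # fused vector of the reference entity
cands = {"CPU":      [0.8, 0.2, 0.3],
         "keyboard": [0.1, 0.9, 0.2],
         "monitor":  [0.3, 0.3, 0.9]}
print(best_candidate(ref, cands))
```

Here "CPU" is selected because its vector points in nearly the same direction as the reference vector, which mirrors choosing the highest-similarity candidate as the link point.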