A knowledge graph is a structured semantic knowledge base that describes real-world concepts (entities) and their interrelationships, enabling rapid knowledge retrieval and reasoning. With the 2022 Beijing Winter Olympics, the amount of information about the Winter Olympics on the Internet has grown, and so has public demand for Winter Olympics-related knowledge. However, online information in this field suffers from poor relevance and limited forms of presentation. Furthermore, data from different sources exist in isolation and have complex, varying structures, which restricts the sharing and interaction of information. Knowledge graphs offer a novel solution to these issues, yet there is little research on knowledge graphs in the Winter Olympics domain. To address this gap, this paper focuses on the Winter Olympics domain and explores the extraction of knowledge from diverse data sources with varying structures to construct a Winter Olympics knowledge graph. The main research components are as follows. (1) To construct a high-quality knowledge graph of the Winter Olympics, the domain was first studied in depth and a domain ontology model was constructed. The data sources for knowledge extraction were identified by analyzing various types of multisource heterogeneous data on the Internet. Within the framework of the domain ontology, two extraction pipelines were designed and implemented: knowledge extraction from semi-structured data on encyclopedic websites and the official website of the International Olympic Committee, and knowledge extraction from unstructured text of Winter Olympics information. Finally, after knowledge fusion, the extracted knowledge graph triples are transferred to the Neo4j graph database to complete the construction
of the Winter Olympics knowledge graph. (2) The study extracts the entities and relationships needed to construct the Winter Olympics knowledge graph from large-scale sports news text data. This data source contains the most information and, because of its size, poses the greatest challenge. To address this challenge, the study proposes an annotation method and knowledge graph triple extraction rules based on deep learning. These methods account for overlapping entities and relations when extracting from sentences under domain-specific, predefined relation types. The proposed annotation scheme annotates entities and relations simultaneously, converting the joint entity-relation extraction task into a sequence labeling problem. Building on this scheme and on the idea of multi-task classification, a multilayer CRF joint entity-relation extraction model that shares a RoBERTa-wwm encoding layer (RoBERTa-wwm-Multilayer-CRF, RMC) is constructed. The model performs multiple joint sequence labeling tasks simultaneously within a single model, solving the problem of overlapping entity relations and providing a large number of knowledge representations for the construction of the Winter Olympics knowledge graph. (3) A knowledge fusion strategy is developed based on the characteristics of multisource data. A synonym-list check is combined with unsupervised text similarity calculation to fuse the knowledge triples extracted from semi-structured web pages and unstructured information texts. This reduces duplicate triples and incomplete information in the knowledge graph. Finally, to ensure that the knowledge stored in the Neo4j graph database conforms to the pre-built ontology model, this paper uses custom mapping rules and manually written mapping statements to transfer data from the MySQL database to the Neo4j graph database, completing the storage and visualization of the knowledge graph. The
aforementioned study uses the knowledge graph as a form of knowledge organization to organize fragmented, heterogeneous Winter Olympics knowledge from multiple sources. This approach facilitates the rational use of Winter Olympics data by managers and enables users to quickly and accurately filter useful information and make timely, precise decisions. Furthermore, this study provides crucial data support for the development of subsequent intelligent services in the field of Winter Olympics knowledge.
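
The knowledge fusion step in component (3), a synonym-list check combined with text similarity, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `SYNONYMS` table, the relation names, the 0.85 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure (the abstract does not say which measure is used).

```python
from difflib import SequenceMatcher

# Hypothetical synonym list: alias -> canonical entity name.
SYNONYMS = {
    "Beijing 2022": "2022 Beijing Winter Olympics",
    "IOC": "International Olympic Committee",
}


def canonical(name, synonyms=SYNONYMS):
    """Map an entity mention to its canonical name via the synonym list."""
    name = name.strip()
    return synonyms.get(name, name)


def similar(a, b, threshold=0.85):
    """Character-level similarity check; the threshold is an assumed value."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def same_entity(a, b):
    """Two mentions match if they share a canonical name or are near-identical."""
    a, b = canonical(a), canonical(b)
    return a == b or similar(a, b)


def fuse(triples):
    """Merge (head, relation, tail) triples, dropping duplicates across sources."""
    fused = []
    for head, rel, tail in triples:
        head, tail = canonical(head), canonical(tail)
        duplicate = any(
            rel == r and same_entity(head, h) and same_entity(tail, t)
            for h, r, t in fused
        )
        if not duplicate:
            fused.append((head, rel, tail))
    return fused


if __name__ == "__main__":
    triples = [
        ("Beijing 2022", "organized_by", "IOC"),
        ("2022 Beijing Winter Olympics", "organized_by",
         "International Olympic Committee"),
        ("2022 Beijing Winter Olympics", "held_in", "Beijing"),
    ]
    for triple in fuse(triples):
        print(triple)
```

In this sketch the synonym list handles aliases that share little surface text, while the similarity check catches spelling variants of the same name. The pairwise comparison is quadratic in the number of triples, so a real pipeline would likely need blocking or indexing.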