Education is trending toward informatization and intelligent systems, and the construction of educational knowledge graphs is an important means of realizing educational informatization. Manually constructed knowledge graphs are labor-intensive and time-consuming to build, while automatically constructed ones tend to be of low quality. Previous semi-automatic construction methods target knowledge graphs for entire subject areas; because a subject area is broad and contains many concepts, construction still requires substantial manpower. This paper instead starts from the curriculum and builds the subject knowledge graph from the bottom up. It proposes a set of methods covering data collection and subject knowledge fusion, and develops a system that realizes knowledge sharing and supports teachers' teaching and question answering. The research work mainly includes the following points:

(1) Based on the characteristics of computer science, this paper proposes a computer science knowledge graph construction scheme that combines multiple technologies. It is divided into three parts: data acquisition, computer knowledge ontology construction, and knowledge fusion. First, a crawler framework is used to obtain computer science data from complex and diverse web sources. Second, a data cleaning plan is designed according to the characteristics of each data source, and structured data is used as supervision to build a preliminary framework for the computer science ontology. To enrich the knowledge ontology, attribute information of computer science concepts is extracted from "Own Think". To reduce the labor cost of obtaining descriptions of subject concepts from unstructured data, a template-based labeling strategy, TB-KPALS, is proposed and combined with active learning to label data; experiments show that this method effectively reduces manual labeling.

(2) The knowledge fusion method proposed in this paper is divided into data preprocessing, knowledge point creation constraints, and automatic fusion of knowledge points under a subject. First, data from the MOOC website is preprocessed so that no duplicate values appear. Second, when knowledge points are imported into the database, the system automatically checks for duplicates and inserts each knowledge point into the correct path using a constraint algorithm named KPCA. Finally, for semantically duplicate knowledge points, this paper proposes a fusion algorithm named SFAJED, which uses an improved Jaccard coefficient combined with edit distance to calculate the similarity between knowledge points. When the system finds similar nodes, it automatically transfers the relationships between those nodes and re-establishes the knowledge network under the subject.

(3) This paper designs and develops a knowledge sharing platform, which is connected to the teaching resource platform of computer foundation in Shanghai higher education institutions and the question-answering system of Metasequoia Encyclopedia. Teachers can quickly and easily create a syllabus in the system and apply it in the question-answering system to support their teaching and question-answering activities. The system also provides data and technical support for future higher-level applications.
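
To make the similarity step in point (2) concrete, the following is a minimal sketch of how a Jaccard coefficient and edit distance might be combined to compare two knowledge point names. It is not the SFAJED algorithm itself: the abstract does not specify the "improved" Jaccard variant, so this sketch assumes a plain Jaccard coefficient over character bigrams, a length-normalized Levenshtein distance, and a weighted average of the two. The function names, the weight `alpha`, and the `threshold` value are all hypothetical choices made for illustration.

```python
from __future__ import annotations


def char_ngrams(text: str, n: int = 2) -> set[str]:
    """Return the set of character n-grams of a normalized string."""
    text = text.lower().strip()
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: str, b: str) -> float:
    """Plain Jaccard coefficient over character bigrams (assumed variant)."""
    sa, sb = char_ngrams(a), char_ngrams(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance into [0, 1], where 1 means identical strings."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def combined_similarity(a: str, b: str, alpha: float = 0.5) -> float:
    """Weighted average of the two measures; alpha is a hypothetical weight."""
    return alpha * jaccard(a, b) + (1.0 - alpha) * edit_similarity(a, b)


def is_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    """Flag two knowledge point names as merge candidates (hypothetical threshold)."""
    return combined_similarity(a, b) >= threshold


if __name__ == "__main__":
    for pair in [("binary tree", "Binary Trees"), ("binary tree", "hash table")]:
        print(pair, round(combined_similarity(*pair), 3), is_duplicate(*pair))
```

In a fusion pipeline of the kind described, pairs flagged by such a check would be the candidates whose relationships are transferred to a single surviving node. Both the weight and the threshold would need to be tuned on labeled data, and purely string-based measures of this kind cannot detect synonyms that share few characters.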