
Research And Implementation Of Domain Knowledge Graph Construction Method Based On Deep Learning

Posted on: 2022-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: X M Yang
GTID: 2518306341951649
Subject: Computer Science and Technology
Abstract/Summary:
With the development of artificial intelligence, knowledge graphs are now applied in many areas of intelligent information services, such as intelligent question answering systems, personalized push, and intelligent information retrieval. Knowledge graphs help computers learn how humans communicate in language, so that computers can "think" more like humans and information services can return more intelligent answers to users. The knowledge graph can be seen as a direction for integrating traditional industries with artificial intelligence, and as an essential link in moving artificial intelligence from research to application.

An industry knowledge graph is a knowledge graph built from industry data for a specific industry. Compared with general knowledge graphs, it emphasizes depth of knowledge. In the general domain, both academia and industry already have large-scale annotated data for training knowledge graph construction models, and entity and relation extraction techniques have made great progress. In vertical domains, however, annotated corpora are scarce and manual annotation costs money and effort. Moreover, as the business changes, the types of entities and relations keep changing, so existing annotated data cannot be applied to the new types. These difficulties make knowledge graphs in vertical domains very hard to construct.

To address the difficulty and inefficiency of constructing industry knowledge graphs, this thesis studies how to extract industry knowledge and construct the industry knowledge graph automatically, efficiently, and accurately. First, to build an augmented data set for training knowledge extraction models, it proposes an augmented data generation method based on dictionaries and cross-enhancement. Then, for the semi-structured and unstructured data in industry product documents, it designs and implements two algorithms: Bi-LSTM-CRF-SSG, an automatic table knowledge extraction algorithm based on sequence labeling and sub-schema layer generation, and BERT-PGM, a joint entity and relation extraction algorithm based on BERT and a probabilistic graphical model. Experiments show that, after subsequent fault-tolerant processing, the semi-structured data knowledge extraction algorithm achieves 99.13% accuracy and the unstructured data knowledge extraction algorithm achieves 95.7% accuracy on our data. Finally, the thesis implements a system for automatic domain knowledge graph construction and tests it in terms of function and performance. The test results show that the function and performance of the system meet the needs of users.
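The abstract does not describe how the dictionary-based part of the data generation method works, so the following is only a minimal sketch of one common form of dictionary-based augmentation for sequence labeling, not the thesis's method: each BIO-tagged entity span is swapped for another entry of the same type drawn from a domain dictionary. The function name, the BIO tag scheme, the entity types, and the example product names are all assumptions made for illustration; the cross-enhancement step is not covered.

```python
import random


def augment_by_dictionary(tokens, tags, dictionary, rng):
    """Return a new (tokens, tags) pair in which every BIO-tagged entity span
    is replaced by a randomly chosen dictionary entry of the same type.

    Hypothetical helper: `dictionary` maps an entity type to a list of
    non-empty, whitespace-separated entity strings.
    """
    out_tokens, out_tags = [], []
    i = 0
    while i < len(tokens):
        tag = tags[i]
        if tag.startswith("B-"):
            etype = tag[2:]
            # Find the end of the current entity span.
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + etype:
                j += 1
            candidates = dictionary.get(etype)
            if candidates:
                replacement = rng.choice(candidates).split()
            else:
                # No dictionary entries for this type: keep the original span.
                replacement = tokens[i:j]
            out_tokens.extend(replacement)
            out_tags.extend(["B-" + etype] + ["I-" + etype] * (len(replacement) - 1))
            i = j
        else:
            # Non-entity tokens are copied unchanged.
            out_tokens.append(tokens[i])
            out_tags.append(tag)
            i += 1
    return out_tokens, out_tags


# Illustrative data only; the product and feature names are made up.
tokens = ["The", "X200", "router", "supports", "Wi-Fi", "6"]
tags = ["O", "B-PRODUCT", "I-PRODUCT", "O", "B-FEATURE", "I-FEATURE"]
dictionary = {"PRODUCT": ["S5700 switch", "AR1200"], "FEATURE": ["VLAN"]}

new_tokens, new_tags = augment_by_dictionary(tokens, tags, dictionary, random.Random(0))
print(list(zip(new_tokens, new_tags)))
```

Because the labels are regenerated together with the replacement tokens, every augmented sentence stays consistently annotated, which is what makes this kind of substitution usable when annotated corpora are scarce.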
Keywords/Search Tags:Domain knowledge graph, Knowledge extraction, Data augmentation, Sequence labeling, Probability graph model