
Research On Multi-source Data Fusion For The Question And Answer Of Subject Knowledge

Posted on: 2021-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: L L Liu
Full Text: PDF
GTID: 2518306521463274
Subject: Information Science
Abstract/Summary:
In the life sciences and medicine, scientific and technological data resources, including papers, patents, clinical trials, diseases and drugs, have grown explosively. These massive data resources promote data-driven knowledge discovery and technological breakthroughs, but they also challenge researchers to use data efficiently and to discover in-depth knowledge. Subject knowledge question answering (SKQA) combines natural language processing, knowledge organization, information retrieval, machine learning and other technologies to mine, correlate and reorganize the "fragmented" data of a subject field, so that users' questions can be answered more comprehensively and accurately; it is a typical application of subject knowledge discovery. Multi-source heterogeneous data fusion is one of the key technologies and core research issues of SKQA, and has become both a focus and a difficulty of current research on discipline knowledge organization and intelligent knowledge discovery in the life sciences and medicine.

The knowledge graph (KG) is a new knowledge organization technology for multi-dimensional, fine-grained fusion of multi-type, multi-source data. For data organization, a knowledge graph enables multi-level, fine-grained and semantically rich organization of the knowledge units within data resources. For services, it supports knowledge discovery applications such as intelligent retrieval, knowledge question answering and knowledge mining, which promotes the transformation from information services to knowledge services. It has therefore become an important technology for fusing scientific and technological data.

To meet the needs of subject knowledge services, and to address the essential problem of multi-source data fusion in SKQA, this thesis reviews the theory, methods and key technologies of knowledge-graph-based multi-source data fusion for subject fields, and focuses on methods of knowledge entity alignment in the life sciences and medicine. Taking hematopoietic stem cell cancer treatment (HSCCT) as an empirical case, it constructs an HSCCT knowledge graph through multi-source data fusion and describes the process and advantages of HSCCT SKQA. The main work of the thesis is as follows:

(1) To address the core problem of multi-source data fusion in SKQA, an efficient method system for knowledge entity alignment in the life sciences and medicine is proposed. The method is based on the Unified Medical Language System (UMLS) and uses Atom Mapping, Term Mapping, Sub-term Mapping and Semantic Type Mapping to align domain knowledge entities more comprehensively and accurately. The method is shown to outperform knowledge entity alignment methods based on character similarity and on semantic similarity.

(2) Based on the proposed knowledge entity alignment method, an HSCCT knowledge graph was constructed by fusing papers, patents, diseases, genes and other multi-source data. The knowledge graph includes 14 types of knowledge entities and 39 types of semantic relations, with a total of 498,237 knowledge entity nodes and 2,743,269 relations.

(3) Based on the HSCCT knowledge graph and the Neo4j database, a classification system for HSCCT SKQA is designed. The query process based on this classification system is described, and the query advantages of SKQA are summarized. These advantages confirm the effect of knowledge entity alignment at the application level. HSCCT SKQA can provide knowledge services such as explicit queries on knowledge entities and semantic relations, and tacit knowledge question answering based on knowledge reasoning. Compared with traditional information retrieval services, its answers are more comprehensive, richer and more accurate.

In summary, starting from the specific demand of SKQA for multi-source heterogeneous data fusion, this thesis proposes an efficient knowledge entity alignment method, constructs an HSCCT knowledge graph based on that method, and summarizes the service advantages of HSCCT SKQA. The proposed knowledge entity alignment method can more effectively achieve fine-grained, in-depth integration and reuse of multi-source heterogeneous data in the life sciences and medicine, and the HSCCT knowledge graph can support more comprehensive, accurate and intelligent SKQA applications.
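The abstract names four mapping stages but does not describe how they work. The following is a minimal sketch of one plausible reading, assuming the stages form a cascade from strictest to loosest match, with the semantic type used as a filter on candidates. The lexicon, the concept identifiers (`C_AML`, `C_HSC`), the type labels and the functions `normalize`, `sub_term_match` and `align` are all hypothetical stand-ins for illustration; they are not UMLS data and not the thesis's implementation.

```python
import re
from typing import Optional, Tuple

# Toy stand-in for a UMLS-style lexicon. Every identifier here is made up.
ATOMS = {  # exact surface string -> concept id
    "Acute Myeloid Leukemia": "C_AML",
    "AML": "C_AML",
    "Hematopoietic Stem Cell": "C_HSC",
}
SEMANTIC_TYPES = {"C_AML": "Disease", "C_HSC": "Cell"}


def tokenize(text: str) -> list:
    """Lowercase the text and keep only alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def normalize(text: str) -> str:
    """Order-insensitive key: lowercase tokens, sorted and joined."""
    return " ".join(sorted(tokenize(text)))


# Normalized string -> concept id, derived from the atom table.
TERMS = {normalize(s): cid for s, cid in ATOMS.items()}


def sub_term_match(mention: str) -> Optional[str]:
    """Return the concept of the longest proper token window found in TERMS."""
    tokens = tokenize(mention)
    for size in range(len(tokens) - 1, 0, -1):
        for start in range(len(tokens) - size + 1):
            key = " ".join(sorted(tokens[start:start + size]))
            if key in TERMS:
                return TERMS[key]
    return None


def align(mention: str,
          expected_type: Optional[str] = None) -> Optional[Tuple[str, str]]:
    """Try atom, term, then sub-term matching; filter by semantic type."""
    candidates = (
        ("atom", ATOMS.get(mention)),
        ("term", TERMS.get(normalize(mention))),
        ("sub-term", sub_term_match(mention)),
    )
    for stage, concept in candidates:
        if concept is None:
            continue
        if expected_type is not None and SEMANTIC_TYPES.get(concept) != expected_type:
            continue  # semantic type disagrees with the source entity type
        return concept, stage
    return None


if __name__ == "__main__":
    for mention in ("AML", "leukemia, acute myeloid",
                    "relapsed acute myeloid leukemia"):
        print(mention, "->", align(mention, expected_type="Disease"))
```

In this sketch, two mentions from different sources are treated as aligned when they resolve to the same concept identifier, which is what would let records from papers, patents and disease databases attach to a single node in the graph.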
Keywords/Search Tags:Multi-source Data fusion, Entity alignment, Relation fusion, Knowledge graph, Knowledge discovery, Knowledge QA, HSCCT