Knowledge Graph Construction For Bioinformatics Tools And Generation Of New Entity Embeddings

Posted on: 2022-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Li
GTID: 2480306758492084
Subject: Computer Software and Application of Computer
Abstract/Summary:
Bioinformatics is a frontier field of the life and natural sciences. One of its main activities is developing tools that make biological data easier to acquire, analyze, and manage, giving researchers convenient access to data. With the rapid development of the field in recent years, excellent bioinformatics tools keep emerging. At the same time, because bioinformatics has numerous sub-fields, the range of related tools is large and complicated, which makes them difficult to learn and use. Knowledge graphs are expected to help with these problems. Since Google put forward the concept, knowledge graphs have been widely used in different fields to assist data storage, data analysis, decision-making, and so on. However, no knowledge graph for bioinformatics tools has emerged yet. Such a graph has strong practical value: it can accumulate specialized domain knowledge and support search, recommendation, and more accurate Q&A systems.

Downstream tasks such as knowledge graph reasoning and knowledge mining require the knowledge graph to be represented with embeddings. Because bioinformatics tools and software develop rapidly, the constructed knowledge graph needs to be iterated and updated continuously, and new entities will emerge frequently. How to represent new entities effectively in downstream tasks is therefore one of the difficulties in applying knowledge graphs. This thesis designs and constructs a knowledge graph of bioinformatics tools and proposes NEEGAT, a method for generating embeddings for newly emerging entities. The main contributions are as follows:

(1) Knowledge graph construction and visualization. A domain knowledge graph for bioinformatics tools is built. First, using browser automation and crawler technologies such as Selenium and Scrapy, information on tools, authors, the fields to which the tools belong, papers, journals, keywords, citations, and other items is collected, then aligned, screened, cleaned, deduplicated, and denoised. Second, the data is decomposed into triples. Finally, a graph database is introduced to visualize the knowledge graph, resulting in a large knowledge graph with nearly 40,000 entities and 200,000 triples.

(2) Development of a new-entity embedding generation algorithm. New entities keep emerging from the dynamically updated knowledge graph. To avoid retraining on the whole knowledge graph, NEEGAT, a new-entity embedding generation algorithm based on the graph attention network, is proposed. The algorithm uses TransE for pre-training to obtain the overall semantic information of the triples in the tool graph, uses logical attention to introduce the knowledge graph as external knowledge, and uses a multi-head graph attention network to further integrate the multi-dimensional link relationships between neighbor nodes. In addition, a BioTools dataset is built from the constructed knowledge graph, and experimental datasets are sampled from it for two downstream tasks, link prediction and triple classification, to test the method. The experimental results show that on both the link prediction task and the triple classification task, NEEGAT achieves the best overall performance among the compared methods, which indicates that the algorithm handles the new-entity embedding generation problem for knowledge graphs better.
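To make the idea of embedding a new entity without retraining more concrete, the following is a minimal sketch and not the NEEGAT algorithm itself. It assumes toy two-dimensional embeddings that stand in for TransE pre-training, hypothetical entity and relation names, and a simple agreement-based attention weight in place of the logical attention and multi-head graph attention described above. Each known neighbor of the new entity proposes an estimate `t - r` (from the TransE assumption `h + r ≈ t`), and the estimates are combined with softmax weights.

```python
import math

# Hypothetical toy triples (head, relation, tail); not data from the thesis.
triples = [
    ("BLAST", "belongs_to_field", "sequence_alignment"),
    ("BLAST", "described_in", "paper_1"),
    ("new_tool", "belongs_to_field", "sequence_alignment"),
    ("new_tool", "described_in", "paper_1"),
]

# Assumed stand-ins for embeddings obtained by TransE pre-training.
entity_emb = {
    "sequence_alignment": [1.0, 0.0],
    "paper_1": [0.0, 1.0],
    "BLAST": [0.5, 0.5],
}
relation_emb = {
    "belongs_to_field": [0.5, -0.5],
    "described_in": [-0.5, 0.5],
}


def transe_score(h, r, t):
    """TransE distance ||h + r - t||_2; a lower value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def embed_new_entity(entity, triples, entity_emb, relation_emb):
    """Attention-weighted average of per-neighbor estimates t - r.

    The attention logit (dot product of each estimate with the mean estimate)
    is an assumption made for illustration only.
    """
    estimates = []
    for h, r, t in triples:
        if h == entity and t in entity_emb and r in relation_emb:
            estimates.append(
                [ti - ri for ti, ri in zip(entity_emb[t], relation_emb[r])]
            )
    if not estimates:
        raise ValueError(f"no known neighbors for {entity!r}")
    dim = len(estimates[0])
    mean = [sum(e[i] for e in estimates) / len(estimates) for i in range(dim)]
    logits = [sum(a * b for a, b in zip(e, mean)) for e in estimates]
    weights = softmax(logits)
    return [sum(w * e[i] for w, e in zip(weights, estimates)) for i in range(dim)]


new_emb = embed_new_entity("new_tool", triples, entity_emb, relation_emb)
score = transe_score(
    new_emb, relation_emb["belongs_to_field"], entity_emb["sequence_alignment"]
)
print(new_emb, score)
```

The embedding of `new_tool` is computed only from its neighbors' existing vectors, so the pre-trained embeddings of the rest of the graph are left untouched. A learned attention network would replace the hand-written logits in a real system.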
Keywords/Search Tags:Bioinformatics, Knowledge Graph, Knowledge Graph Embedding, Graph Neural Network, Attention Mechanism