Research on a Multi-Keyword Association Search Algorithm Based on Biological Networks

Posted on: 2018-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhu
Full Text: PDF
GTID: 2348330533969239
Subject: Computer Science and Technology
Abstract/Summary:
As biomedical data grows, searches over large biomedical datasets carry semantic requirements that vary with the biological problem at hand. General-purpose search engines such as Google and Baidu cannot make effective use of biological semantic associations, so they can neither find the biomedical data resources and information that match the semantic characteristics of a user's data needs nor provide efficient biomedical data search. The National 863 project "the Key Technology Research and Development of Biological Data Expression Index, Search and Storage Access" proposes a solution to this issue. The goal of the project is to develop a biomedical association search engine that uses biological semantics over integrated large-scale biomedical data drawn from multiple data sources. Research on multi-keyword association search algorithms for large-scale biological networks is an indispensable part of building such an engine.

In recent years, algorithms have been proposed for related problems such as keyword search over RDF data on a graph model, graph-based keyword matching, and keyword search on graphs. These algorithms are not well suited to biological networks: the query results lack biological semantic information to some extent, and the search does not take into account the implicit structural information of the network. In addition, as the biological network and the set of query keywords grow larger, the time efficiency of these algorithms becomes a bottleneck.

To solve these problems, this thesis proposes, designs and implements a hierarchical Steiner tree algorithm for the multi-keyword association search problem on biological networks, which finds the optimal subnetwork for a multi-keyword association search. Stated as a mathematical problem, multi-keyword association search on a network is similar to the Steiner tree problem. However, the Steiner tree problem is NP-hard, and both the biological networks and the keyword sets the algorithm must handle are large, so time efficiency is a concern. We therefore apply a hierarchical clustering algorithm to the biological network before solving the Steiner tree problem, which bounds both the size of the network being searched and the number of terminal nodes passed to the Steiner tree algorithm.

Our approach consists of six parts: hierarchical cluster analysis of the biological network, segmentation of the hierarchical clustering tree, hierarchical hypergraph construction, node importance calculation, the Dijkstra-Steiner algorithm, and hierarchical Steiner tree reconstruction. Experimental results show that the method finds subnetworks similar to those found by the Dijkstra-Steiner algorithm, while emphasizing the central association nodes of the biological network and substantially improving time efficiency.
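
To make the reduction to the Steiner tree problem concrete, the following minimal Python sketch connects a set of keyword (terminal) nodes in a small weighted network. It is not the thesis's hierarchical method or the Dijkstra-Steiner algorithm. It uses a simple shortest-path heuristic that repeatedly attaches the terminal closest to the partial tree, and it omits the clustering, hypergraph and node-importance steps entirely. The network `GRAPH`, its gene names, its edge weights and both function names are hypothetical and chosen only for illustration.

```python
import heapq

# Hypothetical toy network: node -> {neighbour: edge weight}. Weights are invented.
GRAPH = {
    "TP53": {"MDM2": 1, "ATM": 2},
    "MDM2": {"TP53": 1, "AKT1": 2},
    "ATM": {"TP53": 2, "BRCA1": 1},
    "BRCA1": {"ATM": 1, "AKT1": 5},
    "AKT1": {"MDM2": 2, "BRCA1": 5},
}


def nearest_terminal_path(graph, tree_nodes, targets):
    """Multi-source Dijkstra from the current tree to the closest target node.

    Returns (cost, path), where path starts at a tree node and ends at a target.
    Assumes non-negative edge weights.
    """
    dist = {v: 0.0 for v in tree_nodes}
    prev = {}
    heap = [(0.0, v) for v in sorted(tree_nodes)]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u in targets:
            path = [u]
            while path[-1] in prev:  # walk back until a tree node is reached
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    raise ValueError("some keyword nodes are not connected to the others")


def steiner_tree_heuristic(graph, terminals):
    """Approximate a Steiner tree by attaching the nearest terminal each round."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:]) - tree_nodes
    total = 0.0
    while remaining:
        cost, path = nearest_terminal_path(graph, tree_nodes, remaining)
        for a, b in zip(path, path[1:]):
            tree_edges.add(frozenset((a, b)))
        tree_nodes.update(path)
        remaining -= tree_nodes
        total += cost
    return tree_edges, total
```

Calling `steiner_tree_heuristic(GRAPH, ["MDM2", "BRCA1", "AKT1"])` returns a tree that also passes through the non-keyword nodes TP53 and ATM, which illustrates how a Steiner tree can surface connecting nodes that were not part of the query. The heuristic gives no optimality guarantee, and each round runs a full Dijkstra search, which is one reason bounding the network size and the number of terminals matters on large networks.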
Keywords/Search Tags: Steiner tree, hierarchical cluster, biological network, terminal nodes, hypergraph