Construction Of Workflow-based Omics Data Annotation Platform

Posted on: 2012-07-31; Degree: Master; Type: Thesis
Country: China; Candidate: B Min
GTID: 2178330335459130; Subject: Computer application technology
Abstract/Summary:
With the implementation of the Human Genome Project and the rapid development of high-throughput sequencing and biochip technology, a large amount of biological data has been produced in the life sciences. These data do not become useful information until they are analyzed and annotated. To store and manage these data resources efficiently, many bioinformatics databases have been developed. Using these data to analyze and annotate new biological data, and then to predict their biological significance, has become an important area of bioinformatics research. However, no single database can cover all kinds of information. To obtain more detailed information, a feasible approach is to integrate multiple diverse data sources and tools to complete the annotation work. Most current bioinformatics databases and software are highly heterogeneous, which makes integration very difficult. In recent years, more and more bioinformatics database systems have provided web service resources that can be combined into complex applications. Web services are based on standard HTTP and XML; they can solve the heterogeneity problem and facilitate building a distributed, integrated environment. It is therefore urgent and important to develop a bioinformatics platform that integrates data, web services, and software algorithm resources so that they can work together to complete a task. Workflow technology has become a general mechanism for solving this kind of problem.
Based on an analysis of current bioinformatics platforms, this paper designs and implements an omics data annotation workflow platform built on Gene Ontology (GO) and web services. For a specific biological question, the platform can integrate the principal information of bioinformatics databases together with their web services.
First, we design and build a local GO annotation database as the basic annotation data source of the system. Because GO is associated with external databases through mapping files, further information can easily be obtained from those databases via web services. Second, after analyzing the web service client software used in the bioinformatics community, we develop a more general web service management client based on an open-source technology framework from the commercial community. Third, to meet the needs of the platform, we also design and develop a lightweight workflow engine for bioinformatics computing and a user-friendly workflow designer. Finally, we carry out some simple annotation tasks as a functional test of the platform.
This paper makes several contributions. First, web services are hard to bind and invoke dynamically because complex-type objects are difficult to create dynamically; the platform solves this problem by assembling the SOAP message directly. Second, the platform applies a workflow mechanism to annotate biological data, which obtains as much diverse information as possible without constructing many local databases. Finally, we develop a data-driven, lightweight bioinformatics workflow engine; its data-oriented design facilitates managing and tracking data.
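The idea of assembling the SOAP message directly, instead of generating typed stub objects for each complex type, can be illustrated with a minimal sketch. The abstract does not give the platform's implementation or language, so everything below is an assumption for illustration: the helper `build_soap_request`, the operation name `getAnnotation`, the namespace `http://example.org/annotation`, and the parameter names are all hypothetical. Only the SOAP 1.1 envelope namespace is standard. Nested dictionaries stand in for complex-type parameters and are serialized straight into XML elements.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def _append_value(parent, name, value, ns):
    """Append `value` under `parent` as element `name`.

    Dicts become nested elements (a stand-in for complex types),
    lists become repeated elements, everything else becomes text.
    """
    if isinstance(value, (list, tuple)):
        for item in value:
            _append_value(parent, name, item, ns)
        return
    child = ET.SubElement(parent, f"{{{ns}}}{name}")
    if isinstance(value, dict):
        for key, sub_value in value.items():
            _append_value(child, key, sub_value, ns)
    elif value is not None:
        child.text = str(value)


def build_soap_request(operation, service_ns, params):
    """Assemble a SOAP 1.1 request envelope as a string (hypothetical helper)."""
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    op = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for key, value in params.items():
        _append_value(op, key, value, service_ns)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameters; no real service is contacted.
message = build_soap_request(
    "getAnnotation",
    "http://example.org/annotation",
    {"query": {"goId": "GO:0008150", "databases": ["UniProt", "KEGG"]}},
)
print(message)
```

Because the message is built from plain data at run time, a client written this way would not need a compiled class for each service's complex types; the resulting string could then be sent as the body of an HTTP POST to the service endpoint.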
Keywords/Search Tags: Bioinformatics, Gene ontology, Web service, Workflow, Gene annotation