Research On The Key Technology Of Metadata-based Integration For Proteomics Data Resources And The Development Of The Application Platform

Posted on: 2009-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Lin
GTID: 2178360278456786
Subject: Computer Science and Technology
Abstract/Summary:
The 21st century is the age of life science and informatics. With the rapid progress of biotechnology and the Human Genome Project, biological data has grown exponentially. How to combine, integrate, share, and analyze this massive volume of biological data is a question of both theoretical and practical significance. However, because biological data is inherently complex and heterogeneous in storage, structure, semantics, and access, traditional bioinformatics integration faces a number of problems and challenges.
This thesis first analyzes current integration approaches for biological data and their disadvantages. It then describes metadata and bio-ontologies and how they can support the integration of biological data, and establishes the common metamodel and ontology base of the integration platform. Furthermore, by combining the traditional mediator-based integration approach with ontological knowledge, it presents MOBIB, a metadata-based framework for distributed environments. Next, RSchemaETS, a tool for automatic metadata extraction, transformation, and loading, is discussed and designed. Because of the high cohesion and low coupling between its modules, RSchemaETS is extensible and reusable, and can be ported to other DBMSs to extract, transform, and load database schema metadata automatically. The JavaCC-based implementation simplifies development: the programmer only needs to be concerned with the metadata in the BNF grammar of SQL and its processing logic. When the SQL syntax changes, only a few modifications are needed, which preserves the backward compatibility of the tool. Finally, drawing on the characteristics of ontology query languages, we define a query language for the MOBIB platform that supports most of the queries biologists need, and analyze the processing flow and the structure of the query interpreter and executor.
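As a rough illustration of what extracting schema metadata from SQL involves, the sketch below pulls table and column metadata out of a simple CREATE TABLE statement. It is not RSchemaETS: the thesis describes a grammar-driven parser generated with JavaCC, whereas this sketch substitutes a small regular expression and a hand-written splitter, and it is written in Python for brevity. The names `Column`, `Table`, `split_top_level`, and `extract_table` are hypothetical, and only a narrow subset of DDL is handled.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    sql_type: str
    nullable: bool = True


@dataclass
class Table:
    name: str
    columns: list = field(default_factory=list)


# Matches only a plain "CREATE TABLE name ( ... );" statement (assumption:
# no schema qualifiers, quoted identifiers, or trailing table options).
CREATE_RE = re.compile(
    r"CREATE\s+TABLE\s+(\w+)\s*\((.*)\)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)

TABLE_CONSTRAINTS = {"PRIMARY", "FOREIGN", "UNIQUE", "CONSTRAINT", "CHECK"}


def split_top_level(body):
    """Split a column list on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def extract_table(ddl):
    """Return a Table describing the columns declared in a CREATE TABLE."""
    match = CREATE_RE.match(ddl.strip())
    if match is None:
        raise ValueError("not a simple CREATE TABLE statement")
    table = Table(name=match.group(1))
    for definition in split_top_level(match.group(2)):
        tokens = definition.split()
        if len(tokens) < 2 or tokens[0].upper() in TABLE_CONSTRAINTS:
            continue  # table-level constraints are ignored in this sketch
        nullable = "NOT NULL" not in definition.upper()
        table.columns.append(Column(tokens[0], tokens[1], nullable))
    return table
```

Calling `extract_table` on a statement such as `CREATE TABLE protein (accession VARCHAR(20) NOT NULL, mass DECIMAL(10,2), PRIMARY KEY (accession));` yields a `Table` named `protein` with two columns. A grammar-based parser, as in the thesis, would replace the regular expression with productions of the SQL grammar, which is what keeps changes local when the SQL syntax evolves.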
The MOBIB framework can answer user queries against distributed, semantically heterogeneous data sources without the need for a centralized data warehouse or a common global ontology. Structural metadata and semantic metadata are used in every step of a multi-source query in our framework in order to resolve heterogeneity of structure, semantics, and terminology. The proposed method is expected to be applicable to several biological fields by introducing the corresponding ontologies. At present, it has been applied to the integration of human liver proteome data, using the Gene Ontology to resolve semantic heterogeneity.
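The idea of using structural and semantic metadata at query time can be sketched as follows. This is a toy under stated assumptions, not the MOBIB implementation: the hierarchy, the two in-memory sources, the record identifiers, and the names `IS_A`, `MAPPINGS`, `expand_term`, and `query` are all invented for illustration, and for brevity the sketch uses a single small term hierarchy, which is a simplification of the framework described above.

```python
# Toy is_a hierarchy: child term -> parent term (illustrative, not real GO data).
IS_A = {
    "lipid metabolic process": "metabolic process",
    "fatty acid metabolic process": "lipid metabolic process",
    "protein folding": "cellular process",
}


def expand_term(term, is_a=IS_A):
    """Return the term plus every descendant reachable through is_a links."""
    expanded = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in is_a.items():
            if parent in expanded and child not in expanded:
                expanded.add(child)
                changed = True
    return expanded


# Two hypothetical sources with different schemas and local vocabularies.
SOURCE_A = [
    {"acc": "P1", "function": "fatty acid metabolic process"},
    {"acc": "P2", "function": "protein folding"},
]
SOURCE_B = [
    {"protein_id": "Q9", "annotation": "lipid metabolism"},
]

# Structural metadata: which field holds the identifier and the annotation.
# Semantic metadata: how local vocabulary maps onto ontology terms.
MAPPINGS = {
    "A": {"rows": SOURCE_A, "id": "acc", "field": "function", "terms": {}},
    "B": {
        "rows": SOURCE_B,
        "id": "protein_id",
        "field": "annotation",
        "terms": {"lipid metabolism": "lipid metabolic process"},
    },
}


def query(term):
    """Expand the term, then query each source in place using its mappings."""
    wanted = expand_term(term)
    hits = []
    for name, src in sorted(MAPPINGS.items()):
        for row in src["rows"]:
            local = row[src["field"]]
            canonical = src["terms"].get(local, local)
            if canonical in wanted:
                hits.append((name, row[src["id"]]))
    return hits
```

Here `query("metabolic process")` matches one record in each source, even though neither source stores that exact term: the expansion step supplies the narrower terms, and the per-source mappings translate field names and local vocabulary. The records stay in their own sources throughout, which is the point of a mediator-style design over a warehouse.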
Keywords/Search Tags: Metadata, CWM, Bio-ontology, Bioinformatics Data Integration, Metadata Automatic Extraction and Transformation, Query Interpreter, Query Expander