
Analysis And Integration Of Genome Browser Underlying Data

Posted on: 2014-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Wang
Full Text: PDF
GTID: 2250330422950615
Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of sequencing techniques, and driven by the 1000 Genomes Project, massive amounts of personal genome data have been released, and the results of genome-wide association studies are constantly being published. Genome browsers, which combine various bioinformatics databases, have become a widely used and efficient tool for analyzing genome data. An important research topic is how to analyze, from multiple angles, the impact of genomic data on the individual, especially on individual disease. To fill the gap in integration between most genome browsers and disease/drug-related databases, in this paper we formulated five standards for filtering disease/drug-related databases and integrated the selected databases by converting them into the GDF format, which was created by extending the GFF3 format. We evaluated the credibility of entries by weighted scoring and by information content, and compared the results of the two methods. The data were loaded into a genome browser based on the B/S (browser/server) structure, using an approach similar to a data warehouse. We implemented different FileReader components to handle different file types and used XML as the data transmission format. Partial testing of the system showed that the time needed to load a file was short.

The major results include: (i) summarizing the underlying data types of existing genome browsers and enriching the displayed content by integrating disease/drug-related data, forming a more comprehensive knowledge system that gives researchers a full and convenient reference; (ii) creating the GDF format for storing disease/drug-related data, which is useful for integrating data across different databases; (iii) using the weighted scoring and information content methods to assess the credibility of the integrated data, to ensure the accuracy of the data loaded into the genome browser; (iv) improving the data warehouse approach to cope with the diversity of genomic data formats, using different FileReader components as the interface to improve file processing speed.
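The abstract does not give the GDF field list or the scoring details, so the following is only a minimal sketch of the general ideas, under stated assumptions. It parses one standard nine-column GFF3 feature line, reads two hypothetical extension attributes (`disease` and `drug`, lowercase because GFF3 reserves capitalized attribute names), and computes two illustrative credibility measures: a weighted score over hypothetical evidence flags and the usual information content, -log2(p). The example record, the attribute names, the evidence flags, and the weights are all invented for illustration and are not taken from the thesis.

```python
import math
from urllib.parse import unquote


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict with an attribute map."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        # GFF3 escapes reserved characters with percent-encoding.
        attributes[unquote(key)] = unquote(value)
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


# Hypothetical evidence flags and weights (assumption, not from the thesis).
WEIGHTS = {"curated": 0.5, "replicated": 0.3, "has_pubmed": 0.2}


def weighted_score(evidence):
    """Sum the weights of the evidence flags that are set for an entry."""
    return sum(w for name, w in WEIGHTS.items() if evidence.get(name))


def information_content(count, total):
    """Information content -log2(p), with p = count / total."""
    if not 0 < count <= total:
        raise ValueError("count must be in (0, total]")
    return -math.log2(count / total)


# Made-up record with hypothetical 'disease' and 'drug' attributes.
EXAMPLE_LINE = (
    "chr1\tExampleDB\tSNV\t1000\t1000\t.\t+\t.\t"
    "ID=var1;disease=type%202%20diabetes;drug=metformin"
)

if __name__ == "__main__":
    record = parse_gff3_line(EXAMPLE_LINE)
    print(record["attributes"])
    print(weighted_score({"curated": True, "has_pubmed": True}))
    print(information_content(1, 8))
```

In this sketch a rarer annotation gets a higher information content, while the weighted score rewards entries backed by more kinds of evidence; comparing the two rankings over the same entries is one way to read the comparison the abstract mentions.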
Keywords/Search Tags: Genome Browser, Integration of Database, Bioinformatics, Weighted Scoring, Information Content