
The Research And Application Of The Semantic Similarity Measures Between Genes Based On Gene Ontology

Posted on: 2011-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: A E Lian
Full Text: PDF
GTID: 2178360302992215
Subject: Computer application technology
Abstract/Summary:
In the genomic era, exploring the functional relationships among genes remains one of the main challenges we face. High-throughput microarray technology appears to fill this gap: massive gene expression data provide unique opportunities to analyze the functional and regulatory relationships among genes. It used to be widely believed that genes with similar expression profiles have similar functions in cells, so that the functions of unknown genes could be predicted from their expression similarity to known genes. Biologists now find, however, that genes with similar functions may not have been exposed to sufficient perturbations for their expression similarities to be revealed. Researchers interested in functionally related genes therefore hope to improve the accuracy of their results beyond the limits of currently available expression data. The emergence of the Gene Ontology (GO) makes this possible: using GO annotations to define semantic similarity between genes, and then to determine their functional relevance, is becoming increasingly common.

This thesis comprehensively reviews the state of research, in China and abroad, on gene semantic similarity measures based on GO annotations. It describes in detail the typical models of four general approaches (set-based, graph-based, vector-based and term-based) together with their advantages and disadvantages. Because the thesis focuses on term-based methods, the term-based models are introduced most fully, e.g. Resnik's, Lin's, Jiang and Conrath's, Combine and Wang's methods.

To address the drawbacks of existing methods (especially Wang's), we devised a new algorithm to determine the semantic type values for "is_a" and "part_of" and, in turn, proposed a new measure of semantic similarity between terms based on the full semantic paths in the Gene Ontology, which takes into account the semantic impact of all paths from a term to the root.
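For orientation, the following is a minimal sketch of a Wang-style term similarity, the baseline the abstract singles out. It is not the thesis's own full-path measure, whose formulas the abstract does not give. The toy ontology in `PARENTS` is hypothetical, and the edge weights 0.8 for "is_a" and 0.6 for "part_of" are the values commonly quoted for Wang's method, treated here as an assumption. Note that each ancestor keeps only the contribution of its single best path to the term; this is the restriction that a full-path approach would relax.

```python
# Hypothetical toy ontology: child -> list of (parent, relation type).
PARENTS = {
    "root": [],
    "A": [("root", "is_a")],
    "B": [("root", "is_a")],
    "C": [("A", "is_a"), ("B", "part_of")],
    "D": [("A", "part_of")],
}

# Assumed semantic contribution factors per relation type.
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}


def s_values(term):
    """Map `term` and each of its ancestors to its semantic contribution.

    The term itself contributes 1.0. An ancestor receives the largest
    product of edge weights over all paths leading up from the term.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in PARENTS[child]:
            value = s[child] * WEIGHTS[relation]
            # Keep only the best path; revisit the parent if it improved.
            if value > s.get(parent, 0.0):
                s[parent] = value
                stack.append(parent)
    return s


def wang_similarity(term_a, term_b):
    """Shared semantic contribution divided by the total semantic value."""
    sa, sb = s_values(term_a), s_values(term_b)
    common = set(sa) & set(sb)
    shared = sum(sa[t] + sb[t] for t in common)
    return shared / (sum(sa.values()) + sum(sb.values()))


if __name__ == "__main__":
    print(wang_similarity("C", "D"))
```

In this sketch, term "C" reaches the root by two paths (through "A" and through "B"), but only the stronger one counts toward the root's contribution.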
Finally, we verified our measure from three different perspectives. The results indicate that our measure better describes the information contained in the annotations associated with gene products and is therefore better suited to characterizing and classifying gene products through their annotations.

The study of gene regulatory networks is a hot topic in functional genomics; such networks reveal complicated life phenomena through the regulatory mechanisms among genes. In studying gene regulatory networks, the thesis used GO annotation information instead of gene expression profiles: it applied the FPS measure we proposed to quantify the functional relationships among genes and then combined it with the MCP (Maximum Clique Problem) from graph theory to construct gene regulatory networks. The results showed that our predictions have a certain degree of reliability, so our method can provide useful reference information for the analysis and study of gene regulatory networks.

Measuring gene semantic similarity is very difficult but important. With the rapid development of computer technology and the continuing growth of GO annotations, breakthrough progress is expected in measuring the semantic similarity between genes. Investigating the functional similarity between genes, exploring gene co-regulation, and predicting the functions of unknown genes from their GO-based semantic similarity to known genes not only avoids the difficulty of collecting massive gene expression data; accurate semantic similarity measures can also greatly improve the efficiency of gene function research, which offers reference and guidance for biologists studying gene function and related topics.
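The abstract does not say how the similarity scores and the Maximum Clique Problem are combined, so the following is only one plausible reading, sketched under stated assumptions: genes whose pairwise similarity reaches a threshold are linked, and the largest fully connected group is taken as a candidate set of functionally related genes. The gene names, the scores in `SIMILARITY`, and the threshold of 0.7 are all hypothetical, and the search is a plain Bron-Kerbosch enumeration rather than whatever solver the thesis used.

```python
# Hypothetical pairwise gene similarity scores (e.g. from a GO-based measure).
SIMILARITY = {
    ("g1", "g2"): 0.90,
    ("g1", "g3"): 0.85,
    ("g2", "g3"): 0.80,
    ("g3", "g4"): 0.75,
    ("g1", "g4"): 0.20,
    ("g2", "g4"): 0.30,
}


def build_graph(similarity, threshold):
    """Link two genes when their similarity is at least `threshold`."""
    graph = {}
    for (gene_a, gene_b), score in similarity.items():
        graph.setdefault(gene_a, set())
        graph.setdefault(gene_b, set())
        if score >= threshold:
            graph[gene_a].add(gene_b)
            graph[gene_b].add(gene_a)
    return graph


def maximum_clique(graph):
    """Return a largest clique using Bron-Kerbosch without pivoting."""
    best = set()

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            # `clique` is maximal; keep it if it is the largest so far.
            if len(clique) > len(best):
                best = set(clique)
            return
        for gene in list(candidates):
            expand(clique | {gene},
                   candidates & graph[gene],
                   excluded & graph[gene])
            candidates = candidates - {gene}
            excluded = excluded | {gene}

    expand(set(), set(graph), set())
    return best


if __name__ == "__main__":
    network = build_graph(SIMILARITY, threshold=0.7)
    print(sorted(maximum_clique(network)))
```

The exhaustive search is exponential in the worst case, so it suits only small illustrative graphs; the threshold directly controls how dense the graph is and therefore how large the reported clique can be.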
Keywords/Search Tags:GO, semantic similarity, term, biochemical pathways, semantic path