An Approach For Measuring Semantic Similarity Between GO Terms

Posted on: 2009-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: D Liu
Full Text: PDF
GTID: 2178360245454496
Subject: Circuits and Systems
Abstract/Summary:
The study of similarity mainly covers structural similarity and semantic similarity. Structural similarity has been studied widely in the past, whereas semantic similarity has attracted growing attention only in recent years.

For historical reasons, biological data sources are very complicated. To reduce or eliminate confusion between concepts and terms, the Gene Ontology Consortium developed a large semantic dictionary, GO (Gene Ontology). Similarity research plays an important role in many fields of study, and one important application of GO is measuring semantic similarity between GO terms. It is generally believed that if two gene products are similar, we would expect their gene expression to be similar and their annotations in GO to be similar as well. Thus, we may compare the functional similarity of two gene products against the similarity of their corresponding GO annotations. Measuring semantic similarity between GO terms is therefore an important approach to resolving the problem of semantic heterogeneity in biological data integration.

In this paper, we first present the background of GO and the state of research on semantic similarity. We then analyze several existing approaches for measuring semantic similarity between GO terms and propose a subgraph-based approach as an alternative to one of the most commonly used approaches. Next, we design an algorithm and test it on part of the GO graph. Finally, we summarize the approach and discuss its broader range of applications.

The new approach proposed in this paper combines information content-based and semantic distance-based methods, which makes the semantic similarity measure between GO terms more accurate. If this approach can be applied to the GO database, it promises more accurate searches for similar or related proteins and will lay a good foundation for related research and applications in bioinformatics.
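
For orientation, the two families of measures named above can be sketched on a toy graph. This is a minimal illustration under stated assumptions, not the algorithm of this thesis: the term names, the annotation counts, and the choice of the well-known Resnik and Lin information-content measures and a plain edge-count distance are all assumptions made for the example. How the thesis combines the two families is not specified in the abstract, so no combination is shown.

```python
import math
from collections import deque

# Hypothetical toy ontology (child -> parents); these are not real GO identifiers.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
}

# Hypothetical counts of gene products annotated directly to each term.
# Simplification: each gene product is assumed to be annotated to exactly one term.
DIRECT_COUNTS = {"root": 0, "A": 1, "B": 2, "C": 3, "D": 2}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    queue = deque([term])
    while queue:
        for parent in PARENTS[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def information_content(term):
    """IC(t) = log(N / n_t), with n_t the annotations to t or any descendant."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(c for t, c in DIRECT_COUNTS.items() if term in ancestors(t))
    return math.log(total / covered)


def resnik(t1, t2):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)


def lin(t1, t2):
    """Resnik similarity normalized by the terms' own information content."""
    denom = information_content(t1) + information_content(t2)
    return 1.0 if denom == 0 else 2 * resnik(t1, t2) / denom


def edge_distance(t1, t2):
    """Shortest path length between two terms, ignoring edge direction."""
    neighbors = {t: set(ps) for t, ps in PARENTS.items()}
    for child, ps in PARENTS.items():
        for parent in ps:
            neighbors[parent].add(child)
    dist = {t1: 0}
    queue = deque([t1])
    while queue:
        current = queue.popleft()
        if current == t2:
            return dist[current]
        for nxt in neighbors[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return None  # the two terms are not connected


if __name__ == "__main__":
    print("Resnik(C, D):", resnik("C", "D"))
    print("Lin(C, D):", lin("C", "D"))
    print("Edge distance(C, D):", edge_distance("C", "D"))
```

In this toy graph, terms C and D share the ancestor A, so the information-content measures depend on how rarely A is annotated, while the distance measure depends only on the number of edges between C and D.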
Keywords/Search Tags: GO, semantic similarity, information content, semantic distance