
Study On Computation Method Of Genes Semantic Similarity And Its Application

Posted on: 2015-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: B H Wu
GTID: 2298330422471665
Subject: Computer technology
Abstract/Summary:
With the continuing development of biotechnology, and in particular the maturing of DNA microarray technology, large volumes of gene expression data are being generated, and predicting gene function by mining these data has become a hot topic. It is generally assumed that genes with similar expression profiles tend to have similar functions. However, numerous studies have found that genes with similar functions do not necessarily share similar expression profiles; conversely, genes with different functions may have similar expression profiles. For this reason, many other methods have been proposed to predict gene function. Among them, using the Gene Ontology (GO), a project initiated by the Gene Ontology Consortium, to predict and analyze gene function has been widely accepted. GO serves as a semantic dictionary that describes and qualifies gene and protein functions across a variety of biological species. A gene or gene product is usually annotated with one or more GO terms. Predicting or measuring gene function by calculating the similarity of annotated genes has therefore become an important research direction.

This thesis discusses research on semantic similarity between gene terms and between genes, introduces a classification of the commonly used similarity measures, presents typical methods from each class, and describes their pros and cons in detail. The thesis focuses on term-based methods. The details are as follows:

1. Semantic similarity between gene terms
To address the drawbacks of existing methods, we introduce edge weighting, which takes into account the depth and density of terms in GO, into the calculation of semantic distance. Combining it with two further factors, semantic level and semantic density, the thesis proposes a new edge-weighting-based method for calculating the semantic similarity between gene terms.
Experiments and comparative analysis with other methods show that our method achieves higher precision.

2. Semantic similarity between genes
Building on the study of semantic similarity between gene terms, the thesis improves the gene semantic similarity method proposed by Wang et al. by introducing the concept of gene term concept weighting. A performance study on several sets of genes from the Saccharomyces Genome Database (SGD) demonstrates that our method is effective.

3. Disease gene prediction
Building on the study of semantic similarity between genes, the thesis improves the method for calculating the correlation between genes and diseases by introducing the concept of gene term concept contribution. We analyze the similarity between candidate genes and a disease, rank the candidate genes by score, and compare the ranking with other existing methods. The results show that our method has higher precision and can effectively help select candidate genes in chromosome regions of interest.
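
The three steps above (term similarity, gene similarity, candidate ranking) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy hierarchy and its term names, the form of `edge_weight`, the conversion from distance to similarity, and the best-match average used to combine term scores. The abstract does not state the thesis's formulas, and the concept-weighting and concept-contribution refinements are not reproduced here.

```python
# Toy is_a hierarchy (child -> parents). Term names are hypothetical, not real GO IDs.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
}


def depth(term):
    """Length of the shortest path from the root down to `term`."""
    parents = PARENTS[term]
    return 0 if not parents else 1 + min(depth(p) for p in parents)


def n_children(term):
    """Local density: the number of direct children of `term`."""
    return sum(term in parents for parents in PARENTS.values())


def edge_weight(child, parent):
    """Assumed weighting: edges that are deeper or sit in denser areas are shorter."""
    return 1.0 / (depth(child) * n_children(parent))


def ancestor_distances(term):
    """Smallest weighted distance from `term` to each of its ancestors (and itself)."""
    dist = {term: 0.0}
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in PARENTS[current]:
            d = dist[current] + edge_weight(current, parent)
            if d < dist.get(parent, float("inf")):
                dist[parent] = d
                stack.append(parent)
    return dist


def term_similarity(t1, t2):
    """Similarity from the shortest weighted path through a common ancestor."""
    d1, d2 = ancestor_distances(t1), ancestor_distances(t2)
    distance = min(d1[a] + d2[a] for a in d1.keys() & d2.keys())
    return 1.0 / (1.0 + distance)


def gene_similarity(terms1, terms2):
    """Best-match average over two annotation sets (a common aggregation)."""
    best1 = [max(term_similarity(a, b) for b in terms2) for a in terms1]
    best2 = [max(term_similarity(b, a) for a in terms1) for b in terms2]
    return (sum(best1) + sum(best2)) / (len(best1) + len(best2))


def rank_candidates(candidates, disease_terms):
    """Sort candidate genes by similarity of their annotations to a disease term set."""
    return sorted(
        candidates,
        key=lambda gene: gene_similarity(candidates[gene], disease_terms),
        reverse=True,
    )
```

On this toy hierarchy, the sibling terms `A1` and `A2` score higher than `A1` and `B1`, which meet only at the root, so a candidate gene annotated with `A1` ranks above one annotated with `B1` for a disease described by `A2`.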
Keywords/Search Tags: GO, term, semantic similarity, edge weighting, candidate genes