Predicting Arabidopsis Gene Function Using Co-expression and Shortest Paths

Posted on: 2012-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: F L Shi
GTID: 2190330335480285
Subject: Computer software and theory
Abstract/Summary:
At present, the common international method for predicting the functions of new genes is to align the sequences of genes with unknown functions against the genes with known functions in a sequence database, and to find the known genes whose sequences are highly similar to the unknown ones. The functions of the unknown genes are then predicted from those highly similar genes. However, this approach relies on manual operation and has low prediction accuracy. The advent of Gene Ontology, which provides a set of standards for gene functional annotation, reduced these disadvantages. Gene Ontology offers a shared semantic framework for the storage, retrieval and analysis of biological data, and so lays a foundation for interoperation and mutual understanding of content among different database systems. As a result, gene functional annotation has become a major research topic in genomics in recent years. Functional annotation of genes is of great practical significance for understanding human life, for analyzing and preventing disease, and for designing new drugs.

This paper first reviews current computational methods for gene functional annotation and briefly describes several methods in common use. It then describes the shortest path method in detail. Finally, to validate the method experimentally, it applies a maximum clique algorithm and K-means clustering to the same data and compares the three methods by measuring the semantic similarity of gene pairs. The results show that the shortest path method is more reliable.

Currently, the common approach to gene annotation is to cluster genes by the similarity of their expression profiles. These approaches use gene chip expression data to analyze gene function, and usually assume that genes with similar expression profiles have similar functions, so that the functions of unknown genes can be deduced from known genes with similar expression profiles. In fact, genes with similar expression profiles do not always have similar functions. To determine the functional relationships between genes, the authors propose an approach different from clustering.

The genes in the same metabolic pathway are used to construct an undirected weighted graph, and the shortest paths in that graph are then computed. Statistical analysis of genes with known functions showed that genes with similar functions have highly correlated expression. Therefore, for genes on the same shortest path, the function of an unknown gene can be predicted from the known genes if their expression is highly correlated. By analyzing Arabidopsis anther genes in the same metabolic pathway and referring to the existing Arabidopsis GO annotation, the authors predicted the functions of some of the unknown anther genes, and again verified that it is feasible to predict gene function by computing shortest paths in a metabolic pathway.

Arabidopsis is a good material for genetic studies and one of the popular research subjects in molecular biology and bioinformatics. At present, however, only 40% of the genes in the Arabidopsis Gene Ontology database are annotated. It is therefore important to improve the Arabidopsis Gene Ontology database by annotating more of the unknown genes. This paper provides a valuable tool for improving that database, and offers reference and guidance for biologists in further study of Arabidopsis genes and genetics.
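The shortest path procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not state how edges are weighted, so the weight `1 - |r|` (with `r` the Pearson correlation of two expression profiles) is an assumption, as are the correlation threshold of 0.9, the function names, the toy expression values, and the placeholder annotation labels `term_x` and `term_y`.

```python
import heapq
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def build_graph(expression, edges):
    """Undirected weighted graph over pathway genes.
    Assumed weight: 1 - |r|, so strongly co-expressed genes are close."""
    graph = {gene: {} for gene in expression}
    for a, b in edges:
        w = 1.0 - abs(pearson(expression[a], expression[b]))
        graph[a][b] = w
        graph[b][a] = w
    return graph

def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the list of genes or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def predict(path, expression, annotations, threshold=0.9):
    """For each unannotated gene on the path, collect the terms of
    annotated path genes whose |r| with it reaches the threshold."""
    result = {}
    for gene in path:
        if gene in annotations:
            continue
        terms = set()
        for other in path:
            if other in annotations:
                r = pearson(expression[gene], expression[other])
                if abs(r) >= threshold:
                    terms |= annotations[other]
        result[gene] = terms
    return result

# Toy data (hypothetical): four genes in one pathway, B is unannotated.
expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8.5],
    "C": [1, 2, 3, 5],
    "D": [4, 1, 3, 2],
}
edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
annotations = {"A": {"term_x"}, "C": {"term_x", "term_y"}, "D": {"term_z"}}

graph = build_graph(expression, edges)
path = shortest_path(graph, "A", "C")
predicted = predict(path, expression, annotations)
```

In this toy pathway the route through B is preferred over the route through D because B is strongly co-expressed with both A and C, and B inherits the placeholder terms of those two neighbours. A different weighting or threshold would change which paths and predictions come out.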
Keywords/Search Tags:shortest-path, pathway, co-expression, gene function, Arabidopsis thaliana