
Genomic Sequence Analysis Based On Statistical Features

Posted on: 2006-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: D Jiao
Full Text: PDF
GTID: 2120360212482873
Subject: Biomedical engineering
Abstract/Summary:
With the arrival of the post-genome era, researchers have begun to develop various tools on top of biological databases in order to analyze the huge amount of biological data and turn it into knowledge.

The main goal of sequence feature analysis is to compare sequences. However, the traditional alignment method, restricted by the algorithm itself, is unsatisfactory for processing long sequences because of its poor efficiency. We perform a statistical analysis of genomic sequences, which are the most abundant biological data in databases today and reflect the essence of evolution: we extract features from their local information and then carry out genomic sequence analysis on those features. In this way we not only solve the problem of large-scale computation on long sequences, but also obtain abundant data resources.

To realize fast and effective searching for similar sequences at the genome scale, we designed a genomic sequence database built on features. Programs compute the features of genomic sequences, and the feature values are stored in the database together with the sequences. This search scheme makes it possible to find, in a short time, sequences that are similar in function and structure and not merely in base arrangement.

With the help of our Genomic Sequence Feature Database, we mainly selected the Base-Base Correlation (BBC) feature to analyze genomic sequences, and found some interesting phenomena. DNA sequences within the same genome usually have similar features; the human genome has some relatives with similar features in the mouse and rice genomes; and there are some special segments inside the human genome with unusual feature values that make them closer to other species.

Horizontal Gene Transfer (HGT) can be regarded as one of the most important factors in evolution. We used the BBC feature to scan some prokaryotic genomes and found regions with atypical feature values, which may be regarded as potential horizontally transferred genes. Comparison with other methods for discovering horizontally transferred genes confirmed that the BBC feature can serve as a useful criterion for detecting HGT in bacterial genome sequences.

Searching for similar sequences at the genome scale and analyzing genomic sequences by their features help us study the relationships among species and perform evolutionary and phylogenetic analysis.
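
The feature-based scan described in the abstract can be illustrated with a minimal sketch. The abstract does not define the BBC feature, so the formula below is an assumption: a 16-dimensional vector with one entry per ordered base pair, each entry summing a mutual-information-style term over small gaps. The function names, the window size, the step, and the z-score cutoff are all hypothetical choices for illustration and are not taken from the thesis.

```python
import math
from itertools import product

BASES = "ACGT"
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]  # AA, AC, ..., TT


def bbc_like_feature(seq, max_gap=4):
    """Return a 16-dimensional correlation vector for a DNA string.

    ASSUMED definition (not given in the abstract): for each ordered pair
    (i, j), sum over gaps 1..max_gap of
        P_ij(gap) * log2(P_ij(gap) / (P_i * P_j)).
    """
    bases = [b for b in seq.upper() if b in BASES]  # drop N and other symbols
    n = len(bases)
    if n <= max_gap:
        return [0.0] * len(PAIRS)
    freq = {b: bases.count(b) / n for b in BASES}
    feature = dict.fromkeys(PAIRS, 0.0)
    for gap in range(1, max_gap + 1):
        counts = dict.fromkeys(PAIRS, 0)
        for k in range(n - gap):
            counts[bases[k] + bases[k + gap]] += 1
        total = n - gap
        for pair, c in counts.items():
            if c:  # unseen pairs contribute nothing
                p_ij = c / total
                feature[pair] += p_ij * math.log2(
                    p_ij / (freq[pair[0]] * freq[pair[1]])
                )
    return [feature[p] for p in PAIRS]


def scan_for_atypical_windows(genome, window=500, step=250, z_cutoff=2.0):
    """Flag windows whose feature vector lies far from the genome-wide mean.

    Hypothetical screening rule: Euclidean distance to the mean vector,
    converted to a z-score across all windows.
    """
    starts = list(range(0, len(genome) - window + 1, step))
    vectors = [bbc_like_feature(genome[s:s + window]) for s in starts]
    if len(vectors) < 2:
        return []
    mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    dists = [math.dist(v, mean) for v in vectors]
    mu = sum(dists) / len(dists)
    sd = math.sqrt(sum((d - mu) ** 2 for d in dists) / len(dists))
    if sd == 0:
        return []
    return [(s, s + window) for s, d in zip(starts, dists)
            if (d - mu) / sd > z_cutoff]
```

Calling `scan_for_atypical_windows(genome)` on a genome string returns `(start, end)` coordinates of candidate regions. Such regions are only candidates: an atypical composition can have causes other than horizontal transfer, which is why the abstract compares the results against other detection methods.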
Keywords/Search Tags: Genome, Sequence, Database, Feature Analysis, HGT