
Building A Local SNP Database And Improving Arithmetic For Aligning Sequence Pair

Posted on: 2006-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Xiang
GTID: 2168360155451130
Subject: Biomedical engineering
Abstract/Summary:
Bioinformatics is one of the most active areas in the life sciences. In recent years, the number and variety of bioinformatics databases have grown explosively, and they now cover almost all areas of the life sciences. The three best-known databases are GenBank, EMBL and DDBJ.

In the first part of this work, we describe the GenBank data structure in detail and analyze the data dictionaries of the SNP database and the relations among them. We then built a local secondary SNP database on the basis of the information and biological data provided by GenBank. All the data in the SNP database were taken from GenBank. The downloaded primary data were trimmed and loaded into the local database so that they are convenient to use in further research. The database runs on a PC under the Windows operating system and is operated through the widely used SQL Server 2000 and its query language. We also give part of the key code for building the database, the data dictionaries, the relations among the dictionaries, and updating the data. Finally, we query part of the data and give some notes on the results.

In handling and analyzing biological information, pairwise sequence alignment is one of the most important analysis methods in bioinformatics. In this work we improved the scoring system of the pairwise alignment algorithm. Instead of the fixed PAM250 scoring system, the alignment uses a dynamically varying evolutionary distance obtained from the alignment result itself. The result of each alignment is used to reconstruct a PAM-distance scoring system, and the two amino acid sequences are then realigned with the reconstructed scoring system, so that the alignment changes dynamically until the arrangement of the two sequences is the best one.

We implemented the algorithm in Borland C++ 6.0.
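The iterative idea described above (align, estimate the evolutionary distance from the alignment, rebuild the scoring system for that distance, realign) can be illustrated with a minimal Python sketch. This is not the thesis's Borland C++ code, and it rests on several assumptions: a uniform 20-state substitution model stands in for the Dayhoff PAM matrices, the gap penalty is linear, and all names and constants (`scores_for_distance`, `estimate_distance`, `iterative_align`, `GAP`, `MIN_DISTANCE`, `MAX_DISTANCE`) are hypothetical. The starting distance of 2.5 substitutions per site is chosen only to mimic a PAM250-like starting point.

```python
import math

GAP = -4.0            # assumed linear gap penalty (hypothetical value)
MIN_DISTANCE = 0.01   # lower clamp so the mismatch log-odds stays finite
MAX_DISTANCE = 5.0    # upper clamp for saturated alignments


def scores_for_distance(d):
    """Log-odds (bits) for a match and a mismatch at distance d
    (substitutions per site) under a uniform 20-state model."""
    decay = math.exp(-20.0 * d / 19.0)
    p_same = 1 / 20 + (19 / 20) * decay   # residue unchanged
    p_diff = 1 / 20 - (1 / 20) * decay    # one specific other residue
    return math.log2(20 * p_same), math.log2(20 * p_diff)


def global_align(a, b, match, mismatch, gap=GAP):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, preferring the diagonal.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def estimate_distance(aln_a, aln_b):
    """Distance from the fraction of differing, ungapped columns."""
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        return MAX_DISTANCE
    p = sum(x != y for x, y in pairs) / len(pairs)
    arg = 1.0 - 20.0 * p / 19.0
    if arg <= 0.0:                        # differences are saturated
        return MAX_DISTANCE
    d = -(19 / 20) * math.log(arg)
    return min(max(d, MIN_DISTANCE), MAX_DISTANCE)


def iterative_align(a, b, start_distance=2.5, max_rounds=10, tol=1e-3):
    """Realign until the estimated distance stops changing."""
    d = start_distance
    aln = (a, b)
    for _ in range(max_rounds):
        match, mismatch = scores_for_distance(d)
        aln = global_align(a, b, match, mismatch)
        new_d = estimate_distance(*aln)
        converged = abs(new_d - d) < tol
        d = new_d
        if converged:
            break
    return aln[0], aln[1], d


if __name__ == "__main__":
    print(iterative_align("ACDEFGHIKL", "ACDEFGHIKV"))
```

In this sketch only two scores change between rounds (match and mismatch), which keeps the feedback loop easy to follow. A version closer to the thesis would presumably rebuild a full 20 x 20 PAM matrix for the estimated distance at each round.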
The feasibility of the algorithm was verified with data from the Beta-3 adrenoreceptor gene family downloaded from GenBank at NCBI, and the results were compared with the output of the FASTA program. The results showed that an alignment computed with the dynamically calculated evolutionary distance of the two amino acid sequences reflects the relationship between the two sequences better than one computed with a fixed evolutionary distance. A key program segment is also provided in the thesis.

This thesis covers two main topics in bioinformatics: building the local SNP database and studying an algorithm for mining the information hidden in biological data. The first part comprises Chapters 1 to 3; the second part comprises Chapters 4 and 5. The first contribution is a local secondary SNP database built at low cost. The database is simple to operate and maintain, and is convenient for scientists and engineers who research SNPs and analyze biological data. The second contribution is a new way to dynamically calculate the evolutionary distance in order to improve the alignment of two sequences. The improved algorithm reflects the relationship between the two sequences better than the earlier one.
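As a rough illustration of the local secondary database idea (trim downloaded records to the fields of interest, then load them into a local relational store), here is a minimal Python sketch. It uses the standard-library `sqlite3` module purely as a stand-in for SQL Server 2000. The table name `snp`, its columns, the function `build_local_snp_db`, and the record layout are all hypothetical and are not taken from the thesis or from the GenBank data dictionaries.

```python
import sqlite3

# Hypothetical subset of fields kept after trimming a downloaded record.
KEPT_FIELDS = ("rs_id", "chromosome", "position", "alleles")


def build_local_snp_db(records, path=":memory:"):
    """Create (or open) a local SNP table and load trimmed records.

    `records` is an iterable of dicts; keys outside KEPT_FIELDS are dropped.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snp (
               rs_id      TEXT PRIMARY KEY,
               chromosome TEXT NOT NULL,
               position   INTEGER NOT NULL,
               alleles    TEXT NOT NULL
           )"""
    )
    trimmed = [tuple(rec[f] for f in KEPT_FIELDS) for rec in records]
    # INSERT OR REPLACE lets a later download update existing rows.
    conn.executemany("INSERT OR REPLACE INTO snp VALUES (?, ?, ?, ?)", trimmed)
    conn.commit()
    return conn


if __name__ == "__main__":
    # Placeholder record for demonstration only, not real SNP data.
    demo = [{"rs_id": "rs_demo_1", "chromosome": "1",
             "position": 100, "alleles": "A/G", "source_note": "dropped"}]
    conn = build_local_snp_db(demo)
    print(conn.execute("SELECT * FROM snp").fetchall())
```

Keying the table on the SNP identifier means that re-running the load with fresh downloads refreshes existing rows instead of duplicating them, which is one simple way to handle the data updates the abstract mentions.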
Keywords/Search Tags:bioinformatics, database, single nucleotide polymorphism, alignment, arithmetic, evolution distance