Building A Local SNP Database And Improving Arithmetic For Aligning Sequence Pair

Posted on:2006-02-17

Degree:Master

Type:Thesis

Country:China

Candidate:T Y Xiang

Full Text:PDF

GTID:2168360155451130

Subject:Biomedical engineering

Abstract/Summary:

Bioinformatics is one of the most active areas of the life sciences. In recent years, a wide variety of bioinformatics databases have appeared and grown explosively, and they now cover almost every area of the life sciences. The three best-known databases are GenBank, EMBL, and DDBJ.

First, we described the GenBank data structure in detail and analyzed the data dictionary of the SNP database and the relations among its data dictionaries. We then built a secondary, local SNP database from the information and biological data provided by GenBank; all data in the SNP database were taken from GenBank. The downloaded primary data were trimmed and loaded into the local database so that they can be used conveniently in further research. The database runs on a PC under the Windows operating system and is operated through the widely used SQL Server 2000 and its query language. We also gave part of the key code for building the database and its data dictionaries, defining the relations among the dictionaries, and updating the data. Finally, we queried part of the data and gave some notes.

Pairwise sequence alignment is one of the most important methods for handling and analyzing biological information. In this work we improved the scoring system of the pairwise alignment algorithm. Instead of the fixed PAM250 scoring system, the alignment uses an evolutionary distance that varies dynamically and is obtained from the alignment result itself. The distance computed from an alignment is used to reconstruct a PAM-distance scoring system, and the two amino acid sequences are then realigned with the reconstructed scoring system. This is repeated, so that the alignment changes dynamically, until the alignment of the two sequences is the best obtainable. The algorithms in this work were implemented in Borland C++ 6.0.
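The iterative rescoring described above can be illustrated with a minimal Python sketch. It is not the thesis's Borland C++ implementation. As a loudly stated assumption, it replaces the Dayhoff PAM matrices with a uniform 20-state substitution model, so that a match score and a mismatch score can be derived from a distance without tabulated mutation data. The gap penalty, the distance bounds, and all function names are hypothetical. The starting distance of 2.5 substitutions per site corresponds to 250 PAM.

```python
import math

STATES = 20        # size of the amino-acid alphabet
GAP = -4.0         # linear gap penalty (assumed value)
MIN_D, MAX_D = 0.01, 2.5   # distance bounds in substitutions per site (assumed)

def scores_for_distance(d):
    """Log-odds (match, mismatch) scores in bits at distance d.
    Assumption: uniform substitution model, not a Dayhoff PAM matrix."""
    d = max(d, MIN_D)  # avoid log2(0) at zero distance
    decay = math.exp(-STATES * d / (STATES - 1))
    match = math.log2(1 + (STATES - 1) * decay)   # P(same) / (1/20)
    mismatch = math.log2(1 - decay)               # P(one other) / (1/20)
    return match, mismatch

def global_align(a, b, match, mismatch, gap=GAP):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def estimate_distance(al_a, al_b):
    """Estimate the distance from the fraction of differing aligned residues."""
    pairs = [(x, y) for x, y in zip(al_a, al_b) if x != "-" and y != "-"]
    if not pairs:
        return MAX_D
    p_diff = sum(x != y for x, y in pairs) / len(pairs)
    limit = (STATES - 1) / STATES
    if p_diff >= limit:          # saturated: the correction is undefined
        return MAX_D
    d = -limit * math.log(1 - p_diff / limit)
    return min(max(d, MIN_D), MAX_D)

def iterative_align(a, b, start_d=2.5, max_rounds=10, tol=1e-3):
    """Align, re-estimate the distance, rebuild the scores, and repeat."""
    d = start_d
    for _ in range(max_rounds):
        match, mismatch = scores_for_distance(d)
        al_a, al_b = global_align(a, b, match, mismatch)
        new_d = estimate_distance(al_a, al_b)
        if abs(new_d - d) < tol:
            break
        d = new_d
    # new_d is the distance estimated from the returned alignment
    return al_a, al_b, new_d
```

Calling `iterative_align(seq1, seq2)` returns the two gapped sequences and the distance estimated from that alignment. The loop stops when the estimate changes by less than `tol` between rounds, which is one possible reading of "until the alignment is the best obtainable"; the thesis may use a different stopping rule.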
The feasibility of the algorithm was verified with data for the beta-3 adrenoreceptor gene family downloaded from NCBI GenBank, and the results were compared with the output of the FASTA program. The results showed that an alignment computed with the dynamically estimated evolutionary distance of the two amino acid sequences reflects the relationship between the sequences better than one computed with a fixed evolutionary distance. A key program segment is also provided in the thesis.

This thesis covers two main topics in bioinformatics: building the local SNP database, and studying algorithms for mining the information hidden in biological data. The first part comprises Chapters 1 to 3; the second part comprises Chapters 4 and 5. The first contribution is a local secondary SNP database built at low cost. It is simple to operate and maintain, which makes it convenient for scientists and engineers who study SNPs and analyze biological data. The second contribution is a new way of dynamically calculating the evolutionary distance in order to improve the alignment of two sequences. The improved algorithm reflects the relationship between the two sequences better than the original one.
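As an illustration of the first part, trimming downloaded records and loading them into a local secondary table might be sketched as follows. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module in place of SQL Server 2000, and the table name, column names, and kept fields are hypothetical, not the thesis's data dictionary.

```python
import sqlite3

# Hypothetical subset of fields kept from each downloaded record.
KEPT_FIELDS = ("snp_id", "chromosome", "position", "alleles")

def create_local_db(path=":memory:"):
    """Open the local database and create the (hypothetical) snp table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snp (
               snp_id     TEXT PRIMARY KEY,
               chromosome TEXT NOT NULL,
               position   INTEGER NOT NULL,
               alleles    TEXT NOT NULL
           )"""
    )
    return conn

def trim_record(raw):
    """Keep only the fields stored locally; every other field is dropped."""
    return tuple(raw[field] for field in KEPT_FIELDS)

def load_records(conn, raw_records):
    """Insert new records and overwrite existing ones with the same snp_id."""
    conn.executemany(
        "INSERT OR REPLACE INTO snp VALUES (?, ?, ?, ?)",
        [trim_record(raw) for raw in raw_records],
    )
    conn.commit()

def snps_in_region(conn, chromosome, start, end):
    """Return (snp_id, position, alleles) rows in a region, ordered by position."""
    cursor = conn.execute(
        "SELECT snp_id, position, alleles FROM snp "
        "WHERE chromosome = ? AND position BETWEEN ? AND ? "
        "ORDER BY position",
        (chromosome, start, end),
    )
    return cursor.fetchall()
```

Running `load_records` again with a newer download updates rows that share a primary key, which is one simple way to handle the data updates mentioned in the abstract.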

Keywords/Search Tags:

bioinformatics, database, single nucleotide polymorphism, alignment, arithmetic, evolution distance

Related items

1	The Development And Performance Optimization Of Human SNP Data Query Platform
2	Design And Implementation On Single Nucleotide Polymorphisms Identification Software
3	Studies On Single Nucleotide Polymorphism Biosensing Based On Excited-state Intramolecular Proton Transfer Probe
4	Research On Informative SNP Selection Method Based On Greed Algorithm
5	Research On Multiple Sequence Alignment Algorithms In Bioinformatics
6	Research On Tag SNP Selection Method Based On Bionic Algorithm
7	Feature Selection Algorithms For High-throughput Data
8	Algorithms For Haplotype Analysis
9	Exploring New Methods Based On Evolution Algorithm And Quantum Computing For Multiple Sequence Alignment
10	Rice Genome Polymorphism Database Creation And Its Auxiliary Systems Design