Protein Structure Classification Algorithms Based On Sequence Similarity

Posted on:2005-10-11

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y G Li

Full Text:PDF

GTID:1118360185995671

Subject:Computer system architecture

Abstract/Summary:


An important research topic in bioinformatics is understanding the meaning and function of each protein encoded in the genome. One of the most successful approaches to this problem is protein classification. A long-standing central question is how to improve computational efficiency and reduce memory requirements without substantially degrading the quality of the results. Focusing on this problem, we take algorithms and parallel computer architecture as the central topics of our research. The main contributions of this thesis are as follows.

1) Based on the support vector machine (SVM) algorithm, a piece sequence evolution distance kernel is proposed. Because each sequence is compared with the 'center' sequence of a family rather than with every sequence in it, a significant speedup is achieved. Meanwhile, because corresponding parts of the two sequences are compared rather than the two whole sequences, sensitivity is maintained. The results show that this method is slightly more accurate than the SVM-pairwise method, which is one of the most accurate methods. Moreover, it is significantly more efficient computationally: in experiments classifying 54 protein families, it was about 10 times faster than SVM-pairwise on average.

2) Focusing on the parallelism and locality of the CoSMPs (cluster of SMPs) architecture, the main factors that influence performance are analyzed, and the problems of how to parallelize and optimize applications are investigated. The merits and demerits of two programming models, the MPI model and the MPI + SMP directive model, are examined. Methods for improving performance and parallelizing algorithms on clusters of SMPs are then proposed.

3) A high-performance parallel Smith-Waterman algorithm for protein classification. This method reduces the space complexity from O(mn) to O(m) while nearly doubling the running time.

4) Using a divide-and-conquer strategy, a scalable parallel algorithm of...
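The abstract does not say how the thesis obtains linear space for Smith-Waterman. One common way to avoid storing the full m x n score matrix is to keep only the previous row while scanning, as in the minimal Python sketch below. The function name and the scoring values (match, mismatch, linear gap penalty) are illustrative assumptions, not taken from the thesis, and the sketch is serial rather than parallel.

```python
def sw_score_linear_space(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    Only two rows of the dynamic-programming matrix are alive at any time,
    so memory is proportional to the shorter sequence, not to len(a) * len(b).
    Scoring values are illustrative assumptions (linear gap penalty).
    """
    # Put the shorter sequence along the row so the stored row is as small as possible.
    if len(b) > len(a):
        a, b = b, a

    prev = [0] * (len(b) + 1)  # scores for the previous row
    best = 0                   # best local score seen so far

    for ca in a:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            cell = max(0, diag, up, left)
            curr.append(cell)
            if cell > best:
                best = cell
        prev = curr  # the older row is no longer needed

    return best


if __name__ == "__main__":
    print(sw_score_linear_space("XXACGTXX", "YYACGTYY"))
```

This version returns only the score. Recovering the alignment itself without the full matrix generally requires recomputing parts of the table, so the memory saving is paid for with additional running time.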

Keywords/Search Tags:

Sequence alignment, algorithm performance optimization, Support Vector Machine


Related items

1	Comparison Between Algorithms Of Sequence Alignment And Study In Application Of Machine Learning In Sequence Alignment
2	Some Algorithms Research On Support Vector Machines
3	Research On Structure Support Vector Machine Classification Models
4	Research On Some Problems Of Support Vector Machine Learning Algorithm
5	Research On Support Vector Machine Technology In Biologic Data Analyses
6	Nonlinear System Identification And Control Based On Support Vector Machine
7	Research On Self-tuning Fuzzy Support Vector Machine Algorithms For Shifting Class Center
8	Hybridized Optimization Algorithms Of Swarm Intelligence And Their Application
9	Research On Protein Multiple Sequence Alignment Algorithms And Assessment Of Their Performance
10	Multiuser Detection Algorithms, Power Control Algorithms And DOA Estimation Algorithms Based On SVM Methods