
The Research On Protein Sequence Feature Extraction And Its Application On Protein Function Prediction

Posted on: 2011-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: W F Jia
GTID: 2120360308469128
Subject: Computer Science and Technology
Abstract/Summary:
Feature extraction and classification are the core steps of protein function prediction. Identifying protein function helps clarify the mechanisms of life processes under physiological and pathological conditions, and it plays an important part in disease prevention and drug development. In the post-genomic era, with the continual development of bioinformatics and the accumulation of related data, using computational methods to predict protein function has become an important research topic. Consequently, feature extraction and classification have become major issues in modern bioinformatics.

This thesis focuses on feature extraction algorithms and classification algorithms for protein function prediction. It consists mainly of the following parts:

1. This thesis puts forward a new protein sequence feature extraction algorithm that departs from the existing BLAST-alignment-based feature extraction approach. From the alignment results of Bl2Seq, we obtain an E-value sequence and a Score sequence, which are used to evaluate the similarity of the aligned segments. We then develop a novel feature extraction algorithm based on the meaning of the E-value and the Score and on the composition of these two sequences. Compared with the existing BLAST-alignment-based approach, the new algorithm extracts more comprehensive and accurate sequence features and achieves higher prediction accuracy.

2. This thesis puts forward a new classification algorithm for protein function prediction. The idea is to improve the traditional K-nearest-neighbor algorithm: every neighbor is given a decision weight that is a function of the similarity between the neighbor sequence and the unknown sequence. The improved K-nearest-neighbor algorithm is then combined with the encoding based on grouped weight method to predict protein functions. The experimental results show that the new classifier has several advantages, including a simple model, low complexity, and high accuracy.
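The similarity-weighted voting idea behind the improved K-nearest-neighbor classifier can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give the weight function, so the sketch assumes the weight is simply the similarity itself, and the function name `weighted_knn_predict`, the `(similarity, label)` input format, and the example labels are all hypothetical.

```python
from collections import defaultdict


def weighted_knn_predict(neighbors, k=3, weight=lambda s: s):
    """Predict a label by similarity-weighted voting among the k nearest neighbors.

    neighbors: iterable of (similarity, label) pairs, one per training sequence,
               where a larger similarity means a closer neighbor.
    weight:    maps a similarity to a decision weight (assumed to be the identity
               here; the thesis's actual weight function is not in the abstract).
    """
    # Keep the k training sequences most similar to the unknown sequence.
    top_k = sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]
    if not top_k:
        raise ValueError("at least one neighbor is required")

    # Each neighbor votes for its own label with its decision weight.
    votes = defaultdict(float)
    for similarity, label in top_k:
        votes[label] += weight(similarity)

    # The label with the largest total weight wins.
    return max(votes, key=votes.get)


# Hypothetical similarities between an unknown sequence and four labeled ones.
example = [
    (0.90, "kinase"),
    (0.40, "transporter"),
    (0.35, "transporter"),
    (0.10, "kinase"),
]
prediction = weighted_knn_predict(example, k=3)
```

With these made-up numbers, the single very similar neighbor outweighs the two weaker ones, whereas an unweighted majority vote over the same three neighbors (passing `weight=lambda s: 1.0`) would pick the other label. That difference is the motivation for weighting neighbors by similarity.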
Keywords/Search Tags:Function prediction, Sequence alignment, Feature extraction, Analysis of similarity, Classification algorithm