Information Entropy Method In Classification Of Homooligomeric

Posted on: 2007-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Qu
GTID: 2120360212957688
Subject: Engineering Mechanics
Abstract/Summary:
With the Human Genome Project (HGP) entering the post-genome era, investigating and predicting the structure and function of proteins has become an important task. Protein structure and function can be determined experimentally, but experiments are time-consuming and often run into difficulties, so researchers have been seeking theoretical and computational methods to predict them. This thesis investigates the classification of homooligomeric proteins from primary structure. The main contents are as follows.

Existing methods for feature extraction and for classification of homooligomeric proteins are introduced. Pseudo amino acid composition and the FDOD method are applied to discriminate between homodimers and non-homodimers. Pseudo amino acid composition retains the main features of amino acid composition while also taking into account sequence-order correlations of different ranks, so it carries more information than the classic amino acid composition; it is therefore used as the feature extraction method here. FDOD is a function of degree of disagreement based on Shannon entropy, so it is intrinsically connected to K-L (Kullback-Leibler) entropy. This connection is studied and, based on K-L entropy, the method of measuring disagreement among several distributions is improved. FDOD is applied not by using the subsequence distribution but by augmenting the dimension of the distributions obtained from pseudo amino acid composition. The classification results are better than those of FDOD with a subsequence length of two.

A subset database is also established by random selection from the original database and used for classification. Comparing the two sets of results shows that database size has a great influence on the performance of the prediction system. The classification results are also influenced by the weight factor, for which there is an optimal value.
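
The link between FDOD and K-L entropy mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes a commonly cited form of the function of degree of disagreement, namely the sum of the K-L divergences from each distribution to the mean of all distributions; the abstract does not give the formula, so this form, the function names (`kl_divergence`, `fdod`, `classify`), and the toy three-component profiles are assumptions for illustration, not the thesis's implementation or data. In practice the vectors would be normalized feature vectors such as those from pseudo amino acid composition.

```python
import math


def kl_divergence(p, q):
    """K-L divergence D(p || q) in nats; terms with p_k == 0 contribute 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


def fdod(distributions):
    """Assumed FDOD form: sum of K-L divergences from each distribution
    to the component-wise mean of all distributions."""
    s = len(distributions)
    mean = [sum(column) / s for column in zip(*distributions)]
    return sum(kl_divergence(p, mean) for p in distributions)


def classify(query, class_profiles):
    """Assign the query to the class whose profile disagrees with it least."""
    return min(class_profiles, key=lambda name: fdod([query, class_profiles[name]]))


# Hypothetical toy profiles (not data from the thesis).
profiles = {
    "homodimer": [0.6, 0.3, 0.1],
    "non-homodimer": [0.2, 0.3, 0.5],
}
label = classify([0.5, 0.3, 0.2], profiles)
```

Identical distributions give a disagreement of zero, and the value grows as the distributions diverge, which is what makes it usable as a nearest-profile classification criterion in this sketch.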
Keywords/Search Tags: Bioinformatics, Homodimers, Non-homodimers, Pseudo amino acid composition, FDOD