
The Prediction Of Protein Structural Class And The Statistic Analysis Of The Correlations Between Different Amino Acids On Protein Sequence

Posted on: 2006-10-11 | Degree: Master | Type: Thesis
Country: China | Candidate: K Li | Full Text: PDF
GTID: 2120360155976531 | Subject: Biophysics
Abstract/Summary:
In the first part of the thesis, based on the concept that the structural class of a protein is mainly determined by its secondary structure sequence, the structural class of a protein is predicted using the least increment of diversity. The secondary structure numbers Nα, Nβ, Nβαβ and N(βαβ) are used as the state parameters. Three databases are used. Database I (part I) includes 2616 proteins, with homology not considered; the average rate of correct prediction and the average specificity exceed 94% and 91%, respectively. Database II (part II) includes 663 proteins selected from database I, with fewer than 4 proteins from any one family and an X-ray diffraction resolution better than 3.0 angstroms; the rates of correct prediction exceed 84%. Database III (part III) is selected in two steps: (1) the names of protein structural domains are obtained from ASTRAL 1.65; (2) using the SCOP structural classification of proteins, the 1753 domains with less than 40% sequence identity and an X-ray diffraction resolution better than 3.0 angstroms are selected. In addition, the sets of domains with less than 30% (1505 domains), 20% (1305 domains) and 10% (1218 domains) sequence identity are used. The overall rates of correct prediction for the 40%, 30%, 20% and 10% sets are 79%, 78%, 78% and 78%, respectively.

In the second part of the thesis, the correlations between the 20 amino acids along protein sequences are calculated using the database III set with less than 40% sequence identity (1753 domains). The results show that the correlations between the 20 amino acids differ from one protein structural class to another. Based on a principal component analysis of these correlations, the structural classes of the proteins are predicted using the least increment of diversity. The average rate of correct prediction is more than 66%.
Keywords/Search Tags:structural class, secondary structure sequence, measure of diversity, increment of diversity, correlation between amino acids, principal component analysis
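
The abstract names the least increment of diversity as the decision rule but does not state its formula. The sketch below assumes the commonly used definitions, D(X) = N ln N - Σ n_i ln n_i for the measure of diversity of a count vector X with total N, and ID(X, Y) = D(X + Y) - D(X) - D(Y) for the increment of diversity; the thesis may use a variant. The function names and the class profiles in the usage note are hypothetical and serve only as illustration, they are not data from the thesis.

```python
import math


def diversity(counts):
    """Measure of diversity D(X) = N ln N - sum(n_i ln n_i), taking 0 ln 0 = 0.

    Assumed definition; the abstract itself does not give the formula.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """Increment of diversity ID(X, Y) = D(X + Y) - D(X) - D(Y).

    x and y are count vectors over the same states, for example the four
    secondary structure numbers named in the abstract.
    """
    if len(x) != len(y):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def predict_class(query, class_profiles):
    """Return the class whose profile gives the smallest increment of diversity."""
    return min(
        class_profiles,
        key=lambda name: increment_of_diversity(query, class_profiles[name]),
    )
```

A possible use, with made-up count vectors ordered as (Nα, Nβ, Nβαβ, N(βαβ)): `predict_class([8, 1, 0, 0], {"alpha": [900, 50, 5, 5], "beta": [60, 800, 10, 10]})` compares the query against each class profile and picks the closest one. Under these definitions the increment is zero when the two vectors are proportional and grows as their state distributions diverge, which is why the minimum is taken.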