Bioinformatics Studies On The Prediction Of Amphipathic Helical Region In Alpha Helical Transmembrane Protein Structures

Posted on: 2017-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: F Q Gao
GTID: 2370330590491509
Subject: Control Science and Engineering
Abstract/Summary:
A transmembrane protein is a membrane protein that is permanently attached to a biological membrane and spans it from one side to the other. There are two basic types of transmembrane proteins: alpha-helical proteins and beta-barrel proteins. Because wet-lab structural studies of membrane proteins are extremely difficult, accurate bioinformatics prediction of their topological structures is particularly important for understanding their functions. Much success has been achieved in the prediction of transmembrane segments over the past decades. However, few predictors are available for accurate prediction of amphipathic helices (AHs), which have been shown to play important roles in lipid binding, sensing membrane curvature, formation of membrane tubules, and interaction with other proteins. Here, we present a new statistical machine learning-based method that predicts the location of amphipathic helices from the amino acid sequence. To improve prediction accuracy, we use features derived from the position-specific scoring matrix (PSSM) of the protein, protein secondary structure, Z-coordinate, and hydrophobic moment, together with helix periodicity, a novel feature that we propose. We construct a relatively large benchmark data set of AHs. Three methods are proposed to weaken the effect of the extreme class imbalance in our training data: cutting off transmembrane segments using MemBrain, under-sampling, and classifier ensembles. Three ensemble models are trained after feature selection. Compared with the other methods, our ensemble algorithm discriminates better, and the ensemble classifier mitigates the loss caused by under-sampling and improves performance. Experimental results on the benchmark data sets show that our method predicts AH regions much better than the other methods. Our model achieves 40% accuracy on a benchmark data set composed of 119 AH segments from 72 TMH protein sequences, which is about 25-35 percentage points higher than the traditional hydrophobic moment-based detection method and other existing predictors.
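For readers unfamiliar with the traditional hydrophobic moment-based detection that the abstract uses as a baseline, the following is a minimal sketch of how a mean hydrophobic moment can be computed over a sliding window. It is not the thesis's method or feature set. The choice of the Kyte-Doolittle hydropathy scale, the 100-degree angle per residue (the usual value for an ideal alpha helix), the window length of 11, and the function names hydrophobic_moment and window_moments are all assumptions made for illustration.

```python
import math

# Kyte-Doolittle hydropathy values. ASSUMPTION: the abstract does not say
# which hydrophobicity scale is used; this one is chosen for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment of a segment.

    Each residue i contributes a vector of length H_i pointing at angle
    i * angle_deg around the helix axis; the moment is the length of the
    vector sum divided by the number of residues.
    """
    if not segment:
        raise ValueError("segment must not be empty")
    delta = math.radians(angle_deg)
    sin_sum = 0.0
    cos_sum = 0.0
    for i, residue in enumerate(segment):
        if residue not in scale:
            raise ValueError(f"unknown residue: {residue!r}")
        h = scale[residue]
        sin_sum += h * math.sin(i * delta)
        cos_sum += h * math.cos(i * delta)
    return math.hypot(sin_sum, cos_sum) / len(segment)


def window_moments(sequence, window=11, angle_deg=100.0):
    """Hydrophobic moment for every full window along the sequence.

    ASSUMPTION: window=11 is a hypothetical default, not a value taken
    from the thesis. Returns an empty list if the sequence is shorter
    than the window.
    """
    return [
        hydrophobic_moment(sequence[start:start + window], angle_deg)
        for start in range(len(sequence) - window + 1)
    ]
```

A uniformly hydrophobic segment has a moment near zero because its contributions cancel around the helix, whereas a segment with hydrophobic residues clustered on one face scores high. A simple baseline detector would flag windows whose moment exceeds some threshold; the threshold would have to be tuned on data and is deliberately left out here.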
Keywords/Search Tags: Bioinformatics, Transmembrane Protein, Amphipathic Helix, Machine Learning, Ensemble Classifier