Protein Quaternary Structure Prediction

Posted on: 2021-01-30 | Degree: Master | Type: Thesis
Country: China | Candidate: W J Chen | Full Text: PDF
GTID: 2370330602469785 | Subject: Statistics
Abstract/Summary:
The emergence of bioinformatics has greatly advanced the life sciences and provided ways to handle the massive, rapidly growing volume of biological data. Protein structure is a major research topic in bioinformatics: studying it helps explain how proteins work and, in turn, the mechanisms of life activities. Protein structure is divided into four levels, of which the quaternary structure is of particular importance for protein macromolecules. In recent years, researchers have done considerable work on predicting protein quaternary structure, but prediction accuracy is still not ideal. In this thesis, different feature extraction methods are combined with machine learning algorithms to build a two-layer predictor. The first layer identifies whether a protein is a monomer, a hetero-oligomer, or a homo-oligomer; the second layer then predicts the oligomer order, that is, how many subunits the protein contains. The main work of this thesis is as follows:
(1) An improved pseudo amino acid composition method is constructed for feature extraction. Combined with the nearest neighbor algorithm, it raises the overall prediction rate by 12.63% over the traditional pseudo amino acid composition method, to 67.81%, and raises the prediction rate for some second-layer categories by 20%.
(2) Features are extracted from the Gene Ontology database and the existing structural functional domain database, and classified with the random forest method. The overall prediction rate of this method reaches 74.38%, an improvement of 3.24% over previous research.
(3) A traditional neural network and a convolutional neural network are used to bring recent deep learning methods to quaternary structure prediction; the neural networks perform well on some data with small sample sizes.
In total, four different methods are used to analyze quaternary structure. The results show that, compared with traditional methods, quaternary structure prediction is improved to a certain extent and the expected results are achieved.
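The two-layer scheme with pseudo amino acid composition features and a nearest neighbor classifier can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's method: the abstract does not specify the improved feature extraction, so the sketch uses the classic pseudo amino acid composition with a single property (Kyte-Doolittle hydropathy). The parameters `lam` and `w`, all function names, and the toy sequences are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumption: one property only, for illustration).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

def _standardize(values):
    """Shift and scale the property to zero mean and unit variance over the 20 residues."""
    mean = sum(values.values()) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values.values()) / len(values))
    return {aa: (v - mean) / sd for aa, v in values.items()}

H = _standardize(KD)

def pseaac(seq, lam=3, w=0.05):
    """Classic pseudo amino acid composition: 20 frequencies plus lam order terms."""
    residues = [aa for aa in seq.upper() if aa in H]
    if len(residues) <= lam:
        raise ValueError("sequence must be longer than lam")
    counts = Counter(residues)
    freqs = [counts[aa] / len(residues) for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference between residues k positions apart.
        diffs = [(H[residues[i]] - H[residues[i + k]]) ** 2
                 for i in range(len(residues) - k)]
        thetas.append(sum(diffs) / len(diffs))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]

def nearest_label(x, examples):
    """1-nearest-neighbor: label of the example closest to x (Euclidean distance)."""
    return min(examples, key=lambda ex: math.dist(x, ex[0]))[1]

def predict_two_layer(seq, train):
    """train holds (sequence, first_layer_class, subunit_count) tuples."""
    x = pseaac(seq)
    feats = [(pseaac(s), cls, n) for s, cls, n in train]
    # Layer 1: monomer vs. homo-oligomer vs. hetero-oligomer.
    cls = nearest_label(x, [(f, c) for f, c, _ in feats])
    if cls == "monomer":
        return cls, 1
    # Layer 2: subunit count, searched only within the predicted class.
    n = nearest_label(x, [(f, n) for f, c, n in feats if c == cls])
    return cls, n

# Made-up toy sequences and labels, purely to exercise the code path.
TOY = [
    ("MKTAYIAKQRQISFVKSHFSRQ", "monomer", 1),
    ("GLLVVAILLGGAVLAALLVGGA", "homo-oligomer", 2),
    ("DEEKRKDDEEKRRKEEDDKKRE", "homo-oligomer", 4),
    ("MSTNPKPQRKTKRNTNRRPQDV", "hetero-oligomer", 2),
]
```

Calling `predict_two_layer(sequence, TOY)` returns a `(class, subunit_count)` pair. Restricting the second-layer search to the class chosen by the first layer is what makes the scheme hierarchical; a real predictor would train on a curated data set and tune `lam` and `w`.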
Keywords/Search Tags:Protein quaternary structure, Gene Ontology, BP, CNN, feature extraction