
Research on Protein-Aptamer Interaction Prediction and Aptamer Judgement Based on Bioinformatics

Posted on: 2021-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Ma
Full Text: PDF
GTID: 2480306560953569
Subject: Computer Science and Technology
Abstract/Summary:
Nucleic acid aptamers are nucleotide chains, about 30-80 nt in length, that bind a target with high affinity. Compared with ordinary DNA/RNA, they offer easy synthesis, high affinity, and high specificity. Proteins are an important component of cells and the main carriers of life activities. Interactions between aptamers and target proteins exist widely in the human body and play an important role in various life activities. Fast and accurate prediction of protein-aptamer interactions can help predict protein function, explore the pathogenesis of complex human diseases at the molecular level, and provide important theoretical support for disease diagnosis, treatment, and related drug development. Predicting nucleic acid aptamers that target proteins is therefore a hot topic in protein-related basic and applied research. Determining protein-aptamer interactions with traditional experimental methods takes a long time, is costly, and cannot be carried out at large scale. With the rapid development of high-throughput sequencing, the volume of measured protein sequence data has grown geometrically, and the drawbacks of experimental methods have become particularly prominent. There is an urgent need for bioinformatics-based computational methods that efficiently and accurately predict protein-aptamer interactions and judge nucleic acid aptamers.

This thesis applies bioinformatics methods to these two problems and proposes solutions. Dataset processing, feature-space construction, and algorithm selection and improvement were optimized to improve the performance of both the protein-aptamer interaction prediction model and the aptamer judgement model. An online service platform, PPAI, was also designed and implemented to provide information query together with these two functions.

First, building on an in-depth analysis of existing protein-aptamer interaction prediction methods, this thesis addresses a series of problems in predicting protein-aptamer interactions. The SMOTE algorithm was used to balance the imbalanced dataset before prediction. Based on an in-depth analysis of the sequence and structural characteristics of proteins and aptamers, a multi-feature extraction strategy was proposed to extract a series of key features related to physical and chemical properties. An algorithm combining AdaBoost and random forest was proposed for the first time. In comparisons with other protein-aptamer interaction prediction models, the experimental results show that the proposed model is superior to existing models in terms of prediction accuracy and algorithm complexity.

Second, the machine learning methods commonly used for aptamer judgement were studied in depth and found to suffer from defects such as low accuracy and difficulty in adjusting the judgement threshold. Based on a full analysis of the secondary structure characteristics of nucleic acid sequences, this thesis applies the combination of AdaBoost and random forest used for protein-aptamer interaction prediction to aptamer judgement for the first time. The final experimental results show that the proposed method achieves higher prediction accuracy, and allows easier adjustment of the judgement threshold, than other commonly used machine learning methods.

To facilitate in-depth research on protein-targeted aptamers, this thesis designs and implements an online service platform, PPAI (http://39.96.85.9/PPAI), which provides users with efficient and accurate protein-aptamer interaction prediction and aptamer judgement. The platform also provides a protein-aptamer information query function, which helps researchers look up information on proteins and nucleic acid aptamers.
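
The class-balancing step named in the abstract (SMOTE) can be illustrated with a minimal, standard-library-only sketch. This is not the thesis's implementation: the function name `smote`, its parameters, and the toy feature vectors are hypothetical and chosen only for illustration. The idea shown is the core of SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority-class samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors (hypothetical, not data from the thesis):
# 3 interacting (positive) pairs versus 7 non-interacting (negative) pairs.
positives = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
negatives = [[5.0 + 0.1 * i, 6.0 - 0.1 * i] for i in range(7)]

n_missing = len(negatives) - len(positives)
balanced_positives = positives + smote(positives, n_missing, k=2)
print(len(balanced_positives), len(negatives))  # prints: 7 7
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data.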
Keywords/Search Tags: protein, nucleic acid aptamer, protein-aptamer interaction prediction, AdaBoost, random forest, class imbalance problem