
A Machine Learning Study On The Thermostability Prediction Of (R)-ω-Selective Amine Transaminase From Aspergillus Terreus

Posted on: 2022-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: L L Jia
GTID: 2480306734487514
Subject: Applied Statistics
Abstract/Summary:
Chiral amines are important components of synthetic drugs, fine chemicals and agricultural chemicals. (R)-ω-transaminase ((R)-ω-TA) is a biocatalyst for the synthesis of enantiopure chiral amines, but few such enzymes are known in nature and their stability is relatively poor. Improving the thermostability of (R)-ω-TA, which has high application value, is a hot topic in protein engineering research and an important prerequisite for the preparation and synthesis of chiral amines. Screening for better (R)-ω-TA mutants by machine learning is a novel approach to improving their thermostability. Based on a library of (R)-ω-TA mutants from Aspergillus terreus, an innovative sequence-activity relationship (ISAR) method was used to predict (R)-ω-TA variants with high thermostability. The main work and achievements of this thesis are as follows:

(1) Starting from the amino acid sequences and half-lives of the wild-type and mutant (R)-ω-TA, a machine learning algorithm combining Digital Signal Processing (DSP) with the Fast Fourier Transform (FFT) was adopted. Partial Least-Squares Regression (PLSR) was then used to model the (R)-ω-TA mutant database and make predictions. A high-performing thermostability model was obtained, with a coefficient of determination R² of 0.8929 and a minimum cross-validated root mean square error (cvRMSE) of 4.89. In addition, several new combined mutants with longer half-lives were predicted.

(2) We propose an improved ISAR method in two variants. One combines the (R)-ω-TA mutant database with the highest-ranking index; the other combines the top-ranked index with an index that has not been transformed by FFT. After concatenating the data and stacking them horizontally, PLSR was used to build the model and predict (R)-ω-TA mutants. The robustness of the model was found to improve: the best R² and cvRMSE values for the optimized model were 0.9276 and 3.71, respectively. The results show that the improved ISAR method can screen out better models and mutants by using a sequence-space machine learning algorithm, providing a powerful solution to the current sequence exploration and combination problems in protein engineering.
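As a rough illustration of the workflow summarized in points (1) and (2), the following Python sketch encodes sequences with a residue index, takes the FFT magnitude spectrum, fits a single-response PLS regression, and compares FFT-only features with FFT and non-FFT features stacked horizontally. It is not the thesis's implementation. The Kyte-Doolittle hydropathy scale stands in for whichever index the thesis ranked highest, the parent sequence and "half-lives" are synthetic toy data, leave-one-out is an assumed cross-validation scheme, the stacking shown is only one possible reading of the improved method, and all function names are hypothetical.

```python
import numpy as np

# Kyte-Doolittle hydropathy scale: an assumed stand-in for the thesis's index.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def encode(seq, index):
    """Turn an amino acid sequence into a numeric signal via a residue index."""
    return np.array([index[aa] for aa in seq], dtype=float)

def fft_spectrum(seq, index):
    """Magnitude spectrum of the encoded signal; length is len(seq) // 2 + 1."""
    return np.abs(np.fft.rfft(encode(seq, index)))

def pls1_fit(X, y, n_components):
    """Single-response PLS regression (NIPALS); assumes y is not constant."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)      # weight vector
        t = Xk @ w                     # scores
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = (yk @ t) / tt              # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # coefficients on original features
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def r_squared(y, y_hat):
    """Coefficient of determination."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def loo_cv_rmse(X, y, n_components):
    """Leave-one-out cross-validated RMSE (assumed CV scheme)."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_components)
        errors.append(pls1_predict(model, X[i:i + 1])[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

def demo(n_mutants=30, n_components=3, seed=0):
    rng = np.random.default_rng(seed)
    parent = "MASMDKVFAGYAARQAILES"   # made-up toy sequence, not the real enzyme
    alphabet = list(HYDROPATHY)
    seqs = []
    for _ in range(n_mutants):
        residues = list(parent)
        for pos in rng.choice(len(residues), size=2, replace=False):
            residues[pos] = alphabet[rng.integers(len(alphabet))]
        seqs.append("".join(residues))
    X_fft = np.array([fft_spectrum(s, HYDROPATHY) for s in seqs])
    X_raw = np.array([encode(s, HYDROPATHY) for s in seqs])
    X_stacked = np.hstack([X_fft, X_raw])   # FFT and non-FFT features side by side
    # Synthetic target standing in for measured half-lives.
    y = 10.0 + 2.0 * X_fft[:, 1] + rng.normal(0.0, 0.1, size=n_mutants)
    results = []
    for X in (X_fft, X_stacked):
        model = pls1_fit(X, y, n_components)
        results.append(r_squared(y, pls1_predict(model, X)))
        results.append(loo_cv_rmse(X, y, n_components))
    return tuple(results)   # (R2_fft, cvRMSE_fft, R2_stacked, cvRMSE_stacked)

if __name__ == "__main__":
    print(demo())
```

With real data, the toy sequences and synthetic target in `demo` would be replaced by the mutant library and measured half-lives, and the number of PLS components would typically be chosen by minimizing the cross-validated RMSE.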
Keywords/Search Tags: machine learning, (R)-ω-transaminase, digital signal processing, thermostability