Study On The Intelligent Design Of Escherichia Coli Terminators

Posted on: 2024-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2530307160976449
Subject: Bioinformatics
Abstract/Summary:
A terminator is a specific nucleotide sequence located at the 3′ end of a gene that carries transcriptional termination information. As a fundamental gene regulatory element, the terminator is an important component in the design of gene circuits, and accurately characterizing terminator strength is of great significance for improving the accuracy of gene circuit design. Characterizing terminator strength experimentally is time-consuming and labor-intensive, so accurate computational tools for predicting terminator strength are needed. Current prediction methods do not fully consider the sequence information or the thermodynamic information of terminators, and accurate models for predicting terminator strength are lacking. At the same time, deep generative models have shown great potential in biological sequence design and are expected to be applicable to terminator design. This work studied the intelligent design of E. coli terminators, mainly as follows:

(1) This study analyzed the distribution of natural terminators in the E. coli genome. It found that the higher the termination efficiency of a terminator, the higher the expression of its upstream genes. Highly expressed genes are generally terminated by terminators with high termination efficiency, which prevents transcriptional readthrough from affecting their own expression and that of their downstream genes.

(2) To build a prediction model for the strength of intrinsic terminators in E. coli, this study extracted sequence features and thermodynamic features of terminators and selected features using three methods: the F-test, random forest, and mutual information. The selected features were then used in six machine learning methods: support vector regression, decision tree regression, random forest regression, extreme gradient boosting, K-nearest neighbors, and ridge regression. The support vector regression model had the best prediction performance, with an R² of 0.7204.

(3) This study used generative adversarial networks to learn from the training set of intrinsic terminators and generate artificial terminators. Evaluation showed that the data distributions of the generated terminators and the intrinsic terminators were similar, indicating that generative adversarial networks can reliably generate terminators. Because terminators with high termination efficiency play an important role in gene circuit design, this study used the termination strength prediction model constructed above to select them from the generated terminators. In experimental verification, 72% of the 18 selected terminators had a termination efficiency greater than 90%, indicating the effectiveness of the intelligent design of E. coli terminators in this study.

In conclusion, this study analyzed the distribution of natural terminators in the E. coli genome and constructed a strength prediction model and a generative model for E. coli terminators. This work provides model support for the design of terminators in gene circuits, enhances the design modularity of biological parts, and promotes the development of synthetic biology.
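The abstract reports model quality as an R² value and describes using the strength predictor to screen generated terminators before experimental verification. The following is a minimal, standard-library-only sketch of those two steps, not the thesis's implementation: `r_squared` is the usual coefficient of determination, while `screen_candidates`, `fraction_above`, the placeholder sequence names, and all scores are hypothetical and made up for illustration. In the thesis the predictor is a support vector regression model over sequence and thermodynamic features, which is not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def screen_candidates(
    sequences: Sequence[str],
    predict: Callable[[str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Keep candidates whose predicted strength is at least `threshold`,
    sorted from strongest to weakest prediction."""
    scored = [(seq, predict(seq)) for seq in sequences]
    kept = [item for item in scored if item[1] >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


def fraction_above(values: Sequence[float], cutoff: float) -> float:
    """Fraction of measured values strictly greater than `cutoff`."""
    if len(values) == 0:
        raise ValueError("values must be non-empty")
    return sum(1 for v in values if v > cutoff) / len(values)


if __name__ == "__main__":
    # Hypothetical stand-in for a trained predictor: a lookup table of
    # made-up scores keyed by placeholder candidate names.
    toy_scores = {"cand_a": 0.50, "cand_b": 0.95, "cand_c": 0.92}
    selected = screen_candidates(list(toy_scores), toy_scores.get, 0.90)
    print("selected:", selected)

    # Made-up "measured" efficiencies for the selected candidates.
    measured = [0.96, 0.88]
    print("fraction above 0.90:", fraction_above(measured, 0.90))
```

Keeping the predictor as a plain callable means any regression model with a single-sequence scoring function could be swapped in for the lookup table without changing the screening logic.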
Keywords/Search Tags: Terminator, Biological part design, Generative adversarial networks, Termination efficiency, Machine learning