
Prediction Of Essential Genes And Synthetic Lethal Genes In Yeast

Posted on: 2022-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: S X Wang
GTID: 2480306524982369
Subject: Biophysics
Abstract/Summary:
Essential genes are genes that play an important role in the survival and reproduction of a species, and identifying them is important for deciphering the survival mechanisms of life. With the development of machine learning and bioinformatics, researchers have achieved good results in predicting essential genes for humans and bacteria, but prediction for yeast has not reached the same high accuracy. This paper proposes a feature extraction method based on codon-specific virtual oligonucleotides of adjacent and space-related sequences. It assumes that there is a correlation between nucleotides separated by a certain distance, and that this correlation is exactly the relationship that exists in the tertiary structure of chromosomes. This paper uses the above feature extraction method to train a traditional machine learning model, a linear SVM, to predict the essential genes of yeast, and then uses the SVM-RFE+CBR method for feature selection: when k is 6 and ? is …, 1,800 of the 16,384 features were selected for model training. Grid search was used to optimize the model parameters, and 5-fold cross-validation gave an AUC of 0.944. Finally, the model was compared with the RSG method and with Seringhaus's research, and it performed well on multiple indicators.

Synthetic lethality is an interaction between genes, and synthetic lethal interactions can be used to screen drug targets in cancer treatment. To identify more synthetic lethal genes, this paper uses yeast gene pairs with and without synthetic lethal interactions, obtained using SGA technology, as the positive and negative samples. To achieve the best prediction performance, we first select the most suitable parameters, with k in the range 1-7 and ? in the range 0-6. We then select a linear SVM by comparing different models. Feature selection was attempted, but it did not significantly improve the performance of the model. Finally, the model parameters were optimized: on the independent test set, with the parameter C set to 3, an AUC of 0.850 is achieved. A deep learning model was also tried, but it did not achieve ideal prediction performance.
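
The abstract does not define its features precisely, so the following is only a minimal sketch of one plausible reading of "nucleotides separated by a certain distance": the frequency of each k-mer paired with the single nucleotide found a fixed number of positions after it. The function name `gapped_kmer_features` and the parameter name `gap` are assumptions made for illustration, and the codon-specific part of the thesis method is not modeled here. Under this assumed definition, k = 6 yields 4^7 = 16,384 possible features, which agrees with the feature count quoted in the abstract, although the thesis's actual definition may differ.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def gapped_kmer_features(seq, k, gap):
    """Return frequencies of (k-mer, nucleotide) pairs in `seq`.

    Assumed definition (not taken from the thesis): the partner nucleotide
    lies `gap` positions after the end of the k-mer, so gap=0 means the
    nucleotide is directly adjacent to the k-mer.
    """
    seq = seq.upper()
    counts = Counter()
    total = 0
    # The last valid start index i satisfies i + k + gap <= len(seq) - 1.
    for i in range(len(seq) - k - gap):
        key = seq[i:i + k] + seq[i + k + gap]
        if all(base in ALPHABET for base in key):  # skip ambiguous bases
            counts[key] += 1
            total += 1
    # Fixed lexicographic ordering so every sequence maps to the same layout.
    keys = ["".join(p) for p in product(ALPHABET, repeat=k + 1)]
    return [counts[key] / total if total else 0.0 for key in keys]


# Hypothetical toy sequence, used only to show the call shape.
features = gapped_kmer_features("ATGGCCAAGTAA", k=2, gap=1)
print(len(features))  # 4 ** (2 + 1) = 64
```

Vectors of this kind, one per gene, could then be passed to a linear SVM and scored by cross-validated AUC, as the abstract describes.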
Keywords/Search Tags:essential gene prediction, machine learning, lethal gene prediction, interspaced nucleotide association, deep learning