| Synthetic lethality (SL) refers to the phenomenon in which simultaneous defects in the expression of two or more non-essential genes lead to cell death, whereas a defect in any one of these genes alone does not. SL genes have broad application prospects in cancer medicine, synthetic biology, and evolutionary biology. Although there has been considerable basic research on SL genes and large primary databases have been constructed, secondary databases focusing on microbial SL genes are lacking. Current methods for predicting SL genes mainly use multi-omics feature data to predict SL gene pairs related to cancer and drug targets, providing new ideas for anti-cancer drug development. This thesis aims to establish a secondary database of microbial SL genes that also includes data on synthetic rescue genes. The database is a specialized resource containing SL gene interaction information for multiple microbes. Each gene interaction record includes functional annotation, protein sequence, experimental measurement methods, cell culture conditions, and the corresponding literature. The database currently holds 16,307 records from nine microbial strains, comprising 12,165 SL gene pairs, 1,148 SL gene triplets, and 2,994 synthetic rescue gene pairs. In addition, 86,981 putative SL records obtained by the homologous transfer method are displayed on the website. The database website implements search, browsing, visualization, and sequence alignment functions. In this thesis, protein-protein interaction network topological parameters, gene expression data, single-gene essentiality, sequence alignment similarity, and other features were used to predict potential SL pairs in Saccharomyces cerevisiae. Under five-fold cross-validation, the Support Vector Machine (SVM) model achieved an AUC (area under the ROC curve) of 0.7, while the decision tree method reached an AUC of 0.78. Although the performance of the prediction model in this thesis is moderate, it is expected to provide new methods and ideas for predicting SL genes related to cancer genes. |
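
The evaluation protocol named in the abstract, five-fold cross-validation scored by AUC, can be sketched in plain Python. This is only an illustrative sketch, not the thesis's implementation: the function names `roc_auc`, `k_fold_indices`, and `cross_validated_auc`, the `fit` callback, and the random (unstratified) fold assignment are all assumptions made for this example, and any classifier such as an SVM or a decision tree would have to be supplied by the caller.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint, near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_auc(features, labels, fit, k=5, seed=0):
    """Mean held-out AUC over k folds.

    `fit(train_X, train_y)` is a hypothetical callback that trains a model and
    returns a function mapping one feature vector to a real-valued score.
    """
    aucs = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        score = fit([features[i] for i in train], [labels[i] for i in train])
        aucs.append(roc_auc([labels[i] for i in held_out],
                            [score(features[i]) for i in held_out]))
    return sum(aucs) / len(aucs)
```

Because the folds here are drawn at random, a fold may contain only one class, in which case `roc_auc` raises an error; with strongly imbalanced SL data, a stratified split would likely be the safer choice.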