The Prediction Of Off-target Effects And On-target Efficiency For CRISPR/Cas9 System Based On Machine Learning

Posted on: 2021-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: H B Xu
GTID: 2370330647461938
Subject: Computer Science and Technology
Abstract/Summary:
As a new generation of site-directed gene-editing technology, CRISPR/Cas9 has been widely used in gene editing because of its low cost and simple operation. At present, the two major problems facing the CRISPR/Cas9 system are how to minimize its off-target effects and how to maximize the on-target efficiency of the CRISPR/Cas9 sgRNA at the target DNA. Accurate prediction of off-target effects and on-target efficiency can guide the practical application of CRISPR/Cas9, and the rapid development of machine learning provides new ideas and methods for both problems. This thesis studies the prediction of off-target effects and on-target efficiency for the CRISPR/Cas9 system in depth and proposes two new methods. The main research contents are as follows:

(1) The off-target effect of CRISPR/Cas9 refers to CRISPR/Cas9 incorrectly targeting non-target DNA sites, resulting in erroneous cleavage. To predict the off-target effect quickly and accurately, this thesis takes ensemble learning as its core idea: it proposes a new encoding method, integrates four existing CRISPR/Cas9 off-target scores, and builds a predictor with an XGBoost model. The predictor performs well on both the training set and an independent test set, and is reported to improve significantly on existing tools.

(2) The on-target efficiency of CRISPR/Cas9 refers to how efficiently CRISPR/Cas9, guided by a specific sgRNA, knocks out the targeted DNA. CRISPR/Cas9 identifies potential target sites by recognizing PAM sequences, and the Cas9 nuclease is directed to act through complementary base pairing between the sgRNA and the target site. However, the efficiency of CRISPR/Cas9 varies greatly across target sites. This thesis compares two sequence encoding methods and, by combining the ensemble idea with a variety of existing tools, constructs a CNN-XGBoost model that predicts the on-target efficiency of CRISPR/Cas9 with good performance on the dataset.
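The abstract does not specify the encoding it proposes, so the following is only a minimal sketch of how a guide/target pair might be turned into a numeric feature vector for a tree-based model such as XGBoost. Everything here is an assumption made for illustration: the function names are hypothetical, the guide is written in the DNA alphabet, the site is taken to be 20 nt of protospacer followed by a 3-nt PAM, and the PAM is assumed to be the canonical NGG of SpCas9. The example guide and sites are placeholders, not data from the thesis.

```python
# Illustrative sketch only: this is NOT the encoding proposed in the thesis.
# Assumptions: 20-nt guide in DNA letters, 23-nt site (20-nt protospacer + 3-nt PAM),
# canonical SpCas9 "NGG" PAM. All function names are hypothetical.

BASES = "ACGT"


def one_hot(seq):
    """Return one 4-element one-hot vector per base (all zeros for non-ACGT letters)."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def has_ngg_pam(site):
    """Check that a 23-nt site ends in an NGG PAM (N = any base)."""
    site = site.upper()
    return len(site) == 23 and site[-2:] == "GG"


def mismatch_positions(guide, site):
    """Return 0-based positions where the guide and the 20-nt protospacer differ."""
    pairs = zip(guide[:20].upper(), site[:20].upper())
    return [i for i, (g, t) in enumerate(pairs) if g != t]


def pair_features(guide, site):
    """Flatten one-hot guide, one-hot protospacer, and per-position mismatch flags."""
    mismatches = set(mismatch_positions(guide, site))
    features = []
    for vec in one_hot(guide[:20]):
        features.extend(vec)  # 20 x 4 guide features
    for vec in one_hot(site[:20]):
        features.extend(vec)  # 20 x 4 protospacer features
    features.extend(1 if i in mismatches else 0 for i in range(20))  # 20 flags
    return features


if __name__ == "__main__":
    guide = "GAGTCCGAGCAGAAGAAGAA"           # placeholder 20-nt guide
    on_site = "GAGTCCGAGCAGAAGAAGAAGGG"      # perfect match + GGG PAM
    off_site = "GAGTTAGAGCAGAAGAAGAAAGG"     # two mismatches + AGG PAM
    print(has_ngg_pam(on_site), mismatch_positions(guide, on_site))
    print(has_ngg_pam(off_site), mismatch_positions(guide, off_site))
    print(len(pair_features(guide, off_site)))
```

A vector like this could be concatenated with scores from existing tools before being passed to a classifier or regressor, which is one common way to realize the "integration" idea the abstract describes; the thesis itself may do this differently.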
Keywords/Search Tags: CRISPR/Cas9, off-target effects, on-target efficiency, XGBoost, CNN