Method For Off-Target Prediction Of CRISPR/Cas9 System Based On Feature Fusion And Anti-Noise Model

Posted on: 2022-11-29; Degree: Master; Type: Thesis
Country: China; Candidate: Z R Zhang; Full Text: PDF
GTID: 2480306776492954; Subject: Automation Technology
Abstract/Summary:
The CRISPR/Cas9 gene-editing system is a third-generation gene-editing technology that is widely used in biomedicine. However, its off-target effects remain a major obstacle to practical application. Investigators can quantify off-target effects through various assays, but extending these assays to all gRNAs on a genome-wide scale is not feasible because of time, cost, and other factors. Computational off-target prediction is therefore an effective way to improve gRNA screening efficiency. Many models for off-target prediction of CRISPR/Cas9 systems have been proposed. Although these models have achieved excellent results, problems remain in exploiting gRNA-DNA sequence pair features, handling noisy data, and fusing handcrafted features. This thesis addresses these three problems. The main contents and innovations are as follows:
(1) Because current coding schemes and models do not fully express and exploit sequence pair information, a new coding scheme and a new model are proposed. The coding scheme adds action channels that distinguish the different action regions in the gRNA-DNA sequence pair, so that the sequence pair information is fully expressed, and it modifies the encoding method to reduce the sparsity of the encoded matrix. In the model, a feature fusion module fuses high-level and low-level features to improve off-target prediction performance.
(2) To address noisy data in the dataset, an anti-noise CRISPR-IP model is proposed. Samples are assigned different reliability levels according to the results of biochemical experiments. Low-reliability samples are sampled multiple times and combined with the high-reliability samples to construct multiple data subsets, which smooths the noise. The CRISPR-IP model is then trained with the generalized cross-entropy loss function, which is robust to noise, to reduce the impact of noisy data and improve the model's generalization.
(3) Because deep network models have difficulty integrating handcrafted features for off-target prediction, an ensemble model is proposed. The ensemble is built on XGBoost, LightGBM, and CRISPR-IP models; it makes effective use of handcrafted features and improves prediction performance. It also offers a research direction for integrating handcrafted features into deep learning models for off-target prediction.
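
To illustrate why the generalized cross-entropy (GCE) loss mentioned in (2) is considered robust to label noise, here is a minimal sketch of its commonly used form, L_q = (1 - p^q) / q, where p is the probability the model assigns to the labelled class. The abstract does not give the thesis's exact formulation or hyperparameters, so the function names, the binary off-target framing, and the default q = 0.7 are assumptions made only for illustration.

```python
import math


def generalized_cross_entropy(p_true: float, q: float = 0.7) -> float:
    """GCE loss (1 - p_true**q) / q for one sample.

    p_true is the predicted probability of the labelled class.
    q = 0.7 is an illustrative default, not a value taken from the thesis.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    if not 0.0 <= p_true <= 1.0:
        raise ValueError("p_true must be a probability")
    return (1.0 - p_true ** q) / q


def cross_entropy(p_true: float) -> float:
    """Standard cross-entropy for one sample, shown for comparison."""
    return -math.log(p_true)


def binary_gce(prob_off_target: float, label: int, q: float = 0.7) -> float:
    """Hypothetical helper: GCE for a binary off-target label (1) or on-target label (0)."""
    p_true = prob_off_target if label == 1 else 1.0 - prob_off_target
    return generalized_cross_entropy(p_true, q)


if __name__ == "__main__":
    # A confidently "wrong" prediction, as may happen for a mislabelled sample.
    p = 0.01
    print("cross-entropy:", cross_entropy(p))
    print("GCE (q=0.7):  ", generalized_cross_entropy(p, 0.7))
```

The loss is bounded above by 1/q, so a single mislabelled sample cannot dominate training the way it can under standard cross-entropy, whose loss grows without bound as p approaches 0. As q approaches 0 the expression approaches cross-entropy, and at q = 1 it reduces to 1 - p.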
Keywords/Search Tags:CRISPR, off-target prediction, coding schemes, neural networks, ensemble learning