Calling Genomic Copy Number Variation Based On Deep Learning

Posted on: 2021-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: M S Ding
GTID: 2404330605971632
Subject: Computer Science and Technology
Abstract/Summary:
With the development and application of DNA sequencing technology, humans are exploring the secrets of heredity at the genetic level. Copy number variation (CNV) is an important type of structural variation associated with many complex human diseases, so improving the efficiency of CNV calling from sequencing data has become a necessity. Conventional CNV detection methods are limited in data compatibility and fall short in both accuracy and sensitivity. We propose a novel method based on deep learning to detect CNV. The main contents of our research are as follows:
(1) Study of sequencing data formats and simulation. We examine the processing of Next Generation Sequencing (NGS) and Third Generation Sequencing (3GS) data from sequencing through alignment to detection, covering the data format of each step and how it is processed. We also investigate CNV simulation techniques, both to expand the number of available CNVs and to provide a comparison with real data, in order to understand the features of CNV fully. This work lays the foundation for detecting and visualizing CNV.
(2) Visualization of sequencing data. Building on the study of sequence data, we mine the features of CNV in genetic data and convert them into a more readable image format that serves as input to a deep learning network. Drawing on the four strategies for detecting structural variation (paired-end mapping, read depth, split-read and assembly), we study visualization methods that emphasize the features of CNV, which prepares the data for classification by a convolutional neural network model.
(3) Integrated detection of CNV. To improve the accuracy and sensitivity of CNV detection, we raise sensitivity by integrating the results of multiple detection tools. The convolutional neural network model is then used to classify the set of candidate variant sites and eliminate false positives, yielding the final detection result.
(4) Detection on NGS and 3GS data. The image-based method displays all reads in a region and therefore has a higher tolerance for data differences between platforms. It has been shown to work for SNP and indel detection on NGS data from different platforms, but it had not yet been verified for CNV detection on NGS and 3GS data. The experimental results show that the integrated image-based CNV detection method is effective for both NGS and 3GS data, and is superior to any single existing tool in accuracy and sensitivity.
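The integration step in (3) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the `Call` and `Candidate` records, the tool names, and the `score` callback (a stand-in for the convolutional neural network classifier) are all assumptions made for illustration. The sketch takes the union of overlapping calls from several tools to form a sensitive candidate set, then keeps only the candidates that the classifier scores at or above a threshold.

```python
from typing import Callable, FrozenSet, List, NamedTuple


class Call(NamedTuple):
    """One CNV call from one detection tool (hypothetical record layout)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    kind: str   # e.g. "DEL" or "DUP"
    tool: str   # name of the tool that reported the call


class Candidate(NamedTuple):
    """A merged candidate region and the set of tools supporting it."""
    chrom: str
    start: int
    end: int
    kind: str
    tools: FrozenSet[str]


def merge_calls(calls: List[Call]) -> List[Candidate]:
    """Union overlapping or directly adjacent calls of the same type."""
    merged: List[Candidate] = []
    ordered = sorted(calls, key=lambda c: (c.chrom, c.kind, c.start, c.end))
    for call in ordered:
        last = merged[-1] if merged else None
        if (last is not None
                and last.chrom == call.chrom
                and last.kind == call.kind
                and call.start <= last.end):
            # Extend the current candidate and record the supporting tool.
            merged[-1] = Candidate(last.chrom, last.start,
                                   max(last.end, call.end), last.kind,
                                   last.tools | {call.tool})
        else:
            merged.append(Candidate(call.chrom, call.start, call.end,
                                    call.kind, frozenset({call.tool})))
    return merged


def filter_candidates(candidates: List[Candidate],
                      score: Callable[[Candidate], float],
                      threshold: float = 0.5) -> List[Candidate]:
    """Keep candidates whose classifier score reaches the threshold.

    `score` is a placeholder for a trained CNN applied to the image
    generated for each candidate region.
    """
    return [c for c in candidates if score(c) >= threshold]
```

Taking the union favors sensitivity, since a variant found by any one tool becomes a candidate; the classifier then restores precision by rejecting false positives. The default threshold of 0.5 is an arbitrary choice for the sketch.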
Keywords/Search Tags: copy number variation, sequencing, image generation, convolutional neural network, integrated detection