De Novo Mutation Detection Method Based On High-Throughput Sequencing Data

Posted on: 2018-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: W H Xing
GTID: 2310330536481919
Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of sequencing technology, genetic data can be obtained at lower cost and higher speed, and the era of the individual genome has arrived. Acquiring and storing genome data is no longer the bottleneck. The problem that remains is how to mine valuable information from large amounts of data, especially variant data related to human health. De novo mutations are an important class of variants and an important factor in many genetic diseases and psychiatric disorders. Because the number of de novo mutations is small, they are difficult to detect and filter. Many tools and methods have been proposed to detect de novo mutations, and despite their success they still have a number of deficiencies. Based on high-throughput sequencing data, this thesis focuses on detecting and filtering de novo indels. The main contents are as follows:

(1) We apply the AdaBoost method to de novo indel detection and select 55 features related to de novo indel detection to extract the feature data set. We use local de novo alignment in the model and implement it as DNINDELFilter. To train and tune the model, we use real de novo indel data sets that have been validated by biological experiments, together with CEU trio data sets. The model has high classification credibility and overcomes the high false positive rate of existing methods while maintaining a high true positive rate.

(2) We develop a lightweight visualization tool and apply a visualization method to de novo indel detection. The tool greatly simplifies operation and focuses on the region covering the candidate sites, extracting the read features of de novo indels that can be used in visualization. It can help to filter de novo indels and to discover new features that can be used to detect them.
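To make the AdaBoost filtering idea in contribution (1) concrete, the following is a minimal sketch of AdaBoost with decision stumps, written in plain Python. It is not the DNINDELFilter implementation. The three features (child alternate-read fraction, parental alternate-read count, mean mapping quality) are hypothetical stand-ins for the 55 features mentioned in the abstract, and the six candidate rows are invented toy values used only to show the mechanics.

```python
import math

# Toy candidates (invented for illustration, not data from the thesis).
# Hypothetical features: [child_alt_read_fraction, parent_alt_read_count, mean_mapq]
X = [
    [0.48, 0, 58],
    [0.52, 0, 60],
    [0.45, 1, 55],
    [0.12, 0, 30],
    [0.47, 4, 57],
    [0.08, 3, 25],
]
# +1 = true de novo indel, -1 = false positive (assumed labels)
y = [1, 1, 1, -1, -1, -1]


def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) of the best weighted stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    """Apply a single threshold rule to one feature vector."""
    _, j, thr, polarity = stump
    return polarity if row[j] >= thr else -polarity


def adaboost_train(X, y, rounds=3):
    """Fit a weighted ensemble of stumps; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid division by zero for a perfect stump
        if err >= 0.5:  # no stump better than chance: stop early
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


model = adaboost_train(X, y, rounds=3)
print([adaboost_predict(model, row) for row in X])
```

In this toy set no single threshold separates the two classes, so the ensemble has to combine several weak rules. That is the property that makes boosting attractive when individual read-level features are only weakly informative.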
Keywords/Search Tags: de novo mutation, high-throughput sequencing data, AdaBoost, genome visualization