Study On Detection Method Of Copy Number Variations Based On Parent-offspring Trios Gene Sequencing Data

Posted on: 2018-06-01  Degree: Master  Type: Thesis
Country: China  Candidate: H C Dong  Full Text: PDF
GTID: 2310330536481716  Subject: Computer technology
Abstract/Summary:
The development of next-generation sequencing, also known as high-throughput sequencing, has steadily reduced both the cost and the time of sequencing. Detecting copy number variation (CNV) is an important research topic in this field. CNV underlies differences between individuals and is also implicated in the occurrence of many diseases. Detecting CNV accurately from high-throughput sequencing data is both the focus and the main difficulty of this research. Many methods for detecting genomic CNV have been proposed, but their accuracy remains low, and most of them work on sequencing data from a single sample, so they cannot accurately detect inherited CNV or de novo CNV. Whole-genome sequencing data from parent-offspring trios plays a very important role in the study of genetic disease types and CNV. To address the low accuracy of existing single-sample methods, this thesis studies the detection of CNV in parent-offspring trios.

The existing read-depth probability models are analyzed and fitted to real sequencing data. After a comprehensive evaluation of each model's advantages and disadvantages, the best-performing model is selected for detection. Based on a study of the properties of paired-end mapping in real sequencing data, a clustering algorithm based on paired-end mapping is designed. Based on an analysis of allele frequency information at SNV loci in real sequencing data, a model is constructed that fits the allele frequencies with a beta-binomial distribution.

A CNV detection system based on parent-offspring trio sequencing data is then designed and described in detail. In this system, a hidden Markov model combines the read-depth probability model with the allele-frequency probability model, and the emission probability is calculated from these two models. The clustering algorithm based on paired-end mapping information is added to the system as a post-processing step. The system supports simultaneous CNV detection across the three samples of a parent-offspring trio, with the aim of improving detection performance and accurately detecting both inherited and de novo CNV.
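The way a hidden Markov model can combine a read-depth model with a beta-binomial allele-frequency model in its emission probability can be illustrated with a minimal sketch. It is not the system described in the thesis. Everything specific here is an assumption made for illustration: a single sample rather than a trio, a Poisson read-depth model standing in for whichever model the thesis selected, three copy-number states, one SNV per window, and the hypothetical names `STATES`, `ALT_FRACTIONS`, `CONCENTRATION`, `log_emission`, and `viterbi`.

```python
import math

# Illustrative assumptions, not values from the thesis.
STATES = (1, 2, 3)                      # candidate copy numbers
ALT_FRACTIONS = {1: (0.01, 0.99),       # expected alt-allele fractions per state
                 2: (0.5,),
                 3: (1 / 3, 2 / 3)}
CONCENTRATION = 50.0                    # beta-binomial overdispersion (a + b)

def log_poisson(k, lam):
    """Log-probability of k reads under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_beta_binomial(k, n, a, b):
    """Log-probability of k alt reads out of n under BetaBinomial(n, a, b)."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def log_emission(obs, copy_number, mean_depth):
    """Sum of the read-depth term and the allele-frequency term (in log space)."""
    depth, alt, total = obs
    depth_term = log_poisson(depth, mean_depth * copy_number / 2.0)
    # Equal-weight mixture over the alt-allele fractions expected for this state.
    parts = [log_beta_binomial(alt, total, mu * CONCENTRATION,
                               (1.0 - mu) * CONCENTRATION)
             for mu in ALT_FRACTIONS[copy_number]]
    top = max(parts)
    allele_term = top + math.log(sum(math.exp(p - top) for p in parts) / len(parts))
    return depth_term + allele_term

def viterbi(observations, mean_depth, stay=0.99):
    """Most likely copy-number path for windows of (depth, alt, total) counts."""
    n = len(STATES)
    switch = (1.0 - stay) / (n - 1)
    def log_trans(i, j):
        return math.log(stay if i == j else switch)
    score = [math.log(1.0 / n) + log_emission(observations[0], s, mean_depth)
             for s in STATES]
    back = []
    for obs in observations[1:]:
        new_score, pointers = [], []
        for j, s in enumerate(STATES):
            best = max(range(n), key=lambda i: score[i] + log_trans(i, j))
            pointers.append(best)
            new_score.append(score[best] + log_trans(best, j)
                             + log_emission(obs, s, mean_depth))
        score = new_score
        back.append(pointers)
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):     # trace the best path backwards
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[i] for i in path]

# Toy input: normal windows, a run with 1.5x depth and ~1/3 alt fraction, normal again.
windows = [(30, 15, 30)] * 5 + [(45, 15, 45)] * 8 + [(30, 15, 30)] * 5
print(viterbi(windows, mean_depth=30.0))
```

A trio-aware version would need a joint state over the three samples, or transition and emission terms that encode inheritance. That part is not sketched here because the abstract does not specify how the thesis does it.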
Keywords/Search Tags: high-throughput sequencing, copy number variation, probability model, parent-offspring trios, detection system