Research On Haplotype Inference And Imputation Based On Block Partition

Posted on: 2012-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2178330338492033
Subject: Computer software and theory
Abstract/Summary:
As the Human Genome Project and the International HapMap Project were carried out, the study of genomics, a concept first proposed in 1986, became systematic. Biological experiments have generated large amounts of genetic data, but these data contain missing alleles, and some of them cannot be used directly by biologists. Resequencing is possible but costs a great deal of time and money. Instead, many computational algorithms have been designed to predict the missing alleles and convert the data into a form that biologists can use. We study two problems in haplotype inference and imputation. The main contributions include:

1. Haplotype Inference in Population Data

Statistical methods are an important way to solve the haplotype inference problem, but they have difficulty solving it accurately on large-scale data. To handle large-scale data, we use block partition based on multilocus association, which has been shown to be more effective than pairwise linkage disequilibrium. Multilocus association could previously be applied only to haplotype data, so we adopt a sliding-window strategy: haplotypes are inferred within each window before multilocus association is assessed. We compare our method with several algorithms on real and simulated data. The results show that the running time of our method is close to that of most of them, and its accuracy is higher. On the 5q31 data, our method, EPLEM, has lower IER and SER than every other method considered in this thesis; IER in particular is reduced by 1-9%, and the running time of our method is 7.8 s.

2. Imputation Based on Block Partition Without Reference Haplotypes

At present, imputation methods rely mainly on related haplotype data available in databases such as HapMap. In some cases, however, no such haplotypes can be found, and we must rely solely on the information contained in the test data. We apply block partition here as well: the test data within a block can be divided into two kinds, intact and missing, and the intact part can be converted into reference haplotypes. We use simulated data to compare our method with the one proposed by Jung. For data with weak association, the accuracy of our method improves by 1-2%; for data with strong association, it improves by 7-10%.
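The idea of letting the intact part of a block serve as its own reference can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: genotypes are coded 0/1/2 with `None` for a missing call, the block boundaries are taken as given, and a simple nearest-neighbour rule on genotypes stands in for the haplotype-based procedure that the thesis actually uses. The function names `split_block`, `impute_row`, and `impute_block` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed coding: 0, 1, 2 = allele count at a SNP; None = missing call.
Genotype = List[Optional[int]]


def split_block(block: List[Genotype]) -> Tuple[List[Genotype], List[Genotype]]:
    """Separate the rows of one block into intact rows and rows with gaps."""
    intact = [row for row in block if None not in row]
    missing = [row for row in block if None in row]
    return intact, missing


def impute_row(row: Genotype, reference: List[Genotype]) -> Genotype:
    """Fill gaps from the reference row that agrees best on the observed sites."""
    if not reference:
        return list(row)  # no intact row to borrow from; leave gaps untouched

    def agreement(ref: Genotype) -> int:
        # Count observed sites where the two rows carry the same genotype.
        return sum(1 for a, b in zip(row, ref) if a is not None and a == b)

    best = max(reference, key=agreement)  # on ties, the first such row is used
    return [b if a is None else a for a, b in zip(row, best)]


def impute_block(block: List[Genotype]) -> List[Genotype]:
    """Impute every incomplete row of a block using only the block's intact rows."""
    intact, _ = split_block(block)
    return [impute_row(row, intact) if None in row else list(row) for row in block]


if __name__ == "__main__":
    block = [
        [0, 1, 2, 0],
        [2, 1, 0, 0],
        [0, 1, None, 0],
        [2, None, 0, 0],
    ]
    for row in impute_block(block):
        print(row)
```

The sketch copies calls from the single closest intact row. A method working on phased haplotypes within strongly associated blocks would use more of the block's structure, which is consistent with the larger gains the abstract reports for strongly associated data.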
Keywords/Search Tags: genotype, haplotype, linkage disequilibrium, multilocus association, imputation, block partition