Research On Haplotype Inference Algorithm Based On Segmentation

Posted on: 2016-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: M X Zhao
GTID: 2180330470978511
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the completion of the Human Genome Project, it has become clear that small changes in genes can cause large differences in the traits of individuals or species. Most species in nature are diploid, so biological experiments yield genotype data, where each genotype is the combination of two haplotypes. A haplotype, however, is the fragment that is actually inherited, and research on inter-species similarity and on particular diseases needs to compare the haplotype sequences of healthy individuals with those of diseased individuals. Haplotype inference algorithms therefore play an important role in research on genetic diversity and genetic diseases.

Currently, the mainstream haplotype inference algorithms are the Clark algorithm, the EM algorithm, the Hapar algorithm and the Phase algorithm. In this thesis, a blocking strategy and a Hamming distance strategy are introduced into the Hapar and Phase algorithms, improving both the accuracy and the speed of inference. The main contents are as follows:

(1) Data preprocessing. Most experimentally obtained data are not in the input file format that the haplotype inference algorithms expect. The genotype data must therefore be preprocessed into the input file format required by each haplotype inference algorithm.

(2) A Hapar algorithm with a blocking strategy is proposed. When the SNP sequences in the genotype data are very long, the Hapar algorithm takes a long time to run and may even stop working. Blocking is therefore applied to the Hapar algorithm: each genotype is divided into short SNP segments so that Hapar runs normally and the running time is reduced.

(3) A Phase haplotype inference algorithm with a blocking strategy and a max Hamming distance strategy is proposed. As the number of heterozygous loci in the genotype data grows, the Phase algorithm can no longer infer all of the heterozygous loci. A max Hamming distance strategy is therefore added to the Phase algorithm to improve its accuracy, and a blocking strategy is added to reduce its running time.
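
The blocking and Hamming distance ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and the abstract does not say how blocks are chosen or how the max Hamming distance is applied inside Phase. The sketch rests on several assumptions: the 0/1/2 genotype encoding (0 and 1 for the two homozygous states, 2 for a heterozygous locus), the fixed block size, and the hypothetical function names `split_into_blocks`, `hamming_distance` and `resolves`.

```python
from typing import List, Sequence

HET = 2  # assumed code for a heterozygous locus (0 and 1 are homozygous)


def split_into_blocks(genotype: Sequence[int], block_size: int) -> List[List[int]]:
    """Cut one genotype into consecutive blocks of at most block_size SNPs."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [list(genotype[i:i + block_size])
            for i in range(0, len(genotype), block_size)]


def hamming_distance(h1: Sequence[int], h2: Sequence[int]) -> int:
    """Number of loci at which two equal-length haplotypes differ."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return sum(a != b for a, b in zip(h1, h2))


def resolves(genotype: Sequence[int], h1: Sequence[int], h2: Sequence[int]) -> bool:
    """True if the haplotype pair (h1, h2) combines into the given genotype."""
    if not (len(genotype) == len(h1) == len(h2)):
        return False
    for g, a, b in zip(genotype, h1, h2):
        if g == HET:
            if a == b:          # heterozygous locus needs two different alleles
                return False
        elif not (a == b == g):  # homozygous locus needs both alleles equal to g
            return False
    return True


genotype = [0, 2, 1, 2, 2, 0, 1]
h1 = [0, 0, 1, 1, 0, 0, 1]
h2 = [0, 1, 1, 0, 1, 0, 1]

blocks = split_into_blocks(genotype, 3)   # [[0, 2, 1], [2, 2, 0], [1]]
ok = resolves(genotype, h1, h2)           # True
distance = hamming_distance(h1, h2)       # 3
```

In this example the Hamming distance between the two haplotypes equals the number of heterozygous loci in the genotype, which holds for any pair that resolves a genotype under this encoding. A blocked inference run would pass each block to the inference algorithm separately and then join the per-block results; that joining step is not described in the abstract and is left out here.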
Keywords/Search Tags: Haplotypes, Genotypes, Haplotype Inference Algorithm, Hapar, Phase