Study On The Methodology Of Genomic Prediction By Integrating Biological Knowledge

Posted on: 2019-06-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Gao
GTID: 1360330563985032
Subject: Genetics
Abstract/Summary:
Genomic selection (GS) is regarded as a promising technology for plant and animal breeding programs. However, current standard methods rest purely on statistical considerations and make no use of biological knowledge, which is readily available in public databases. Abundant knowledge has accumulated about the genomic positions of protein-coding genes, important genomic regions containing QTLs, and gene interaction networks. Previous studies found that genomic selection can be improved by incorporating known biological knowledge. The major questions to answer before biological prior information can be used routinely in GS approaches are which types of information can be used and at which point they can be incorporated into prediction methods.

In this study, we 1) incorporated gene annotation into genomic selection by constructing haplotypes according to gene position and calculating genetic relationships between individuals from those genic haplotypes; 2) measured the similarity between haplotype alleles and translated the haplotype similarity into individual similarity, which was used in genomic selection; and 3) restricted gene interactions to those indicated by KEGG pathways and converted the gene interactions into kernels, which were used to evaluate the predictive ability of the corresponding pathways. We tested the new models in a mouse population, a yellow chicken population, the DGRP population, an Arabidopsis population, and a rice breeding population. Based on the results from these datasets, we draw the following conclusions.

1) We proposed a novel strategy to incorporate genome annotation into genomic prediction by defining haploblocks according to gene positions. Results show that genome-annotation-based models outperform naive genomic best linear unbiased prediction (GBLUP) models for most of the traits. Compared to SNP-based categorical models, using genome annotation to define haplotypes as predictors leads to a higher, or at least comparable, predictive ability in many instances. The advantages of haplotype-based models over SNP models are shown by restricting haploblocks to annotated functional genomic units. Additionally modeling gene interaction effects improves prediction further. The performance of the new models is influenced by genome annotation quality, marker density, and linkage disequilibrium (LD) decay.

2) Genic haplotype allele similarity was measured through different approaches and converted into individual similarity, which was then treated as a special kernel and used in kernel-based genomic selection models. Results show that the new method improves predictive ability for some traits, especially in the Arabidopsis dataset and the rice breeding population. For several traits, the newly proposed strategy shows better predictive ability than SNP models and haplotype models without gene annotation, but more studies are needed to investigate the characteristics of this model.

3) By mapping SNPs to protein-coding genes included in KEGG pathways and constructing Gaussian kernels from those pathway SNPs, gene interactions are restricted to the interactions indicated by the KEGG pathways. The predictive ability of the Gaussian kernels built on these reduced interactions was evaluated in several populations. Results show that genomic predictive ability can be improved by incorporating KEGG pathways into GS models. However, the ways in which genes interact with each other, and the methods for translating these interactions into genomic selection models, need more careful investigation in order to gain further improvement.
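The pathway-kernel idea in point 3 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the toy positions and genotypes, and the bandwidth scaling (squared Euclidean distance divided by the number of selected SNPs) are all assumptions made for illustration. SNPs are first mapped to gene intervals by position, and a Gaussian kernel between individuals is then built from the selected SNPs only.

```python
import math

def snps_in_genes(snp_positions, gene_intervals):
    """Return indices of SNPs whose position lies inside any (start, end) gene interval."""
    return [
        i for i, pos in enumerate(snp_positions)
        if any(start <= pos <= end for start, end in gene_intervals)
    ]

def gaussian_kernel(genotypes, snp_idx, bandwidth=1.0):
    """Build K[i][j] = exp(-bandwidth * d2 / m) over the selected SNPs.

    d2 is the squared Euclidean distance between two individuals' genotype
    codes (0/1/2) and m is the number of selected SNPs. This scaling is an
    assumed convention, not one taken from the source.
    """
    m = len(snp_idx)
    if m == 0:
        raise ValueError("no SNPs selected")
    n = len(genotypes)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d2 = sum((genotypes[i][k] - genotypes[j][k]) ** 2 for k in snp_idx)
            K[i][j] = K[j][i] = math.exp(-bandwidth * d2 / m)
    return K

# Hypothetical toy data: five SNPs on one chromosome, two pathway genes.
snp_positions = [100, 250, 400, 900, 1500]
pathway_genes = [(200, 450), (1400, 1600)]
genotypes = [
    [0, 1, 2, 0, 1],
    [0, 1, 2, 2, 1],
    [2, 0, 0, 1, 2],
]

idx = snps_in_genes(snp_positions, pathway_genes)
K = gaussian_kernel(genotypes, idx)
```

In this toy example the first two individuals differ only at a SNP outside the pathway genes, so the pathway kernel treats them as identical. A kernel of this kind could then be supplied as the covariance structure in a kernel-based prediction model, with one kernel per pathway.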
Keywords/Search Tags:genomic selection, biological knowledge, genome annotation, KEGG pathways, haplotype models