Developing Methods And Software For Genetic Analysis And Their Applications

Posted on: 2013-08-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z H Zhu
GTID: 1220330395993617
Subject: Crop Science
Abstract/Summary:
Most agronomically important traits of crops are quantitative in nature, and their genetic variation is usually controlled by a set of genes called quantitative trait loci (QTLs). Gene-by-gene and gene-by-environment interactions also exist. Because the genetic architecture of these traits is so complex, trait variation is due not only to individual QTLs and their interactions, but also to the network of SNPs, RNAs, proteins, and metabolites. To study the mechanisms of complex traits, plant breeders and researchers adopt more complicated experimental designs or natural populations, and collect series of correlated phenotype data. In addition, advanced high-throughput biological technologies make it convenient to acquire large-scale biological data. For instance, SNPs (Single Nucleotide Polymorphisms) have become widely used in crop science in place of conventional molecular markers such as SSRs (Simple Sequence Repeats), AFLPs (Amplified Fragment Length Polymorphisms), and RFLPs (Restriction Fragment Length Polymorphisms). Similarly, gene-expression microarrays have been combined with other experimental approaches to uncover the genetic mechanisms of complex traits. These developments bring challenges to biostatistical methods.

We have developed new statistical methodologies and theories to address issues that currently exist in genetics and plant breeding. We also investigate the efficiency and effectiveness of the proposed statistical methods by Monte Carlo simulations and real data analysis. The main contents of the dissertation are as follows.

1. The first chapter introduces the current problems in the relevant statistical methods. The main motivation of the dissertation is to provide solutions for genetic analysis. We then briefly introduce several recent statistical methods for hypothesis testing and genetic effect estimation.

2. The second chapter extends conventional linkage analysis in light of the challenges posed by recent applications. The chapter is divided into two sections.

The first section introduces statistical methods for multi-trait mapping of quantitative trait loci (QTLs). Most approaches can only map QTLs separately, using a phenotype-driven method that focuses on an individual trait. However, complex trait data usually contain observations on multiple traits, in multiple environments or under different treatment conditions. Several problems therefore arise, for instance determining whether multiple traits are affected by a single QTL with pleiotropic effects or by multiple closely linked QTLs. We propose a multivariate statistical model for multiple-trait analysis, using the Wilks' Lambda statistic to test individual loci and epistatic interactions. Monte Carlo simulations and real data analysis are conducted to demonstrate the applicability and power of the methods.

The second section introduces a statistical strategy for mapping triplet interactions. Epistasis, the interaction between loci or genes, means that the effect of a specific genotype combination on the phenotype depends on the genetic background. Many studies provide evidence for the significance of epistasis, and epistasis has been reported to be a potential key driver of missing heritability. Identifying interactions between genes can improve genetic prediction, which contributes to disease-risk classification and plant breeding. At present, however, the main statistical methods for searching for QTLs cannot detect triplet interactions. We therefore propose a mapping strategy, based on the mixed linear model approach, to search for high-order interactions. Monte Carlo simulations are performed to investigate the effectiveness and efficiency of the proposed methods.

3. The third chapter introduces newly proposed statistical methods for genome-wide association studies (GWAS) to analyze high-throughput biological data. The chapter is also divided into two sections.

The first section introduces association analysis of quantitative trait SNPs (QTS). Because high-throughput genotyping technologies allow groups of loci to be compared simultaneously and SNPs are densely distributed, SNP markers are widely used in biomedical, plant, and animal research. Meanwhile, association analysis has become a common tool for handling large-scale data. Because existing statistical models are inadequate, we propose an association analysis of SNPs based on the mixed linear model approach. It can detect individual loci and epistatically interacting loci, and estimate their main effects and loci-by-environment interaction effects. We also perform Monte Carlo simulations and real data analysis to demonstrate the applicability and efficiency of the methods.

The second section introduces association analysis of quantitative trait transcripts (QTT), quantitative trait proteins (QTP), and quantitative trait metabolites (QTM). Gene-expression, protein, and metabolite profiles have been applied to a wide range of biological problems, serving as tools for biomarker identification, disease diagnosis, drug screening, and plant breeding. These profiles have also recently been combined with other experimental approaches to identify the key mechanisms of complex traits. In contrast to expression quantitative trait loci (eQTL) and protein quantitative trait loci (pQTL) mapping, we propose association analysis of transcripts, proteins, and metabolites, which treats the profile data as a type of marker and studies their association with an organismal quantitative trait. A series of Monte Carlo simulations is conducted to investigate the effectiveness and efficiency of the statistical approach.

In the real case study, we integrate QTL linkage analysis with QTS and QTT association analysis into a mapping system to elucidate the potential drivers of complex traits. It is also an attempt to investigate the genetic mechanisms of complex traits in detail, based on a series of high-throughput data such as SNPs, CNVs (copy number variations), gene-expression microarrays, protein profiles, and metabolite profiles.

4. The fourth chapter introduces QTXNetwork, newly developed software that focuses mainly on statistical analysis. The software package is written in C++ to map QTL, QTS, QTT, QTP, and QTM for complex traits, and it can handle data from multiple-environment traits (METs). It can perform 1) causal loci detection, including individual, two-way, and three-way loci; 2) effect and heritability estimation for significant loci, including individual, two-way, and three-way effects, as well as interaction effects and heritability between the loci and environments; 3) superior genotype effect prediction, predicting the best genetic effect of an individual in a specific environment based on known loci genotypes; 4) individual genotype effect prediction, predicting the genetic effects of every individual in a specific environment based on the loci detected, and listing the top and bottom individuals.
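As a rough illustration of the Wilks' Lambda test mentioned for multi-trait mapping, the following minimal Python sketch computes Lambda = det(E) / det(E + H) for two traits, with individuals grouped by their genotype at a single locus. Here E is the within-genotype and E + H the total sum-of-squares-and-cross-products matrix. The function names, the restriction to two traits, and the toy data are assumptions made for illustration. This is the textbook one-way multivariate statistic, not the dissertation's mixed-model procedure or the QTXNetwork implementation.

```python
def mean(rows):
    """Mean vector of a list of (trait1, trait2) observations."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def sscp(rows, center):
    """2x2 sum-of-squares-and-cross-products matrix of rows about center."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for y in rows:
        d = (y[0] - center[0], y[1] - center[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(groups):
    """Wilks' Lambda for two traits.

    groups maps a genotype label to a list of (trait1, trait2) tuples.
    Values near 0 suggest the genotype groups differ in their trait means;
    a value of 1 means the group means coincide.
    """
    all_rows = [r for rows in groups.values() for r in rows]
    total = sscp(all_rows, mean(all_rows))      # T = E + H
    within = [[0.0, 0.0], [0.0, 0.0]]           # E, pooled over genotypes
    for rows in groups.values():
        g = sscp(rows, mean(rows))
        for i in range(2):
            for j in range(2):
                within[i][j] += g[i][j]
    return det2(within) / det2(total)


# Hypothetical toy data: two genotype classes, two traits per individual.
toy = {
    "AA": [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    "aa": [(6.0, 7.0), (7.0, 6.0), (8.0, 8.0)],
}
print(wilks_lambda(toy))
```

In practice the statistic would be converted to an approximate F or chi-square value to obtain a p-value, and a general implementation would use a linear algebra library to support more than two traits.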
Keywords/Search Tags: complex traits, association analysis, linkage analysis, mixed linear model approach, individual loci, epistasis, loci by environment interaction, Monte Carlo simulation