
Estimation of the genetic relatedness of two individuals using genotype data and its application towards large, population-based datasets

Posted on: 2014-03-07
Degree: Ph.D
Type: Dissertation
University: The Johns Hopkins University
Candidate: Stevens, Eric L
Full Text: PDF
GTID: 1458390005488652
Subject: Biology
Abstract/Summary:
Background: Single nucleotide polymorphisms (SNPs) and other chromosomal variants have been used to estimate the genetic relatedness of a pair of individuals. Past methods have relied on allele frequencies, hidden Markov models (HMMs), or haplotype frequencies to obtain probabilities of regions shared identical-by-descent (IBD). We hypothesized that SNPs could be used to estimate the percentage of the genome shared IBD between any two individuals without explicitly modeling allele frequencies.

Method: We took a published method (Lee W. 2003), designed to distinguish related from unrelated pairs of individuals using a genome-wide ratio based on informative identity-by-state (IBS) comparisons [IBS2* / (IBS2* + IBS0)], and modified it to calculate similar ratios in a sliding window along each chromosome. The resulting freely available program, kcoeff, estimates the percentage of the genome shared IBD0 (k0), IBD1 (k1), and IBD2 (k2) [Cotterman coefficients of relatedness] between any two individuals from any geographic population. The degree and type of relationship can be inferred manually according to suggested guidelines and additional pedigree reconstruction methods that we explain in detail.

Results and Conclusions: IBD estimates generated by kcoeff performed well against those generated by other software and, more importantly, centered around the expected Cotterman coefficients of relatedness for regular (e.g. non-inbred) relationships. We were able to report unannotated relationships within several large populations whose relationships were considered known. Additionally, atypical relationships involving bilineal relatedness (e.g. double-first cousins) and consanguinity (e.g. inbreeding) could be identified based on atypical IBD values and homozygosity analysis. Finally, kcoeff and pedigree reconstruction methods are important for confirming relationships in clinical pedigrees, in which the researcher seeks linkage to one or more loci to explain a disease, and for better annotating datasets from which related individuals should be removed to reduce bias.

Overall, kcoeff provides a way to estimate how related two individuals are. Used in conjunction with pedigree reconstruction methods, it can reconstruct regular relationships or identify the presence of atypical relatedness (and its meaning).
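As an illustration of the ratio named in the Method paragraph, the following is a minimal Python sketch, not the kcoeff implementation. It assumes biallelic SNP genotypes coded as alternate-allele counts (0, 1, 2, with -1 for missing), takes IBS2* to mean sites where both individuals are heterozygous, and takes IBS0 to mean sites where they are opposite homozygotes; the abstract does not spell out these definitions. The function names, the genotype coding, and the window and step parameters are all hypothetical. How kcoeff converts windowed ratios into k0, k1, and k2 is not described in the abstract and is not attempted here.

```python
from typing import List, Optional, Sequence, Tuple

MISSING = -1  # assumed code for a missing genotype call


def ibs_counts(g1: Sequence[int], g2: Sequence[int]) -> Tuple[int, int]:
    """Count informative IBS sites for one pair: (IBS2*, IBS0)."""
    ibs2_star = 0
    ibs0 = 0
    for a, b in zip(g1, g2):
        if a == MISSING or b == MISSING:
            continue  # skip sites where either call is missing
        if a == 1 and b == 1:
            ibs2_star += 1  # both heterozygous
        elif abs(a - b) == 2:
            ibs0 += 1  # opposite homozygotes (0 vs 2)
    return ibs2_star, ibs0


def ibs_ratio(g1: Sequence[int], g2: Sequence[int]) -> Optional[float]:
    """IBS2* / (IBS2* + IBS0), or None when no site is informative."""
    ibs2_star, ibs0 = ibs_counts(g1, g2)
    total = ibs2_star + ibs0
    return ibs2_star / total if total else None


def sliding_ratios(
    g1: Sequence[int], g2: Sequence[int], window: int, step: int
) -> List[Optional[float]]:
    """Ratio per window of `window` consecutive SNPs, advancing by `step`."""
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have equal length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(len(g1) - window, 0)
    return [
        ibs_ratio(g1[s:s + window], g2[s:s + window])
        for s in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Toy genotypes for one chromosome, in SNP order (illustrative only).
    person_a = [1, 0, 2, 1, 1, 0]
    person_b = [1, 2, 0, 1, 2, 0]
    print(ibs_ratio(person_a, person_b))
    print(sliding_ratios(person_a, person_b, window=3, step=3))
```

Under these definitions and Hardy-Weinberg proportions, the expected ratio for an unrelated pair is 2/3 at any allele frequency, because both-heterozygous sites occur with probability 4p²q² and opposite-homozygous sites with probability 2p²q². This is why the ratio can be used without modeling allele frequencies explicitly.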
Keywords/Search Tags:Relatedness, Two individuals, Pedigree reconstruction methods, Relationships, IBD