
Genome-wide Screening Of Multi-allelic SNPs For Forensic Individual Identification And Development Of NGS-SNP Panel

Posted on: 2021-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: L N Bu
GTID: 2404330614468651
Subject: Forensic medicine
Abstract/Summary:
Objective: Multi-allelic single nucleotide polymorphisms (multi-allelic SNPs) differ from bi-allelic SNPs in having three or more alleles. There are more than 3 million SNPs in the human genome; most are bi-allelic, and multi-allelic SNPs account for only a small fraction. Multi-allelic SNPs have several characteristics that make them well suited as markers for forensic human identification. Like bi-allelic SNPs, they can be typed from highly degraded DNA because the amplified fragments are short, and they can provide additional information such as phenotype and ancestry. Their genetic polymorphism is higher than that of bi-allelic SNPs, which makes them more valuable for personal identification, especially in mixed samples. In addition, SNPs have lower mutation rates than short tandem repeats (STRs), which is an advantage in kinship analysis, particularly for distant relationships. Here, we screened multi-allelic SNPs suitable for the Chinese Han population, constructed a multiplex panel using next generation sequencing (NGS), and evaluated its forensic value in terms of accuracy, repeatability, sensitivity and forensic genetic parameters, in order to provide new genetic markers and technical strategies for forensic individual identification and kinship analysis.

Methods:
1. Selection of multi-allelic SNP loci: We selected tri-allelic and tetra-allelic SNPs in the Han Chinese in Beijing (CHB) and Southern Han Chinese (CHS) populations from the 1000 Genomes Project Phase III database. Loci were selected according to the following criteria: 1) minor allele frequency (MAF) > 0.05; 2) heterozygosity (Het) > 0.65; 3) in Hardy-Weinberg equilibrium; 4) unrelated to disease; 5) a distance of more than 5 Mb between adjacent loci. A total of 93 multi-allelic SNP loci met these criteria and were further tested by pyrosequencing in a pooled DNA sample from 100 Han Chinese individuals from Hebei Province, to verify that the loci are truly multi-allelic in the Han population. A total of 66 multi-allelic SNPs were eventually selected.
2. Construction of the NGS-SNP typing system: The NGS-SNP typing reagents were synthesized on the basis of molecular barcoding and single-end specific primer extension to construct a panel for the 66 selected multi-allelic SNPs. The starting DNA template amount was 20 ng. After library construction, fragment quality and concentration were measured. The libraries were then pooled, diluted and denatured. Finally, sequencing was performed in Research Use Only (RUO) run mode on the MiSeq FGx™ platform. The raw data were processed with the program provided on QIAGEN's official website (http://www.qiagen.com) to obtain the variant site information and the corresponding BAM files of the sequencing data.
3. Evaluation of the forensic application value of the NGS-SNP typing system: The accuracy, repeatability and sensitivity of the typing system were evaluated, and a population genetics survey was carried out. The population samples came from 64 unrelated Han individuals from Hebei.

Results:
1. Selection of multi-allelic SNPs: Using the selection criteria listed above, a total of 66 multi-allelic SNPs were screened. They are distributed across 21 autosomes, with no more than 5 SNP loci per chromosome.
2. Laboratory evaluation of the NGS-SNP typing system: An allele coverage depth of 20× was set as the threshold before the raw data were analyzed. The average sequencing depth of the 64 samples was 77955×. The mean depth of coverage (DoC) per locus was 1191×; %Allele and %Noise were 93.20% and 6.80%, respectively. The mean heterozygote balance (Hb) of the 66 loci was 0.726.
3. Forensic application evaluation of the NGS-SNP typing system:
(1) Accuracy: Pyrosequencing and Sanger sequencing were used to verify the NGS genotypes of the 2800M positive control sample. Eighteen of the 66 SNP loci could not be verified by either method because primer design failed, so a total of 48 multi-allelic SNPs were checked by pyrosequencing or Sanger sequencing. The NGS results were completely consistent with the pyrosequencing and Sanger sequencing results.
(2) Repeatability: Three libraries prepared from the same starting amount of 2800M Control DNA were sequenced simultaneously on the MiSeq FGx™. The genotypes of the three libraries were consistent except at rs72845206 and rs71277146. Statistical tests showed no significant differences among the libraries in %Allele (P=0.137) or Hb (P=0.647).
(3) Sensitivity: A series of input DNA template amounts (10 ng, 5 ng, 2 ng, 1 ng, 0.5 ng, 0.25 ng) was tested. As the initial template amount decreased, the heterozygote balance decreased and the coefficient of variation increased. Allele drop-in and drop-out occurred at starting template amounts of 2 ng or less. We therefore recommend a template amount of more than 2 ng; 5 ng to 10 ng is probably the optimal initial template amount.
(4) Population genetic parameters: Population genetic parameters were analyzed in 64 unrelated Han individuals from Hebei. Four loci that performed poorly in the forensic evaluation were excluded. Of the remaining 62 loci, 56 showed no deviation from Hardy-Weinberg equilibrium, while 6 loci were not in Hardy-Weinberg equilibrium even after Bonferroni correction (P < 0.05/62 = 0.0008065). All 62 loci were in linkage equilibrium. The matching probability of the 56 loci was 1.05 × 10^-35, the CPED value was 0.999986, and the CPET value was 0.9999999995.

Conclusions: In this study, we selected 66 multi-allelic SNPs from the human genome to construct an NGS-SNP typing system. With rs201255836, rs71277146, rs72845206 and rs648431 excluded, the 56 multi-allelic SNPs retained in the NGS-SNP panel performed well, with relatively high accuracy, repeatability and sensitivity. In the Hebei Han population, the matching probability of the 56 loci is 1.05 × 10^-35, the CPED value is 0.999986 and the CPET value is 0.9999999995, which is sufficient for forensic individual identification and paternity testing.
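
The quantities named in the abstract (the MAF > 0.05 and Het > 0.65 screening thresholds, the Bonferroni-corrected significance level 0.05/62, and the matching probability combined over loci) can be illustrated with a minimal Python sketch. It is not the thesis's pipeline, which the abstract does not describe in this detail. The function names and example allele frequencies are hypothetical. The sketch assumes Hardy-Weinberg equilibrium, independent loci, and that "MAF" for a multi-allelic locus means the frequency of its rarest allele. The matching probability uses the standard sum of squared genotype frequencies.

```python
from itertools import combinations
from math import prod


def expected_heterozygosity(freqs):
    """Expected heterozygosity under HWE: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def match_probability(freqs):
    """Probability that two unrelated people share a genotype at one locus,
    i.e. the sum of squared genotype frequencies (HWE assumed)."""
    homozygotes = sum((p * p) ** 2 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def passes_screen(freqs, maf_min=0.05, het_min=0.65):
    """Apply the two frequency-based criteria from the abstract.
    Assumption: MAF is read as the frequency of the rarest allele."""
    return min(freqs) > maf_min and expected_heterozygosity(freqs) > het_min


def combined_match_probability(loci):
    """Product of per-locus matching probabilities (loci assumed independent)."""
    return prod(match_probability(freqs) for freqs in loci)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical allele frequencies, for illustration only.
    tri_allelic = [0.40, 0.35, 0.25]
    tetra_allelic = [0.25, 0.25, 0.25, 0.25]

    for name, freqs in [("tri", tri_allelic), ("tetra", tetra_allelic)]:
        print(name, expected_heterozygosity(freqs), match_probability(freqs),
              passes_screen(freqs))

    print(combined_match_probability([tri_allelic, tetra_allelic]))
    print(bonferroni_threshold(0.05, 62))  # the 0.05/62 threshold in the abstract
```

Because the per-locus values multiply, a panel of several dozen loci each with a matching probability near 0.1 or 0.2 reaches very small combined values, which is the kind of magnitude reported for the 56 loci.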
Keywords/Search Tags:Next generation sequencing, Multi-allelic SNP, Individual identification, Kinship identification