
A Preliminary Study On Screening Suitable Forensic Y-STR Locus Combination For The Target Population Based On Shannon's Equivocation

Posted on: 2021-05-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y S Zhou
Full Text: PDF
GTID: 1364330605458338
Subject: Forensic medicine
Abstract/Summary:
Background and Objective

Over nearly three decades, Y-chromosome short tandem repeat (Y-STR) typing has matured into an indispensable auxiliary method in criminal investigation, supplementing the autosomal STR genotyping that is widely used in forensic DNA analysis. In recent years, the combination of Y-STR markers with capillary electrophoresis (CE) detection has helped solve many cold cases and extremely serious crimes. As a result, Y-STR markers have received increasing attention from forensic scientists and front-line police officers. Public security bureaus throughout China successively launched Y-STR database construction projects in 2017. China may now have the largest Y-STR database in the world, holding more than 30 million haplotypes. Y-STR typing is widely applied and has unique value in assisting criminal investigation. However, fundamental research on Y-STR markers in China has lagged far behind the pace of database construction, and its depth falls well short of the breadth of database use in casework. Consequently, many problems have emerged in both database construction and practical application.

On the one hand, decisions about which Y-STR markers, and how many of them, are appropriate for database construction and casework have rarely rested on scientific evidence, and the allelic association between markers and the genetic differences between populations have almost never been considered. Marker selection and marker-set screening have generally relied on single-marker characteristics such as gene diversity (GD), number of alleles, or mutation rate. Even so, our understanding of these single-marker properties has not been entirely correct. For example, it is generally believed that a marker with a larger GD value is better, so markers with large GD values are usually selected preferentially. It is also taken for granted that more markers give better results, so as many markers as the system can accommodate are incorporated into a single multiplex. These assumptions have not been studied systematically. In particular, if markers with large GD values are combined arbitrarily, does the resulting system necessarily have the greatest discrimination capacity (DC)?

On the other hand, as the number of haplotypes in the Y-STR database grows, the number of matched reference samples returned when a crime-scene haplotype is searched increases accordingly. Some of the matched reference samples may be scattered across different local Y-STR databases, which increases both the workload and the complexity of the investigation. The most common response is to type additional Y-STR markers in order to exclude matched reference samples and narrow the search scope as far as possible. However, because the allelic association between these markers has not been adequately studied, the additional markers often fail to exclude matched samples or narrow the search effectively. In other words, when the additional markers are combined with those already in the database, the DC of the combined set does not increase significantly.

In addition to the number of loci, the DC of a Y-STR marker set is affected by the polymorphism of the loci, the genetic background of the tested population, and the allelic association between loci. A new screening method for Y-STR marker sets is therefore urgently needed, one that takes the allelic association between loci into account and reduces redundant information between loci as far as possible. Shannon's entropy, from information theory, appears to have this potential and can be applied to the selection of Y-STR loci. Shannon's entropy is a measure commonly used in information theory to assess the average unpredictability of a random variable. The concept is readily applied to describe the information content at a locus or in a genomic region by substituting states with the respective alleles at that locus or the respective haplotypes in that region.

In the present study, we therefore conducted a preliminary investigation of marker polymorphism, the allelic association between Y-STR markers, and the influence of genetic differences between populations on the markers. Furthermore, we screened suitable forensic Y-STR marker sets for target populations based on Shannon's equivocation. We hope that this study provides a reliable scientific basis for the application of Y-STR typing in forensic investigation and offers effective solutions to some of the practical problems encountered in that application.

Methods

1. We selected Y-STR markers from the published literature, designed PCR amplification primers using reference sequences in the GenBank® database as templates, and constructed a multiplex PCR amplification system, the Y-STR 34plex, based on six-dye fluorescence labeling and a CE platform.
2. We evaluated and validated the species specificity, accuracy, inhibitor tolerance, sensitivity, and performance with mixture samples of the Y-STR 34plex.
3. Blood samples were collected from 3182 unrelated healthy males: Yulin Han (n=229), Hunan Han (n=400), Hunan Miao (n=666), Hunan Yao (n=611), Hunan Dong (n=643), and Hunan Tujia (n=633). PCR amplification was performed on a GeneAmp® 9700 thermal cycler using the Y-STR 34plex and the AGCU Y SUPP multiplex system.
4. The GD value, entropy, and other forensic evaluation parameters were calculated for each Y-STR marker, and the allelic association between Y-STR markers was evaluated using the normalized entropy difference (NED).
5. We screened suitable marker sets for the six populations using three different marker selection approaches. The first approach was based on Shannon's equivocation; the second and third selected markers according to single-locus GD and single-locus entropy, respectively.
6. We compared the NED values and the resulting marker sets across the six populations, and compared the forensic parameters (DC, haplotype diversity (HD), match probability (MP), and fraction of unique haplotypes (FUH)) of the marker sets screened by the three approaches.

Conclusion

1. We constructed a novel multiplex PCR amplification system, the Y-STR 34plex, through systematically designed experiments, carefully adjusting the primers and repeatedly optimizing the concentration of each component in the multiplex. The Y-STR 34plex simultaneously amplifies 34 Y-STR markers in a single PCR reaction, including 31 single-copy markers (DYS533, DYS596, DYS518, DYS393, DYS448, Y_GATA_H4, DYS444, DYS481, DYS439, DYS389 I, DYS438, DYS570, DYS456, DYS458, DYS392, DYS645, DYS390, DYS447, DYS460, DYS627, DYS576, DYS449, DYS593, DYS635, DYS389 II, DYS557, DYS549, DYS19, DYS643, DYS437, and DYS391) and 3 multi-copy markers (DYF387S1 a/b, DYS527 a/b, and DYS385 a/b). The validation study demonstrated that the Y-STR 34plex performed well in species specificity, accuracy, inhibitor tolerance, sensitivity, and mixture analysis.
2. The forensic parameters (GD, entropy, etc.) of the 46 Y-STR markers, as well as the NED values, differed significantly among the six populations.
3. We introduced a novel marker selection procedure based on Shannon's equivocation to screen suitable marker sets for forensic investigation. Both the priority of the selected markers and the composition of the resulting marker sets differed significantly between populations.
4. For an equal number of markers, the marker sets selected by Shannon's equivocation outperformed those selected by single-locus GD or single-locus entropy when measured by parameters such as joint entropy and DC.
5. Pursuing an extremely high DC for a Y-STR marker set is not a wise goal. Moreover, relying only on single-locus polymorphism, or increasing the number of markers without restraint, is not an ideal way to obtain a marker set with the best discrimination at minimal cost.
6. To screen the most suitable marker set for a target population, the pool of candidate markers should be as broad as possible.
7. When screening a Y-STR marker set for a target population, the genetic characteristics of the population, the population size, and other influencing factors should be taken into account. Only by jointly considering the polymorphism of the candidate markers and the allelic association between them can a suitable marker set be screened for the target population.
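
The contrast between selecting markers by single-locus GD and selecting them by Shannon's equivocation can be illustrated with a small sketch. The abstract does not give the dissertation's exact procedure, so the code below is only one plausible reading: a greedy forward selection that, at each step, adds the marker leaving the least remaining uncertainty about the full haplotype, H(all markers | chosen markers) = H(all markers) - H(chosen markers). The function names, the use of Nei's unbiased estimator for GD, and the three toy markers "A", "B" and "C" are assumptions made for illustration, not data or code from the study.

```python
from collections import Counter
from math import log2


def entropy(states):
    """Shannon entropy (in bits) of the empirical distribution of `states`."""
    n = len(states)
    return -sum((c / n) * log2(c / n) for c in Counter(states).values())


def gene_diversity(alleles):
    """Gene diversity, assuming Nei's unbiased estimator n/(n-1) * (1 - sum p_i^2)."""
    n = len(alleles)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in Counter(alleles).values()))


def joint_entropy(haplotypes, markers):
    """Entropy of the haplotypes restricted to the given markers."""
    return entropy([tuple(h[m] for m in markers) for h in haplotypes])


def discrimination_capacity(haplotypes, markers):
    """Number of distinct haplotypes divided by the number of samples."""
    return len({tuple(h[m] for m in markers) for h in haplotypes}) / len(haplotypes)


def select_by_equivocation(haplotypes, candidates, k):
    """Greedy forward selection of k markers minimising the equivocation
    H(all candidates | chosen) = H(all candidates) - H(chosen)."""
    full = joint_entropy(haplotypes, candidates)
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < k:
        best = min(
            remaining,
            key=lambda m: full - joint_entropy(haplotypes, chosen + [m]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy data: A and B are each highly polymorphic but perfectly
# associated with one another; C is less polymorphic but independent of A.
A = [1, 1, 2, 2, 3, 3, 4, 4]
B = [5, 5, 6, 6, 7, 7, 8, 8]
C = [0, 1, 0, 1, 0, 1, 0, 1]
haplotypes = [{"A": a, "B": b, "C": c} for a, b, c in zip(A, B, C)]
candidates = ["A", "B", "C"]

# Rank by single-locus GD and keep the top two markers.
by_gd = sorted(
    candidates,
    key=lambda m: gene_diversity([h[m] for h in haplotypes]),
    reverse=True,
)[:2]
by_equivocation = select_by_equivocation(haplotypes, candidates, 2)

print("GD-based set:", by_gd, discrimination_capacity(haplotypes, by_gd))
print(
    "Equivocation-based set:",
    by_equivocation,
    discrimination_capacity(haplotypes, by_equivocation),
)
```

In this toy example the two highest-GD markers carry the same information, so pairing them adds nothing, whereas the equivocation-based choice pairs one of them with the less polymorphic but independent marker. Real Y-STR data would of course show partial rather than perfect association.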
Keywords/Search Tags: Y-STR, multiplex PCR amplification system, Shannon's equivocation, allelic association, forensic DNA analysis