| Background and Objective: A single nucleotide polymorphism (SNP) is a DNA sequence polymorphism caused by a single-nucleotide variant in the genome, and SNPs offer many advantages in forensic ancestry inference. A number of ancestry-informative SNP (AI-SNP) panels have been successfully developed, most of which reuse AI-SNP loci that perform well in classical panels; however, new potential loci remain to be explored. Furthermore, there is a practical need for AI-SNP sets with high discriminatory power for ancestry inference in both inter- and intra-continental populations. In addition, as a member of the Altaic-speaking groups, the Manchu population has developed and grown through a long period of gene exchange and integration with other ethnic groups, resulting in a genetic background with distinctive ethnic characteristics. However, few studies have been reported on the genetic background of the Manchu group in the Inner Mongolia Autonomous Region, China. To this end, this research aims to screen a new set of AI-SNP loci and assess their efficacy for ancestry inference in African, European, Central/South Asian and East Asian populations, to further validate the selected loci using samples collected from the Inner Mongolian Manchu (IMM) population, and to systematically explore the genetic characteristics of the IMM population. Methods and Contents: The AI-SNPs were screened genome-wide using the four continental reference populations of Africa, Central/South Asia, East Asia and Europe from the 1000 Genomes Project Phase 3 and the Human Genome Diversity Panel (HGDP-CEPH). Three machine learning models (multinomial logistic regression, random forest and support vector machine), together with population analyses, were used to objectively assess the ancestry inference efficacy of the selected loci. Subsequently, a batch of IMM samples was genotyped using next-generation sequencing to assess how efficiently the selected SNPs could be detected in the actual
samples and to further validate the efficacy of ancestry inference. Finally, systematic population genetic analyses were performed on the IMM population based on 79 reference populations from seven continental regions in the two databases mentioned above, to provide insight into the genetic characteristics of the IMM population. Results and Conclusion: In this study, 126 AI-SNPs were ultimately selected, none of which overlapped with loci in the classical AI-SNP panels. The results of the machine learning models and population analyses showed that this set of loci was able to identify genetic differences between African, East Asian, European and Central/South Asian populations. These loci can not only infer the ancestral origins of the three major intercontinental populations (i.e., African, European and East Asian populations), but also further distinguish between Central/South Asian and East Asian populations. A comparison of the three machine learning models showed that the support vector machine model performed better in classification and prediction than the other two models in this study. The sequencing results and validation analysis of the IMM samples indicated that the 126 AI-SNPs could be reliably detected and analyzed in actual samples and performed well in ancestry inference, and could therefore provide information for forensic ancestry inference. Systematic population genetic analyses based on 79 reference populations from Africa, the Americas, Central/South Asia, East Asia, Europe, the Middle East and Oceania showed that the IMM population collected for this study had the typical genetic characteristics of East Asian populations. In addition, the IMM population was genetically more closely related to the northern Han Chinese and Japanese than to other Altaic-speaking populations. Overall, this study provides a selection of promising new loci for ancestry inference of major intercontinental populations and intracontinental subgroups, as well as genetic insights and valuable data for
exploring the genetic background of the IMM population. |
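
The abstract does not state which statistic was used to screen candidate loci genome-wide. As a minimal sketch of how such a screening step might look, the example below ranks biallelic SNPs by Rosenberg's informativeness for assignment (I_n), one commonly used measure of how well a locus separates populations. The function names `informativeness` and `rank_snps`, the SNP identifiers and all allele frequencies are hypothetical and serve only as illustration; they are not the study's method or data.

```python
import math


def informativeness(freqs):
    """Informativeness for assignment (I_n) of one biallelic SNP.

    freqs: frequency of one allele in each of K reference populations.
    Assumption: this statistic is used here for illustration only; the
    abstract does not say which screening criterion the study applied.
    """
    k = len(freqs)
    total = 0.0
    # Loop over both alleles: the given allele and its complement.
    for allele_freqs in (freqs, [1.0 - p for p in freqs]):
        mean_p = sum(allele_freqs) / k
        if mean_p > 0:
            total -= mean_p * math.log(mean_p)
        # Terms with p == 0 contribute nothing (limit of p*log(p) is 0).
        total += sum(p * math.log(p) for p in allele_freqs if p > 0) / k
    return total


def rank_snps(freq_table, top_n):
    """Return the top_n SNP ids ordered by decreasing I_n."""
    scored = sorted(
        freq_table.items(),
        key=lambda item: informativeness(item[1]),
        reverse=True,
    )
    return [snp_id for snp_id, _ in scored[:top_n]]


# Made-up allele frequencies in four hypothetical reference populations.
toy = {
    "snp_a": [0.95, 0.05, 0.50, 0.10],
    "snp_b": [0.50, 0.50, 0.50, 0.50],  # identical everywhere: uninformative
    "snp_c": [1.00, 0.00, 0.00, 1.00],  # fixed differences: highly informative
}
ranked = rank_snps(toy, 2)
```

In this toy table, a locus with the same frequency in every population scores zero, while loci with large frequency differences between populations score higher and are retained first.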