| Medicine has gradually entered the era of precision medicine. It is therefore necessary to analyze genetic differences between individuals in order to diagnose patients accurately at the molecular level and to provide a detailed reference for clinical treatment and appropriate medication. In recent years, the study of single nucleotide polymorphisms (SNPs) has provided insights for disease warning, genetic counseling, early diagnosis, prognosis evaluation and drug selection, but the SNPs discovered so far explain only a small part of disease heritability. To further study the impact of nucleotide variants on disease and inheritance, more effort should be devoted to other types of variants. Multi-nucleotide variants (MNVs) are two or more adjacent nucleotide variants that coexist on the same haplotype in an individual, and their functional effect may differ completely from that of the individual variants that compose them. So far, research on MNVs has been very limited, and traditional genetic variant annotation tools often cannot annotate MNVs accurately. The development of software for batch annotation of MNVs is therefore of great significance for advancing the MNV field. In this paper, we extensively collected MNV data and various functional element regions of the genome, designed algorithms, and developed MNVAnno, a bioinformatics software tool for the functional annotation of MNVs in different genomic regions. The software has three functional annotation modules: annotation based on coding gene regions, annotation based on non-coding gene regions, and annotation based on regulatory regions. Coding gene region annotation analyzes the effect of MNVs on the amino acid sequence and on alternative splicing; non-coding gene region annotation identifies MNVs located on non-coding RNAs (such as miRNA, lncRNA and snoRNA) and predicts the functional effects of MNVs on ncRNAs; regulatory region annotation identifies MNVs located on regulatory elements (such as transcription factor binding 
sites and enhancers) and predicts the effects of MNVs on some regulatory elements. Users can submit MNV files and select different parameters to complete the functional annotation. After developing the software, we applied MNVAnno to the 6,261,326 MNVs identified by Wang et al. from gnomAD genome and exome data, annotating them in coding, non-coding and regulatory regions. We found that 41.88% of MNVs were located in coding gene regions; among these, 8,505 MNVs were located in the same codon and resulted in missense variants, 218 were located in the same codon and caused stop gain, 14 were located in the same codon and caused stop loss, and 12 were located in the same codon and caused start loss. Searching the GWAS database for the same-codon MNVs that produced missense variants, we confirmed that 248 single nucleotide variants (SNVs) were located on these MNVs. In addition, 58.86% of MNVs were located in non-coding RNA regions: 36.0603% in lncRNA regions, 17.7273% in circRNA regions, 5.0613% in piRNA regions, 0.0075% in tRNA regions, 0.0042% in miRNA regions, and 0.0029% in snoRNA regions. For the MNVs annotated to lncRNA and circRNA regions, we found that 757 and 34 SNVs reported in the GWAS database, respectively, were located on these MNVs. Finally, 37.63% of MNVs were located in regulatory regions: 32.4076% in chromatin accessibility regions, 3.3334% in conserved genomic element regions, 1.2631% in transcription factor binding site regions, 0.6249% in enhancer regions, and 0.0047% in miRNA binding site regions. MNVAnno will provide other researchers with an important research tool that helps advance the field of MNVs. |
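
The point that an MNV can act differently from the single variants that compose it can be illustrated with a small sketch. The code below is not MNVAnno's implementation; it is a minimal example that assumes the standard genetic code and uses hypothetical function names (`apply_variants`, `classify`, `annotate_codon_mnv`). It translates a reference codon after applying each variant alone and then both together, and compares the predicted consequences.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_variants(codon, variants):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in variants:
        bases[offset] = alt
    return "".join(bases)


def classify(ref_codon, alt_codon):
    """Classify a codon change by comparing the translated amino acids."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gain"
    if ref_aa == "*":
        return "stop_loss"
    return "missense"


def annotate_codon_mnv(ref_codon, variants):
    """Hypothetical helper: consequence of each variant alone and of the MNV."""
    individual = [
        classify(ref_codon, apply_variants(ref_codon, [v])) for v in variants
    ]
    combined = classify(ref_codon, apply_variants(ref_codon, variants))
    return {"individual": individual, "combined": combined}


if __name__ == "__main__":
    # Reference codon GCA (Ala) with G>T at offset 0 and C>A at offset 1.
    print(annotate_codon_mnv("GCA", [(0, "T"), (1, "A")]))
```

In this example each substitution alone would be annotated as missense (GCA to TCA, GCA to GAA), while the two together give TAA, a stop gain. An annotator that treats the variants independently would miss that consequence, which is the case same-codon MNV annotation is meant to catch.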