Background
Diabetic kidney disease (DKD) is one of the most common microvascular complications of type 2 diabetes mellitus (T2DM) and has become the major cause of end-stage renal disease in China. DKD is a complex disease induced by the joint action of genetic susceptibility factors and environmental factors, but its molecular pathogenesis remains incompletely understood. Genome-wide association studies (GWAS) have identified a series of single nucleotide polymorphisms (SNPs) related to the risk of DKD, but the functions of most of these susceptibility loci are unclear, and they cannot yet be linked to specific molecular roles in the genesis and development of DKD. Therefore, efficiently mining existing GWAS data and progressively determining the biological functions of pathogenic loci have become major challenges in research on the molecular pathogenesis of DKD. Functional SNPs are SNPs located in transcriptional regulatory regions (promoters and enhancers) or protein-coding regions that affect gene expression or protein function. The criteria for identifying functional sites include whether a SNP is located at a functional position in a gene (transcriptional regulatory regions, coding regions, etc.), whether it affects gene expression, and whether it affects the binding of transcription factors to DNA regulatory elements. It is therefore of great significance to apply bioinformatics tools to mine large GWAS database resources, integrate genomic SNP information with regulatory data, and optimize the screening of functional SNPs and their target genes, so as to efficiently identify pathogenic variants and clarify their roles and mechanisms in the genesis and development of DKD. This study obtained SNPs related to the risk of DKD from the Phenotype-Genotype Integrator (PheGenI) database, carried out multi-level functional annotation of these SNP sites
using bioinformatics databases, and selected the functional SNPs. In addition, the genotypes and allele frequencies of the functional SNPs, their associations with DKD risk under different genetic models, and the associations of SNP-SNP interactions with DKD risk and clinical phenotypes were analyzed in a case-control cohort from the northeastern Chinese population. Moreover, bioinformatics database analysis was combined with plasma protein measurements in clinical samples to compare the mRNA and plasma protein expression of the corresponding genes across genotypes of the SNPs significantly associated with DKD risk, so as to explore the potential molecular mechanisms by which these SNPs affect gene expression. The aim is to provide a scientific foundation and potential molecular targets for DKD prevention and control strategies at the molecular level, promote research on molecular etiology, and accelerate the clinical prevention and treatment of DKD as well as early translation.
Research methods
1. Optimized screening of functional SNPs related to the risk of type 2 DKD: (1) The PheGenI database was used to query SNPs related to the risk of type 2 DKD and to glomerular filtration rate. (2) The HaploReg and PheGenI databases were used to comprehensively evaluate whether these SNPs had eQTL information, whether they lay within transcription factor-binding motifs, whether they carried histone modification marks, and whether they were located at DNase I hypersensitive sites (DHS); regulatory SNPs were then selected. (3) The OmicsBean and STRING databases were used for functional annotation clustering (GO and KEGG analyses) and protein-protein interaction analysis of the eQTL target genes corresponding to these SNPs, so as to explore the functions of the target genes.
2. Correlation between functional SNPs and the risk of DKD in the Chinese population: (1) A case-control cohort of 498 participants from Northeast China was established, including 166 patients with DKD, 166 patients with T2DM, and 166
normal controls. Peripheral blood was collected, plasma was separated and stored at low temperature, and DNA was extracted from blood cells. This study was reviewed and approved by the Ethics Committee of our hospital. (2) Functional SNPs were genotyped, and allele frequencies were determined, in the DKD case-control cohort (n=498) using MassARRAY time-of-flight mass spectrometry. (3) SHEsis online software was used to analyze differences in genotype and allele frequency distributions at each SNP site between the case and control groups; logistic regression was used to analyze the relation between genotype and DKD risk under the dominant and recessive inheritance models. (4) PLINK software was used to analyze SNP-SNP interactions and their relation to DKD risk. The generalized multifactor dimensionality reduction (GMDR) method was applied to search for the optimal multi-SNP interaction model for predicting DKD.
3. Correlation between functional SNPs and clinical phenotypes of DKD: (1) Basic information, biochemical indexes and other clinical data of the enrolled participants were collected. (2) Differences in clinical data between the DKD and T2DM groups were tested with the chi-square test and t-test. Independent risk factors for DKD were screened by logistic regression analysis. (3) The correlation of different genotypes of functional SNPs with clinical phenotypes was analyzed by t-test. (4) Using the R language, a decision-tree model for predicting DKD risk was built from the SNP genotypes and clinical phenotype data of the T2DM and DKD groups.
4. Effects of functional SNPs significantly associated with the risk of DKD on the expression of the corresponding genes and their regulatory mechanisms: (1) Four functional SNPs (rs6420094, rs4453858, rs594074 and rs10952362) that were significantly
associated with the risk of DKD were analyzed using the Genotype-Tissue Expression (GTEx) database to assess their relationship with the mRNA expression levels of the corresponding genes in different human tissues. (2) Plasma levels of the target-gene proteins of these four SNPs, namely SLC34A1 (rs6420094), SUCLG2 (rs4453858), LY86 (rs594074) and NAPSA (rs10952362), were measured by enzyme-linked immunosorbent assay (ELISA). (3) Differences in plasma protein levels among the groups of the case-control cohort, and their correlation with the clinical phenotypes of DKD, were analyzed. (4) Differences in plasma protein levels among carriers of different SNP genotypes were analyzed to further determine the impact of these SNPs on gene expression. (5) The regulatory effect of rs6420094 on SLC34A1 gene expression was analyzed using the PERFECTOS-APE online software and the Gene Expression Omnibus (GEO) database.
Research results
1. Screening of functional SNPs related to the risk of type 2 DKD: A total of 238 SNPs related to DKD and 40 SNPs related to glomerular filtration rate were retrieved from the PheGenI database. Screening these 278 SNPs against multiple bioinformatics databases identified 34 SNPs with regulatory function, all of which had eQTL records; 32 of them were located within transcription factor-binding motifs. In addition, 33 SNPs showed varying degrees of enrichment for the histone modification marks H3K4me1, H3K4me3, H3K27ac and H3K9ac, and 23 SNPs were located within promoter or enhancer histone modification regions. Furthermore, 16 SNPs not only altered transcription factor motifs and carried histone modification marks, but were also located in DHS regions. The optimized screening of these
functional SNPs laid a critical foundation for investigating their biological roles in the genesis and development of DKD.
2. Case-control study of the correlation between functional SNPs and the risk of DKD: (1) Genotyping, allele frequency and genetic model analyses of 21 SNPs were successfully completed in the DKD case-control cohort (n=498). The G allele and the AG+GG genotype at rs6420094 of the SLC34A1 gene, and the AA genotype at rs4453858 of the SUCLG2 gene, were protective against DKD (all P < 0.05, OR < 1); the AA genotype at rs594074 of the LY86-AS1 gene and the CC genotype at rs10952362 in the intergenic region between LINC01003 and RPS20P19 were related to an increased risk of DKD. No other SNPs were found to be associated with the risk of developing DKD. (2) PLINK analysis revealed that 13 pairwise SNP interactions were related to the risk of DKD. Among them, 7 pairwise interactions (rs17319721 and rs6420094, rs17319721 and rs594074, rs1260326 and rs903552, rs2780902 and rs6503503, rs4453858 and rs304029, rs4453858 and rs4879670, rs13254600 and rs7975752) were significantly associated with a reduced risk of DKD (all P < 0.05, OR < 1), and 6 pairwise interactions (rs17319721 and rs6930576, rs4453858 and rs6432852, rs35716097 and rs955333, rs903552 and rs955333, rs6503503 and rs955333, rs12523822 and rs903552) were significantly associated with an increased risk of DKD (all P < 0.05, OR > 1). (3) GMDR analysis found that the best model related to DKD risk was rs6420094-rs1260326-rs903552-rs6503503-rs4453858-rs6432852-rs4879670-rs35716097 (CVC=9/10, P=0.011).
3. Correlation between functional SNP genotypes and DKD clinical phenotypes: (1) The polymorphisms of rs7975752, rs594074, rs4453858 and rs4879670 may affect lipid levels in DKD. (2) The polymorphisms of rs304029 and rs6432852 may affect the renal
function of patients with DKD. (3) The polymorphisms of rs7975752 and rs1260326 may affect blood pressure in DKD. (4) The variables entering the decision-tree model were diabetic retinopathy, triglyceride, fasting blood glucose, free fatty acid, glomerular filtration rate, glycosylated hemoglobin, urea nitrogen, fasting insulin level and the rs594074 genotype. In predicting DKD risk in T2DM patients, the model achieved an accuracy of 83.7%, a sensitivity of 86.74%, a specificity of 80.72%, and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.885.
4. Effects of functional SNPs significantly related to DKD risk on the expression of the corresponding genes and the regulatory mechanism: (1) At the transcription level, eQTL analysis of the SNPs significantly associated with DKD risk was conducted using the GTEx database. The GG genotype of rs6420094 was significantly correlated with up-regulated SLC34A1 mRNA expression, and the GG genotype of rs594074 with up-regulated LY86 mRNA expression, whereas the AA genotype of rs4453858 was significantly associated with down-regulated SUCLG2 mRNA expression (all P<0.05). (2) At the plasma protein level, within the DKD group, plasma SLC34A1 levels in carriers of the GG, AG, and AG+GG genotypes were significantly lower than in carriers of the AA genotype (all P<0.05). In contrast, within the normal control group, plasma SLC34A1 levels in carriers of the rs6420094 GG genotype were significantly higher than in carriers of the AA genotype (P=0.002). Thus the relationship between rs6420094 genotype and SLC34A1 protein expression differs among populations. Plasma levels of the target proteins did not differ significantly among genotypes of the remaining SNPs. (3) Comparison among groups showed that the plasma SLC34A1, SUCLG2 and NAPSA protein
expression levels in the DKD and T2DM groups were significantly lower than in the normal control group (all P<0.05). However, these plasma protein levels did not differ significantly between the DKD and T2DM groups. (4) PERFECTOS-APE analysis indicated that allelic variation at rs6420094 might change the ability of the genomic region around this SNP to bind multiple transcription factors. Specifically, the A-to-G change reduced the ability of the rs6420094 region to bind the transcription factors GATA2, GATA1 and GATA3 by 21.65- to 38.87-fold, and increased its ability to bind CTCF by 8.15-fold. Allelic variation at rs6420094 thus alters binding to multiple transcription factors to varying degrees. (5) In the GSE1009 dataset from the GEO database, GATA2 expression was significantly higher in DKD tissue samples than in normal tissue samples, whereas CTCF expression was lower. These results, combined with the effect of rs6420094 on the binding of GATA2 and CTCF to the DNA regulatory element, partially explain why the relationship between rs6420094 genotype and SLC34A1 protein expression differs among populations.
To sum up, this study started from the identification of functional SNPs and used public GWAS data and multiple bioinformatics databases to screen functional SNPs related to DKD risk. We then verified in the DKD case-control cohort that rs6420094, rs4453858, rs594074 and rs10952362 are related to DKD risk in the northeastern Chinese population. Bioinformatics database analysis and plasma protein measurements in clinical samples showed that rs6420094 is correlated with SLC34A1 gene expression. This site may affect SLC34A1 gene expression by altering the
binding of transcription factors to the DNA regulatory element. In addition, SNP genotypes and clinical information were integrated to establish a decision-tree model for predicting the risk of DKD in T2DM patients. The functional genetic variants significantly associated with DKD in the Chinese Han population identified in this study may serve as novel biomarkers, providing a new strategy for the diagnosis and individualized treatment of DKD.
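
The performance figures reported for the decision-tree model (accuracy 83.7%, sensitivity 86.74%, specificity 80.72%) can be related to one another with a short calculation. The Python sketch below is illustrative only. It assumes the model was evaluated on all 166 DKD and 166 T2DM participants, and the confusion-matrix counts (144 true positives, 134 true negatives) are inferred from the reported percentages, not taken from the study. Under these assumptions the three figures are mutually consistent to within rounding.

```python
# Group sizes stated in the abstract.
N_DKD = 166    # positive class: T2DM patients with DKD
N_T2DM = 166   # negative class: T2DM patients without DKD

# ASSUMPTION: counts inferred from the reported sensitivity and specificity;
# they are not given in the source.
TRUE_POSITIVES = 144
TRUE_NEGATIVES = 134


def classification_metrics(tp, tn, n_pos, n_neg):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / n_pos                 # share of DKD cases correctly flagged
    specificity = tn / n_neg                 # share of non-DKD cases correctly cleared
    accuracy = (tp + tn) / (n_pos + n_neg)   # share of all participants classified correctly
    return sensitivity, specificity, accuracy


sens, spec, acc = classification_metrics(
    TRUE_POSITIVES, TRUE_NEGATIVES, N_DKD, N_T2DM
)
print(f"sensitivity = {sens:.4f}")
print(f"specificity = {spec:.4f}")
print(f"accuracy    = {acc:.4f}")
```

Because the two groups are the same size, accuracy here is simply the mean of sensitivity and specificity; with unequal groups it would be a weighted mean.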