| Schizophrenia (SCZ) is a highly heritable, polygenic, complex mental disorder with imprecise diagnostic boundaries. Patients with SCZ often need lifelong treatment, but early diagnosis and treatment may control clinical symptoms before serious complications arise and help improve long-term prognosis. At present, using biomarkers to diagnose mental disorders and predict their treatment response remains a challenge, and new biomarkers that provide objective evidence for early and differential diagnosis are urgently needed. Sensitive and specific biomarkers not only support early identification and differential diagnosis but are also a prerequisite for accurate, personalized treatment; effective new biomarkers can improve the biological homogeneity of disease diagnosis and support the development of new drugs, although further high-quality studies are still needed. Objectives: This study aimed to obtain specific differentially expressed genes in the peripheral blood of patients with schizophrenia, identify disease-related key coexpression modules and the key genes within them, and explore the pathogenesis of SCZ from the perspective of multi-gene interaction; to construct a panorama of blood immune cell levels in SCZ and explore the relationship between blood immune cells and SCZ; and to preliminarily explore a blood-based diagnostic model for SCZ that combines key genes and immune cell subsets, with the key genes verified in a population sample, so as to provide a theoretical basis for finding new blood diagnostic biomarkers and potential drug treatment targets for SCZ. Methods: 1. Three differential expression analysis methods (limma, DESeq2 and edgeR) were each applied and their results intersected to screen reliable differentially expressed genes (DEGs) in the blood of SCZ patients, and GO/KEGG pathway enrichment and GSEA were performed to
explore the biological function of the DEGs; a PPI network was constructed to screen key differential genes. 2. WGCNA was used to construct a weighted gene coexpression network and identify coexpression modules, and internal and external data sets were used to verify module preservation; through module-trait correlation analysis, the key modules related to SCZ and the hub genes within them were found, and GO/KEGG pathway analysis was performed to identify their biological function and significance; WGCNA was also used to construct separate gene coexpression networks from the blood expression profiles of SCZ and control (CTL) samples, after which SCZ-specific coexpression modules were identified, and their biological function discussed, based on gene co-existence analysis and topology change. 3. Using systematic review and meta-analysis methods, we searched MEDLINE (PubMed), EMBASE, Web of Science, the Cochrane Library, CNKI and the Wanfang database for published Chinese and English literature on the correlation between blood immune cell levels and SCZ (up to November 2021). According to predetermined inclusion and exclusion criteria, the literature was screened, the quality of the included studies was evaluated, and the data were extracted and pooled. The CIBERSORT and xCell algorithms were then used to quantitatively estimate the levels of immune cell subsets from the blood gene expression profile data, and their correlation was analyzed. 4. Based on the GEO database, differential analysis of blood gene expression profiles for three major mental disorders (SCZ, bipolar disorder [BPD] and major depressive disorder [MDD]) was used to screen blood-specific DEGs that were significantly differentially expressed in SCZ but not in BPD or MDD; these were intersected with the DEGs and key module genes obtained from our sequencing data to obtain SCZ blood-specific key genes. Then, the JASPAR database was used to predict transcription factors (TFs) interacting with the key genes, and the TarBase v8.0 and
miRTarBase v8.0 databases to predict the miRNAs interacting with them, and a TF-gene-miRNA co-regulatory network was constructed through the RegNetwork database. The CTD and DSigDB databases were used to predict small-molecule chemicals and drugs interacting with the key genes, and the corresponding regulatory networks were visualized with Cytoscape. Finally, a case-control study design was used to verify the screened key genes by RT-qPCR. 5. With our RNA-seq data as the training set, the key genes and lymphocyte proportion were combined as features, and three machine learning algorithms (RF: random forest; SVM: support vector machine; DT: decision tree) were each used to construct a disease diagnosis model through 10 repeats of 3-fold cross-validation (CV), taking the average of the 10 repeats as the final performance estimate. The best-performing model was used as the final diagnostic model, and its diagnostic performance was verified on external data sets to evaluate the diagnostic value of these blood biomarkers. Results: 1. The results of the three differential expression analysis methods were highly correlated. A total of 225 reliable DEGs were identified, including 105 up-regulated and 120 down-regulated genes; RPS27A ranked first among the DEGs in the centrality analysis. GSEA showed that the up-regulated DEGs were mainly enriched in protein processing in the endoplasmic reticulum, shigellosis and Epstein-Barr virus infection, whereas the down-regulated DEGs were mainly related to Alzheimer’s disease, COVID-19, Huntington’s disease, amyotrophic lateral sclerosis, Parkinson’s disease, the ribosome and a variety of neurodegenerative diseases. 2. A total of 28 gene coexpression modules were identified by WGCNA, and verification on internal and external data sets showed that the modules were reproducible; four coexpression modules significantly correlated with SCZ were obtained through module-trait association analysis. The tan module was negatively correlated with SCZ, and the turquoise, lightcyan and orange modules were positively correlated with
SCZ. The genes of the four coexpression modules were combined into WGCNA_MEs (2835 genes). Correlation analysis between gene significance (GS) for the disease trait and module membership (MM) within each module showed that the correlation coefficient was highest in the tan module, followed by the orange and turquoise modules, while the lightcyan module showed no correlation; using the screening criteria |MM| > 0.8 and |GS| > 0.2, a total of 468 key genes were identified (33 in the tan module, 16 in the orange module and 419 in the turquoise module). These key genes were mainly enriched in translation initiation, nuclear transcription, mRNA catabolism, viral gene expression, the ribosome, COVID-19, various cancers, platinum resistance, the PI3K-Akt signaling pathway, EGFR tyrosine kinase inhibitor resistance and the JAK-STAT signaling pathway. The lightcyan, grey60 and lightyellow modules were identified as SCZ-specific coexpression modules based on gene co-existence analysis, and the darkred and white modules were also identified as specific coexpression modules based on topology change. 3. The meta-analysis showed that, compared with the control group, SCZ patients had significantly increased blood CD19 lymphocyte, monocyte and neutrophil counts, CD4 and CD56 lymphocyte percentages, CD4/CD8 ratio, neutrophil/lymphocyte ratio (NLR) and monocyte/lymphocyte ratio (MLR) (P < 0.05). Blood CD3 and CD4 lymphocyte counts, neutrophil counts, CD4/CD8 lymphocyte ratio and MLR were significantly higher in FEP patients than in healthy controls (P < 0.05). In the sensitivity analysis, neither heterogeneity nor the pooled results changed significantly, and no publication bias was detected among the studies. 4. CIBERSORT results showed that, compared with normal samples, the blood of SCZ patients contained more lymphocytes and monocytes and relatively fewer neutrophils, resting mast cells and activated NK cells; xCell results showed that a variety of immune cells in the blood of patients with
SCZ were significantly upregulated, with different cell subtypes showing different trends and T lymphocytes predominating, indicating that immune cells are involved in the occurrence and development of SCZ. 5. A total of 823 SCZ DEGs were obtained from GSE38485, 1355 BPD DEGs from GSE124326 and 192 MDD DEGs from GSE32280, and 772 SCZ blood-specific DEGs were finally screened; intersecting these with the reliable DEGs and WGCNA_MEs obtained from our RNA-seq data yielded 12 SCZ blood-specific key genes (KRT1, SNRPG, TMEM14B, TOMM7, AQP10, CLEC12A, LSM3, RPL17, RPL26, RPL9, RPS24 and TRAT1). 6. The JASPAR database was used to predict TFs interacting with the key genes, giving a network of 56 nodes and 89 edges; the main TFs were FOXC1, GATA2, GATA3, FOXL1 and E2F1. The TarBase and miRTarBase databases were used to predict miRNAs interacting with the key genes, giving a network of 31 nodes and 53 edges; the main miRNAs were miR-1-3p, miR-27a-3p, miR-126-3p, miR-155-5p and miR-20a-5p. The top 5 chemicals predicted by the CTD database were valproic acid, arsenic trioxide, hydralazine, chloropicrin and enzyme inhibitors; the top 5 drugs predicted by the DSigDB database were hydralazine, urea, glycerol, trimellitic anhydride and efendil. 7. RT-qPCR validation showed significant differences in 8 genes (SNRPG, AQP10, CLEC12A, TOMM7, TMEM14B, RPL9, RPL26 and RPS24); except for TMEM14B, the direction of change of the other 7 genes was consistent with the RNA-seq results. 8. With our sequencing data as the training set, the diagnostic model was constructed by combining the patients’ age and gender, the seven key genes and three immune cell proportions (lymphocytes, monocytes and neutrophils). The RF classifier had the best diagnostic performance (AUROC: RF, 0.855; SVM, 0.795; DT, 0.845) and could accurately classify the two groups; the Gini-importance ranking of variables from the RF model shows that TOMM7 is the most important feature, and
monocytes have the highest importance among the immune cell proportions. External validation of the best-performing RF classifier gave an AUROC of 0.820 in the external SCZ cohort and 0.586 in the BPD cohort. Conclusions: 1. This study identified 225 reliable DEGs in the blood of patients with SCZ, including 105 up-regulated and 120 down-regulated genes; the top 20 DEGs are mainly ribosomal protein (RPS) genes, interferon-stimulated genes, and genes regulating inflammation and the immune response, and they are mainly enriched in functions and pathways related to the ribosome, the immune system, viral infection and neurodegenerative diseases. 2. The gene coexpression network constructed by WGCNA is robust and reliable. Four coexpression modules significantly related to SCZ (tan, turquoise, lightcyan and orange) and five SCZ blood-specific coexpression modules (lightcyan, grey60, lightyellow, darkred and white) were identified. 3. Using SCZ blood RNA-seq data for the first time, a panorama of SCZ blood immune cell levels was preliminarily constructed with the CIBERSORT and xCell algorithms. Blood lymphocytes, monocytes and neutrophils may be related to the occurrence and pathophysiology of the disease and may serve as new blood diagnostic biomarkers of SCZ. 4. We identified several potential blood biomarkers for SCZ, including 7 key genes (SNRPG, AQP10, CLEC12A, TOMM7, RPL9, RPL26 and RPS24) and 3 major peripheral immune cell types (lymphocytes, monocytes and neutrophils), which improve understanding of the heterogeneity and complexity of peripheral immune cell infiltration and provide promising therapeutic targets for SCZ patients. The diagnostic model built with the RF algorithm on the training RNA-seq data achieved the highest AUROC among the 3 classifiers and performed well in the external SCZ cohort (AUROC 0.820), but gave a lower AUROC of 0.586 in the external BPD cohort. Further experimental studies are needed to test the biomarkers identified
in the study as potential blood biomarkers for early diagnosis and treatment of SCZ. |
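
The gene-screening logic described in the Methods can be illustrated with plain set operations. The following is a minimal sketch, not the study's actual pipeline: the gene labels (`G1`, `G2`, ...) and all MM/GS values are made-up placeholders, the function names are hypothetical, and the final step assumes a three-way intersection of SCZ-specific DEGs, reliable DEGs and module genes, which the abstract does not state explicitly. Only the thresholds |MM| > 0.8 and |GS| > 0.2 are taken from the text.

```python
from typing import Dict, Set, Tuple


def reliable_degs(limma: Set[str], deseq2: Set[str], edger: Set[str]) -> Set[str]:
    """Keep only genes called differentially expressed by all three tools."""
    return limma & deseq2 & edger


def hub_genes(
    stats: Dict[str, Tuple[float, float]],
    mm_cut: float = 0.8,
    gs_cut: float = 0.2,
) -> Set[str]:
    """Select genes with |MM| > mm_cut and |GS| > gs_cut.

    `stats` maps gene -> (module membership, gene significance).
    """
    return {
        gene
        for gene, (mm, gs) in stats.items()
        if abs(mm) > mm_cut and abs(gs) > gs_cut
    }


def disease_specific(scz: Set[str], bpd: Set[str], mdd: Set[str]) -> Set[str]:
    """DEGs found in SCZ but in neither BPD nor MDD."""
    return scz - bpd - mdd


def key_genes(specific: Set[str], reliable: Set[str], module_genes: Set[str]) -> Set[str]:
    """Assumed three-way intersection (the abstract leaves this ambiguous)."""
    return specific & reliable & module_genes


# Toy inputs: placeholder gene labels, not real results.
limma_hits = {"G1", "G2", "G3", "G4"}
deseq2_hits = {"G1", "G2", "G3", "G5"}
edger_hits = {"G1", "G2", "G3", "G6"}
reliable = reliable_degs(limma_hits, deseq2_hits, edger_hits)

# gene -> (MM, GS); values are invented for illustration only.
module_stats = {
    "G1": (0.91, 0.35),
    "G2": (-0.85, -0.25),  # negative values pass because of abs()
    "G3": (0.95, 0.10),    # fails the GS threshold
    "G7": (0.60, 0.50),    # fails the MM threshold
}
hubs = hub_genes(module_stats)

specific = disease_specific(
    scz={"G1", "G2", "G8"}, bpd={"G2", "G9"}, mdd={"G10"}
)
final_keys = key_genes(specific, reliable, set(module_stats))

print(sorted(reliable))    # ['G1', 'G2', 'G3']
print(sorted(hubs))        # ['G1', 'G2']
print(sorted(final_keys))  # ['G1']
```

Using sets keeps each screening step a single, auditable operation; in practice the inputs would be the gene lists exported from limma, DESeq2, edgeR and WGCNA rather than hand-written literals.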