| Background: The ocean contains abundant biological resources. Its enormous biodiversity and unique ecological environment make marine organisms an important source for new drug discovery. Sea snakes are highly venomous elapids that live in shallow coastal waters. Hydrophis cyanocinctus is one of the dominant snake species in China. It has high medicinal value and is recorded in many classical pharmacopoeias, such as the Compendium of Materia Medica. It is traditionally used to dispel wind, dry dampness, dredge collaterals, activate blood circulation, and nourish and strengthen the body. Modern studies have shown that the venom and tissues of H. cyanocinctus contain a variety of proteins and active peptides with anti-inflammatory, anti-rheumatic, antibacterial, anti-tumor and analgesic effects. However, the specific active molecules, their targets and their mechanisms of action remain unclear. Previously, our team established a platform for screening anti-inflammatory molecules targeting human tumor necrosis factor-α (TNF-α) and its receptors (TNFRs), and obtained a 22-mer peptide, Hydrostatin-SN1, that targets TNFR1. SN1 shows some anti-inflammatory activity in vivo and in vitro. However, because it binds both TNFR1 and TNFR2, which may cause side effects, its target selectivity is weak and its druggability is low. To improve the selectivity of the peptide for TNFR1, we obtained the decapeptide Hydrostatin-SN10 through sequence truncation and screening experiments. SN10 specifically binds TNFR1 and selectively antagonizes the TNF-TNFR1 interaction in vitro. It has a significant anti-inflammatory effect on DSS- and oxazolone (OXZ)-induced acute colitis in mice, and shows good pharmacokinetic properties without obvious toxicity. Compared with anti-TNF monoclonal antibodies and other macromolecular preparations, SN10 can not only block the pro-inflammatory and harmful signals transmitted by TNFR1, but also retain the TNFR2-related signaling pathways that promote cell proliferation
and inhibit inflammation, and thus reduce side effects. Therefore, SN10 has good target selectivity, anti-inflammatory activity and safety in vitro and in vivo. It is expected to become an innovative marine peptide drug for the treatment of TNF-α-related diseases such as inflammatory bowel disease (IBD). However, whether SN10 is also selective for TNFR1 in vivo has not been verified. Moreover, the selectivity mechanism, antagonistic mode and binding sites of SN10 on TNFR1 are unclear, which hinders an in-depth understanding of the anti-inflammatory molecular mechanism of SN10 and its further structural optimization and modification. In addition to Hydrostatin-SN10, we screened a variety of anti-inflammatory peptides from the venom gland phage display library of H. cyanocinctus against the targets TNF-α/TNFRs. No homologous or functionally similar sequences were found in public databases, indicating that H. cyanocinctus possesses unique anti-inflammatory peptides with the potential to be developed into anti-inflammatory drug candidates. We expect to find more bioactive peptides in H. cyanocinctus, but we face two bottlenecks. First, biopanning based on phage display technology suffers from a high false-positive rate, a long screening time and limited library coverage at the transcriptome level. Second, sea snake samples are difficult to collect, and it is almost impossible to enrich the active components of snake venom. Moreover, collecting marine biological samples may damage the ecological balance of the ocean, and H. cyanocinctus has been listed as a national second-class protected animal. In recent years, the rapid development of biotechnology has greatly reduced the cost and turnaround time of high-throughput sequencing, and advanced sequencing technologies such as next-generation sequencing (NGS), third-generation sequencing (TGS) and high-throughput chromosome conformation capture (Hi-C) have emerged. Multi-omic data for more and more marine organisms have been generated, which has given great impetus to the
research and development of marine drugs. The active components of snake venom are mostly low-molecular-weight proteins/peptides, and the whole genome contains the coding sequences of all proteins/peptides, including those that are low in abundance in the venom but may have unique pharmacological activities. This makes it possible to systematically mine potential bioactive peptides from multi-omic big data on the basis of sequence information. However, no genome sequence of H. cyanocinctus has been reported, and most genome studies of other sea snakes are based on next-generation sequencing, which yields poor assembly quality and may greatly compromise the completeness of active-molecule discovery. Therefore, high-quality multi-omic data for H. cyanocinctus are urgently needed to mine its bioactive peptides comprehensively, efficiently and accurately. Objective: In this project, we first aimed to confirm the selectivity of Hydrostatin-SN10 for TNFR1 in vivo using gene knockout animal models, construct a structural model of the SN10-TNFR1 complex, analyze the binding sites of SN10 on TNFR1, and verify the key amino acid residues of the peptide by in vivo and in vitro activity experiments, so as to comprehensively reveal the antagonistic mechanism of SN10 against TNFR1 and guide its further structural optimization and modification. In addition, by combining third-generation sequencing, next-generation sequencing and Hi-C technology, we aimed to sequence and assemble the high-quality genome, venom gland transcriptome and venom proteome of H. cyanocinctus, annotate all gene and protein sequences, and thoroughly identify the toxin-related genes, so as to establish a high-quality multi-omics database and venomics database of H. cyanocinctus. These databases can support high-throughput, systematic and accurate mining of new bioactive peptides from H. cyanocinctus. Methods: 1. Exploration and verification of the mechanism of
anti-inflammatory peptide Hydrostatin-SN10 against TNFR1: (1) Acute colitis models were established in wild-type, TNFR1-knockout and TNFR2-knockout mice by dextran sulfate sodium (DSS) induction. The effects of SN10 in the different mice were observed and compared in terms of disease activity index, body weight, colon length, spleen weight index, expression of inflammatory factors, pathological changes of colon tissue and changes in the NF-κB and MAPK signaling pathways downstream of TNF-TNFRs, in order to determine whether SN10 is selective for TNFR1 in vivo. (2) A three-dimensional structural model of SN10 was constructed with Rosetta, and the main conformations of SN10 were obtained by molecular dynamics simulation using Upside and GROMACS. (3) Molecular docking of SN10 and TNFR1 (PDB: 1NCF) was carried out with ZDOCK, and molecular dynamics simulation was carried out with GROMACS to obtain a structural model of the TNFR1-SN10 complex and calculate its binding free energy. Discovery Studio was used to analyze the binding interface and interacting amino acids between SN10 and TNFR1 in the structural model. Alanine substitution was performed at the possible binding sites of SN10, and the mutant peptides were synthesized. (4) The affinities of SN10 and its mutant peptides for TNFR1 were measured and compared by microscale thermophoresis (MST), and the in vitro activity of each mutant peptide was observed. (5) A mouse model of acute shock induced by lipopolysaccharide (LPS) was established, and SN10-treated and mutant peptide-treated groups (800 μg/kg) were set up. The survival rate of the mice in each group was observed, and the in vivo activity of each mutant peptide was compared with that of SN10. 2. Multi-omic sequencing and assembly of H. cyanocinctus: (1) Next-generation genome sequencing and genome survey: Genomic DNA was extracted from the muscle tissue of the sea snake, and a 350 bp insert library was constructed. Sequencing was performed on the Illumina HiSeq X Ten platform. After quality control, the
raw data were analyzed by k-mer analysis to estimate the size, heterozygosity and repeat content of the genome. (2) Third-generation genome sequencing: A 20 kb large-fragment SMRTbell library was constructed from the muscle tissue of H. cyanocinctus, and 10-11 SMRT cells were sequenced on the PacBio Sequel platform. (3) Hi-C sequencing: A Hi-C sequencing library (300-500 bp) was constructed from muscle tissue. After passing quality control, the library was sequenced on an Illumina HiSeq X Ten sequencer. The raw data were filtered with fastp to obtain high-quality clean reads. (4) Genome assembly and quality assessment: Based on the results of the genome survey, we used FALCON, FALCON-Unzip and other software to pre-assemble the PacBio third-generation data and obtain a contig-level assembly. The third-generation reads were then used to correct the genome. After that, Hi-C data were used to assist genome assembly and construct chromosome-level scaffolds. Finally, the PacBio third-generation data were aligned back to the Hi-C-assisted assembly to fill the gap regions in the genome. After gap filling, one round of error correction with the third-generation data and two rounds with the next-generation data produced the final assembly. BUSCO was used to evaluate and compare the genome completeness of H. cyanocinctus and other representative snakes. (5) Iso-Seq: Total RNA was extracted from the muscle and venom gland tissues of the sea snake and pooled, and a 10 kb SMRTbell library was constructed. One SMRT cell was sequenced on the PacBio Sequel platform. The IsoSeq software was used for quality control of the raw data to obtain high-quality isoform sequences. Error correction was then performed using the second-generation data, and unigene sequences were obtained by CD-HIT clustering. (6) Second-generation sequencing of the transcriptome: Total RNA was extracted from the venom gland tissues of three sea snakes, and the second-generation
sequencing library was constructed by the oligo(dT) enrichment method and sequenced on an Illumina HiSeq X Ten sequencer. After quality control of the raw data, the transcripts were assembled with Trinity. (7) Venom proteome: Total protein was extracted from the venom sample. One part was used for protein concentration determination and SDS-PAGE, and another part was digested with trypsin and TMT-labeled. Equal amounts of the labeled samples were then mixed and separated by RPLC, and finally subjected to LC-MS/MS detection and peptide data analysis. 3. Bioinformatic analysis of the multi-omics of H. cyanocinctus: (1) Genome annotation: First, the EDTA pipeline was used to predict repetitive sequences. The MAKER pipeline was then used to annotate coding genes based on homology evidence and de novo prediction with software such as BRAKER, Augustus and GeneMark-ES. All protein sequences obtained from structural annotation were aligned to functional databases such as NR, Swiss-Prot, KEGG and GO, and the hits with the highest sequence similarity were used to assign functional annotations. (2) Gene family clustering: Using the OrthoFinder pipeline, the protein sequences of related species were clustered based on sequence similarity. (3) Phylogenetic analysis: Multiple sequence alignment was carried out for the genes in each single-copy orthologous gene family. The alignments were concatenated, and a phylogenetic tree was constructed by the maximum likelihood method using RAxML. Divergence times were then estimated. (4) Gene family expansion and contraction analysis: CAFE was used to model the expansion and contraction events of gene families in each lineage of the phylogenetic tree. For the expanded and contracted genes, GO and KEGG pathway functional enrichment analysis was performed, and the significance of enrichment for each term was calculated by the hypergeometric distribution
test. (5) Positive selection analysis: Based on the gene family clustering results, for each shared single-copy orthologous gene family, the codeml program was used to identify positively selected genes with the branch-site model and a chi-square test, and functional enrichment analysis was performed. (6) Genomic synteny analysis: The CDS sequences of the longest transcripts of all genes of the two snakes were aligned to obtain homologous gene pairs, and synteny blocks were then identified with MCScanX. The Python package JCVI was used to visualize the synteny results. (7) Identification of toxin-related genes: All protein sequences annotated from the genome of H. cyanocinctus were aligned to a library of known toxin-related proteins with BLASTp. After redundancies were removed, proteins belonging to the same gene were merged. The Swiss-Prot annotation information was then used to further analyze these genes and identify toxin genes and venom protein genes. (8) Expression of toxin-related genes: The clean RNA-Seq reads were aligned to the annotated genome of H. cyanocinctus to obtain the read count of each gene in each sample, and gene expression was then calculated as FPKM (fragments per kilobase of transcript per million mapped reads). (9) Expression of toxin-related proteins: The peptides identified from the venom proteome were searched against the annotated proteins of H. cyanocinctus, and confidently identified proteins were retained. Then, according to the annotation of toxin-related genes, the expression of toxin-related proteins in the venom was further analyzed. Results: 1. Exploration and verification of the mechanism of anti-inflammatory peptide Hydrostatin-SN10 against TNFR1: (1) In TNFR2-/- mice, the SN10-treated group was significantly (P<0.05) better than the model group in disease activity index, body weight, colon length, spleen weight index, expression of inflammatory factors, colonic histopathological changes and phosphorylation-dependent activation of inflammation-related signaling pathways downstream of
TNF-TNFRs. However, SN10 had no significant effect on these indexes in TNFR1-/- mice. (2) Four structural models of TNFR1-SN10 complexes were obtained by molecular docking and molecular dynamics simulation. The possible binding sites of SN10 for TNFR1 were D1, E2, E8, L9 and H10. In Model 1, SN10 binds to the CRD2 and CRD3 domains of TNFR1, and E8 of SN10 forms an electrostatic interaction with R77 and a hydrogen bond with N110 of TNFR1. (3) MST measurement of the affinities of SN10 and all mutant peptides for TNFR1 showed that the in vitro activity of M1 (E8A) was significantly altered: its Kd value (>171 μM) was more than 62-fold higher than that of SN10 (2.75 μM), indicating much weaker binding. (4) In the LPS-induced acute shock model, the survival rate of mice treated with M1 (12.5%) was 75 percentage points lower than that of mice treated with SN10 (87.5%) (P<0.01). 2. Multi-omic sequencing and assembly of H. cyanocinctus: (1) Through third-generation sequencing, second-generation sequencing and Hi-C sequencing, we obtained a 1.98 Gb genome and assembled 18 chromosomes (7 macro-chromosomes and 11 micro-chromosomes). Assembly quality evaluation showed a contig N50 of 18.99 Mb, a scaffold N50 of 264.25 Mb, a gap ratio of 0.01% and a BUSCO completeness of 90.1%. Among published snake genomes, our H. cyanocinctus genome has the highest assembly quality, and compared with other sea snake genomes based on second-generation sequencing, its contiguity is more than 100-fold higher. (2) Full-length transcriptome sequencing yielded 18,117 full-length transcripts with a maximum length of 12.1 kb, and second-generation transcriptome sequencing and assembly yielded 54,344 transcripts. (3) TMT-labeled quantitative proteomic analysis identified 7,709 peptide sequences in the venom proteome. 3. Bioinformatic analysis of the multi-omics of H. cyanocinctus: (1) Genome annotation yielded 23,898 genes encoding 43,062 proteins (transcripts). (2) Repetitive sequence analysis
showed that LTR retrotransposons expanded significantly during the evolution of the H. cyanocinctus genome. (3) Phylogenetic analysis showed a close relationship between H. cyanocinctus and terrestrial elapids. True sea snakes evolved from terrestrial elapids about 19.9 million years ago, while H. cyanocinctus emerged about 9.5 million years ago. (4) Evolutionary analysis of genes and gene families showed that 907 gene families expanded and 1,565 gene families contracted in the genome of H. cyanocinctus; these were mainly related to the vomeronasal and olfactory receptor systems, ion transmembrane transport, hearing and visual perception, energy metabolism and the innate immune response. In addition, 80 genes were under positive selection, mainly related to DNA mismatch repair, mRNA alternative splicing, the respiratory chain, oxygen sensing and other functions. (5) Through genomic synteny analysis, we identified chromosome 5 of H. cyanocinctus as the sex chromosome (Z chromosome). (6) A total of 241 toxin-related genes from 35 toxin-related families, including 60 toxin genes, were identified in the genome of H. cyanocinctus. Representative toxin-related families include 3FTx, PLA2, CRISP, vKUN, vCTL and SVMP. The 3FTx family has the most toxin gene copies (20), including 5 LNTX genes and 9 SNTX genes. Some genes encoding potentially active molecules were also found, including an antibacterial peptide (CATH), nerve growth factor-β (NGF-β) and a phospholipase A2 inhibitor (PLI-γ). (7) Quantitative analysis of the venom transcriptome and proteome showed that 3FTx and PLA2 were the two most highly expressed toxins in the venom. Among 123 toxin-related genes, 3FTx accounted for 45% of toxin-related transcripts, with LNTX (30.8%) higher than SNTX (14.1%); PLA2 made up 39% of the transcripts. A total of 69 toxin-related proteins were identified in the venom proteome, including 24 toxins. Three LNTX toxins and one SNTX toxin were detected in the 3FTx family. Conclusion: In this study, we first confirmed that the anti-inflammatory peptide
Hydrostatin-SN10 from H. cyanocinctus is selective for TNFR1 in vivo. Further study of the antagonistic mechanism of SN10 against TNFR1 showed that SN10 may occupy the key binding sites of the TNF-α-TNFR1 interaction through E8 (the glutamate at position 8), directly and competitively antagonizing the binding of TNFR1 to TNF-α and thus inhibiting the formation of the TNFR1-TNF complex and the activation of downstream signals. These findings are important for optimizing the structure of SN10 and for screening lead peptides with similar target specificity from marine organisms. In addition, we obtained the high-quality genome, the full-length venom gland transcriptome and the venom proteome of the medicinal marine organism H. cyanocinctus. We established a well-annotated multi-omic database and a venomics database of H. cyanocinctus, providing big data support for high-throughput and accurate mining of new bioactive peptides from H. cyanocinctus. This work is also of great significance for the protection and utilization of endangered marine species such as sea snakes and for the development of innovative marine drugs. |
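
Two quantities reported in the abstract, the contig/scaffold N50 and FPKM expression values, follow simple formulas. The sketch below is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names `n50` and `fpkm` and every number in the usage lines are hypothetical, and FPKM is assumed to follow its standard definition (fragments per kilobase of transcript per million mapped fragments).

```python
def n50(lengths):
    """Return the N50 of a set of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Return FPKM for one gene in one sample (standard definition, assumed).

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
    print("toy N50:", n50(toy_contigs))
    print("toy FPKM:", fpkm(500, 2000, 10_000_000))
```

Because N50 depends only on the sorted length distribution, a few very long contigs can dominate it, which is why it is usually reported together with a completeness measure such as BUSCO.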