
Identification Of Small Open Reading Frames With Coding Potential In Human Genome Through Multi-omics Approaches Combined With Bioinformatics Analysis

Posted on: 2022-10-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J C Ding
GTID: 1520306335469204
Subject: Chemical Biology
Abstract/Summary:
Large-scale transcriptome analysis has shown that the transcribed portion of the genome is far larger than previously thought, giving rise to a large number of diverse RNA molecules. Identification of the coding portion relies mainly on algorithms, which can lead to a substantial underestimation of the coding potential of eukaryotic genomes. There is increasing evidence that some RNA sequences have been mis-annotated as noncoding when they in fact encode functionally important microproteins, some of which are promising drug targets for inhibiting tumor growth or biomarkers for prognosis. In recent years, the accuracy and sensitivity of omics data have improved greatly, making it possible to discover and verify new proteins through experimental approaches. Here we aim to systematically identify small open reading frames (sORFs) with coding potential in colorectal cancer and breast cancer, and to identify microproteins specifically expressed in tumor tissues.

Methods: (1) Multi-omics data from the colorectal cancer cell line HCT116 and the breast cancer cell line MCF7 were collected, and the pros and cons of each method were compared. (2) Proteome data from tumor tissue and adjacent tissue samples of 4 colorectal cancer patients and 4 breast cancer patients were collected.

Results: Through polysome profiling analysis, 2013 and 2197 non-coding genes with coding potential were identified in HCT116 and MCF7, respectively. Through Ribo-seq analysis, 1175 and 307 non-coding genes with coding potential, and 7459 and 3919 coding genes that may contain new coding frames, were identified in HCT116 and MCF7, respectively. Through proteomics data analysis, 406, 321, 547, and 719 genes that may contain new coding frames were identified in HCT116, MCF7, colorectal cancer tissue samples, and breast cancer tissue samples, respectively. Among them, 28 and 22 new ORFs were supported by multi-omics evidence in the HCT116 and MCF7 cell lines, respectively.

Summary: In this work we combined translatomics and proteomics data and found new sORFs with coding potential. These sORFs can serve as targets for further research and may provide a basis for the future development of tumor treatment drugs and prognostic biomarkers.
Keywords/Search Tags: Small Open Reading Frame, Microprotein, ncRNAs, Proteomics