
Research On SNP And APA Associated Gene Expression Regulation

Posted on: 2021-07-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Wang
GTID: 1480306353977539
Subject: Control Science and Engineering
Abstract/Summary:
With the successful completion of the Human Genome Project, sequencing technology has made great breakthroughs. High-throughput technologies can generate huge amounts of biological data in a short period of time, including genomic, transcriptomic, and proteomic data, and can even provide multi-omics measurements at the single-cell level. This information is increasingly being used to solve various biological problems. For such large-scale omics data, using computational techniques and mathematical models to integrate and analyze different types of omics data, and to analyze biological samples systematically, has gradually become an inevitable trend in bioinformatics.

Gene expression is a complex biological process. Almost all dynamic functions in a living organism depend on the process of decoding an mRNA message into a protein. Many factors affect gene transcription and post-transcriptional regulation; among them, single nucleotide polymorphisms (SNPs) and alternative polyadenylation (APA) are both important. A precise analysis of the mechanisms that regulate gene and protein expression helps to explain how genotype leads to phenotypic differences between human cells and even between individuals, to explain the causes of disease systematically, and to provide methods for diagnosis and treatment.

Based on high-throughput bulk transcriptomic sequencing data and single-cell transcriptomic data (scRNA-seq), combined with genetic variation information and mass spectrometry-based quantitative proteomic data, we used computational methods and mathematical models to comprehensively identify APA events across the whole genome, to explore the genetic regulatory sites associated with gene expression and protein expression, and to analyze step by step the regulatory patterns through which genetic variation and alternative polyadenylation participate in gene and protein expression.

Firstly, based on the latest single-cell transcriptome sequencing data, which avoids interference from factors such as APA heterogeneity and SNP differences among individuals, we studied the effect of APA on gene expression regulation. We used computational approaches and statistical methods to identify APA events at the single-cell level. The results indicate that most brain cells tend to express distal APA isoforms, and dynamic changes in APA between different cell types were observed. The results showed that APA has a significant effect on regulating gene expression in cells.

Secondly, integrating genomic and transcriptomic data from high-throughput sequencing, we studied the gene expression regulatory mechanism associated with SNP and APA at the RNA level. We detected APA events across the whole genome, identified potential QTLs, and proposed and constructed maximum likelihood models based on biological hypotheses to predict and analyze the six regulatory relationships among SNP, APA, and gene expression. We found that gene expression is regulated by both SNP and APA, in cis and independently. The results showed that jointly analyzing APA and SNP improves the accuracy of identifying key regulatory factors of gene expression and their regulatory pathways, which helps to interpret gene expression regulation mechanisms comprehensively and accurately.

Thirdly, we integrated genomic, transcriptomic, and proteomic data to study how SNPs regulate gene and protein expression. Through data mining, novel protein expression quantitative trait loci were identified. Maximum likelihood models were constructed to predict and analyze the regulatory relationships among SNP, mRNA, and protein. We found that there are multiple SNP regulatory patterns of protein expression in the human genome, and we verified the rationality of these results using mouse data. This result provides possible explanations for the differences between transcript levels and protein expression levels and supports a more accurate understanding of protein expression regulatory mechanisms.

Finally, we integrated multiple omics data to study the extensive involvement of SNP and APA in the regulation of gene and protein expression. Through comprehensive data mining, many novel quantitative trait loci were found. In this work, 48 regulation models associated with SNP, APA, mRNA, and protein expression were proposed based on biological background, and a structural equation model (SEM) based on pathway analysis was constructed to predict and analyze regulation patterns. We found that, in addition to SNP, APA is also extensively involved in protein expression regulation, and that APA tends to regulate protein expression levels indirectly through transcriptional mechanisms. The results accurately analyze gene and protein expression variation from the perspective of systems biology, and further enrich and improve the study of gene transcriptional and post-transcriptional regulation mechanisms.
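
The abstract does not spell out its maximum likelihood models, so the following is only a minimal illustrative sketch of the general idea: several candidate regulatory relationships among SNP, APA, and gene expression are each written as a product of conditional Gaussian linear models, and the candidate with the highest maximized log-likelihood is preferred. The three candidates shown ("causal", "reactive", "independent"), the function names `gaussian_loglik`, `compare_models`, and `simulate_chain`, and the simulated effect sizes are all assumptions made for illustration; they are not the six relationships or the data used in the dissertation.

```python
import numpy as np


def gaussian_loglik(y, x):
    """Maximized log-likelihood of y ~ Normal(a + b*x, sigma^2), fitted by least squares."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = resid @ resid / n  # maximum likelihood estimate of the residual variance
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)


def compare_models(snp, apa, expr):
    """Log-likelihood of three hypothetical candidate models (P(SNP) is shared and omitted)."""
    return {
        # SNP -> APA -> expression
        "causal": gaussian_loglik(apa, snp) + gaussian_loglik(expr, apa),
        # SNP -> expression -> APA
        "reactive": gaussian_loglik(expr, snp) + gaussian_loglik(apa, expr),
        # SNP -> APA and SNP -> expression, with no APA/expression link
        "independent": gaussian_loglik(apa, snp) + gaussian_loglik(expr, snp),
    }


def simulate_chain(n=2000, seed=0):
    """Simulate toy data in which the SNP acts on expression only through APA."""
    rng = np.random.default_rng(seed)
    snp = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded as 0/1/2
    apa = 0.8 * snp + rng.normal(size=n)              # assumed effect size
    expr = 0.9 * apa + rng.normal(size=n)             # assumed effect size
    return snp, apa, expr


snp, apa, expr = simulate_chain()
scores = compare_models(snp, apa, expr)
for name, loglik in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} log-likelihood = {loglik:.1f}")
```

All three candidates have the same number of free parameters, so their log-likelihoods can be compared directly here; candidates of different complexity would need a penalized criterion such as AIC or BIC.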
Keywords/Search Tags:High-throughput technologies, Single-cell, Multi-omics, Gene expression regulation, Alternative polyadenylation, Protein expression regulation, Maximum likelihood model, Structural equation model