Research On The Regulatory Mechanisms Of Pre-mRNA Alternative Splicing

Posted on: 2010-06-30  Degree: Doctor  Type: Dissertation
Country: China  Candidate: X Wang  Full Text: PDF
GTID: 1100330332960510  Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Pre-mRNA alternative splicing is one of the most pivotal steps in the post-transcriptional gene regulation of eukaryotes. The same pre-mRNA can be spliced in different patterns, hence the term "alternative splicing". Because alternative splicing is widespread, complex, and implicated in disease, its underlying regulatory mechanisms are receiving increasing attention in the life sciences. However, previous studies were mostly limited to a handful of gene models, so the general mechanisms of alternative splicing remain unclear. The maturation of microarray technologies and the emergence of next-generation high-throughput sequencing make it possible to study the regulatory mechanisms of pre-mRNA alternative splicing on a transcriptome-wide scale. This thesis summarizes several bottlenecks in the field and examines each of them using high-throughput biochemical data together with mathematical models or simulation algorithms. The main topics are as follows:
Tissue-specific alternative splicing generates tissue-specific mRNAs and proteins, but how to predict the factors that regulate it, and their functions, is still an unsolved problem. Based on exon array data, this thesis proposes a mathematical model to predict the cis-acting RNA elements that regulate tissue-specific alternative splicing, as well as their functions. The model first treats the splicing index of exons differentially expressed between tissues as the combinatorial effect of multiple regulators located in four nearby regulatory regions, and predicts the cis-acting elements and their relative functional levels. It then estimates the tissue-specific functional levels of the predicted cis-acting elements in each tissue.
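As a rough illustration of the modeling idea, the sketch below computes a splicing index and scores an exon with an additive model over k-mer counts in named regulatory regions. The log2 ratio of gene-normalized exon signals is the form of the splicing index commonly used with exon arrays. The additive form, the use of k-mers to stand for cis-acting elements, the region label, the example sequence, and all function names are assumptions made for illustration and are not taken from the thesis.

```python
import math
from collections import Counter


def splicing_index(exon_a, gene_a, exon_b, gene_b):
    """log2 ratio of gene-normalized exon signal between tissue A and tissue B."""
    return math.log2((exon_a / gene_a) / (exon_b / gene_b))


def kmer_counts(seq, k):
    """Count overlapping k-mers in the sequence of one regulatory region."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def predicted_index(regions, weights, k=6):
    """Assumed additive model: sum of weight * count over (region, k-mer) pairs.

    regions: {region_name: sequence}
    weights: {(region_name, kmer): functional level}; missing pairs count as 0.
    """
    total = 0.0
    for name, seq in regions.items():
        for kmer, n in kmer_counts(seq, k).items():
            total += weights.get((name, kmer), 0.0) * n
    return total


# Toy inputs (hypothetical values, not data from the thesis).
si = splicing_index(exon_a=200.0, gene_a=100.0, exon_b=50.0, gene_b=100.0)  # 2.0
regions = {"upstream_intron": "UGCAUGAAUGCAUG"}   # placeholder region label
weights = {("upstream_intron", "UGCAUG"): 0.5}    # placeholder functional level
score = predicted_index(regions, weights)          # 0.5 * 2 occurrences = 1.0
```

In a real analysis the weights would be fitted to the observed splicing indices, for example by regression; the fitting step is omitted here.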
Application to exon array data from 11 human tissues shows that this model has great potential for predicting tissue-specific regulators.
Precise prediction of cis-acting elements depends on understanding the binding regions of trans-acting elements (the RNA-binding proteins responsible for pre-mRNA splicing). How to integrate biochemical experiments with computational models to analyze protein-RNA binding patterns accurately on a transcriptome-wide scale remains a challenge. To address it, we developed a series of methods for analyzing the binding properties of RNA-binding proteins. Using CLIP-seq data for the SFRS1 protein, we obtained its genome-wide binding regions and their classes, as well as the significant molecular functions of SFRS1-regulated genes. We predicted SFRS1 binding sites across the whole genome and examined their relationship to splice sites. Finally, we investigated whether SFRS1 is involved in genetic diseases.
In CLIP experiments, RNase A/T1 is used to digest transcripts into small RNA fragments, but it is unclear whether RNase A/T1 has sequence specificity. Based on the digestion mechanism of the RNase, we work out how to predict the binding sites of RNA-binding proteins. A statistical method is proposed to evaluate the sequence specificity of the RNase in CLIP experiments. A simulation algorithm is then devised that retrieves the regions most likely bound by trans-acting elements, using the correlation coefficient between simulated and real data. When applied to genomic regions containing one or two binding sites, the algorithm showed good predictive power.
Traditional algorithms and models for predicting RNA-protein binding sites ignore the impact of RNA secondary structure on the recognition of RNA by proteins.
In this study, we incorporate RNA folding features into the prediction of binding sites for RNA-binding proteins, and establish a model based on statistical mechanics to predict RNA-protein binding sites. To search for the model's parameters, we adopt a nested optimization algorithm in which the reference motif is optimized in the outer layer and its corresponding parameters in the inner layer. Applied to CLIP-seq sequences of the SFRS1 protein, the model accurately predicted the optimal sequence consensus and the optimal unpaired probabilities of SFRS1 binding sites. Its predictive power was shown to be better than that of a model that considers sequence information only.
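The abstract does not give the functional form of the statistical-mechanics model, so the following is only a minimal sketch under stated assumptions: a two-state occupancy in which the binding energy grows with the number of mismatches to a reference motif, multiplied by the probability that the site is unpaired. The nested search mirrors the outer (motif) and inner (parameter) layers described above, but uses a plain grid search in place of the optimizer the thesis actually used (the keywords mention quantum particle swarm optimization). All names, parameters (`eps`, `mu`), and toy sequences are hypothetical, and the unpaired probabilities are taken as given inputs rather than computed by an RNA folding tool.

```python
import math
from itertools import product


def mismatches(site, motif):
    """Number of positions where the site differs from the reference motif."""
    return sum(a != b for a, b in zip(site, motif))


def binding_prob(site, unpaired, motif, eps, mu):
    """Assumed occupancy: unpaired probability times a two-state sequence term.

    eps: energy penalty per mismatch; mu: offset; both in units of kT.
    """
    energy = eps * mismatches(site, motif)
    return unpaired / (1.0 + math.exp(energy - mu))


def log_likelihood(data, motif, eps, mu):
    """data: list of (site, unpaired_probability, bound_flag) tuples."""
    ll = 0.0
    for site, unpaired, bound in data:
        p = binding_prob(site, unpaired, motif, eps, mu)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep the logarithm finite
        ll += math.log(p if bound else 1.0 - p)
    return ll


def nested_fit(data, motifs, eps_grid, mu_grid):
    """Outer layer: candidate reference motifs. Inner layer: grid over (eps, mu)."""
    best = None
    for motif in motifs:
        # Inner layer: best parameters for this motif.
        inner = max(
            ((log_likelihood(data, motif, e, m), e, m)
             for e, m in product(eps_grid, mu_grid)),
            key=lambda t: t[0],
        )
        if best is None or inner[0] > best[0]:
            best = (inner[0], motif, inner[1], inner[2])
    return best  # (log-likelihood, motif, eps, mu)


# Toy data (hypothetical, not results from the thesis).
data = [
    ("GAAGAA", 0.9, True),
    ("GAAGAA", 0.8, True),
    ("CCCCCC", 0.9, False),
    ("UUUUUU", 0.5, False),
]
best = nested_fit(data, ["GAAGAA", "CCCCCC"], [0.5, 1.0, 2.0], [0.0, 2.0])
```

Setting every unpaired probability to 1.0 reduces the sketch to a sequence-only model, which is the kind of baseline the abstract compares against.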
Keywords/Search Tags: pre-mRNA alternative splicing, cis-acting element, exon array, high-throughput sequencing, protein-RNA binding sites, quantum particle swarm optimization algorithm