Predicting Inhibition Efficacy of Small Interference RNA and Researching Epigenetic Regulation of Long Noncoding RNA

Posted on: 2015-03-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 1260330428482698
Subject: Biophysics
Abstract/Summary:
The ENCODE project has reported evidence that the majority of the human genome can be transcribed. Protein-coding transcripts make up only a small fraction of this output; most transcripts are noncoding, such as rRNA, tRNA, siRNA, miRNA, piRNA and lncRNA. Together they form a complex regulatory network that precisely regulates gene expression. Observations of these noncoding RNAs have prompted a redefinition of the concept of a "gene". In this study, we set out to discover new factors that affect siRNA inhibition efficacy and to analyse the correlation between epigenetic regulation and lncRNA transcription.

RNA interference (RNAi) is a powerful tool for studying gene function by inhibiting gene expression. As a new gene therapeutic strategy, RNAi has been widely used in targeted drug design. Because large variation in the efficiency of siRNAs directed at different sites on the same target is commonly observed, selecting the small interference RNA (siRNA) candidates that maximize knockdown of the given gene is the critical step in a successful RNAi experiment. Although various computational approaches have been attempted, the design of efficient siRNA candidates is still far from satisfactory. The limited accuracy in predicting siRNA potency may derive from a poor understanding of the silencing mechanism. Previous studies have shown that the nucleotide composition and thermodynamic stability of the siRNA duplex can affect siRNA potency through an upstream effect on RNA-induced silencing complex (RISC) assembly. The influence of target accessibility, which acts at the downstream step, is still debated. Reynolds et al. observed in their siRNA knockdown experiments that properties of the target mRNA did not affect knockdown and that efficacy seemed to depend solely on properties of the siRNA. However, other studies have indicated that secondary structure and thermodynamic properties of the siRNA are also important determinants of functionality.
At the same time, the importance of target secondary structure and accessibility has been supported by compelling evidence based on experimentally assessed accessibility. RNAi is a complex process involving several proteins, and some details of the RNAi pathway, such as how RISC finds the target mRNA, are still unclear. Thus, identifying the factors that influence RNAi efficacy and the rules for designing potent siRNAs is an important task.

Recently, it has been observed that sequence context outside the target region influences the effectiveness of miRNA. Sun et al. demonstrated that some AU-rich motifs, such as "AUUUA", the core sequence of AREs (AU-rich elements), enhance miRNA function when located upstream of the distal miRNA-binding site. Kertesz et al. proposed a modified model of miRNA target recognition that includes a moderately sized flank. No comparable research exists yet for siRNA design. In the current study, we set out to discover factors hidden in the context regions outside the target site, using a publicly available dataset.

Our analysis shows that the local AU content flanking the target site differs considerably between efficient and inefficient siRNAs. Further analysis using the binomial distribution identifies some AU-rich hexamers that are positively correlated with siRNA efficiency. The core ARE sequence "AUUUA" is also present in the positively correlated hexamers. Taking all of these factors together, we developed a novel two-stage method to predict active siRNAs. The algorithm combines a feature selection technique with Random Forest and Support vector machine. The Pearson correlation coefficient for regression is as high as 0.721, compared to 0.671, 0.668, 0.680 and 0.645 for Biopredsi, i-score, ThermoComposition21 and DSIR, respectively. Our analysis also revealed that target accessibility is one of the most important features.
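The flanking-sequence analysis described above (local AU content around the target site, and a binomial test for hexamer enrichment) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the dissertation's implementation: the function names, the 19-nt site length, the 50-nt flank size, and the particular one-sided binomial test (with the background probability estimated from all flanks) are hypothetical choices made for illustration.

```python
from math import comb


def au_content(seq):
    """Fraction of A/U bases in seq (T is treated as U); 0.0 if seq is empty."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        return 0.0
    return sum(base in "AU" for base in seq) / len(seq)


def flanks(mrna, start, site_len=19, flank=50):
    """Return (upstream, downstream) flanks of a target site.

    `start` is the 0-based index of the first target base. The defaults
    site_len=19 and flank=50 are assumptions, not values from the source.
    """
    upstream = mrna[max(0, start - flank):start]
    downstream = mrna[start + site_len:start + site_len + flank]
    return upstream, downstream


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def hexamer_enrichment(hexamer, efficient_flanks, all_flanks):
    """One-sided binomial p-value for over-representation of `hexamer`
    in the flanks of efficient siRNAs.

    Assumed test design: the background probability is the fraction of
    all flanks that contain the hexamer at least once.
    """
    if not all_flanks:
        raise ValueError("all_flanks must not be empty")
    motif = hexamer.upper().replace("T", "U")

    def contains(seq):
        return motif in seq.upper().replace("T", "U")

    n = len(efficient_flanks)
    k = sum(contains(s) for s in efficient_flanks)
    p = sum(contains(s) for s in all_flanks) / len(all_flanks)
    return binomial_tail(k, n, p)


if __name__ == "__main__":
    # Made-up toy sequences, for illustration only.
    efficient = ["GCAUUUAAGC", "AUUUAAUUUA"]
    inefficient = ["GCGCGCGCGC", "CCGGCCGGCC"]
    print(hexamer_enrichment("AUUUAA", efficient, efficient + inefficient))
```

A small p-value for a hexamer would flag it as a candidate context feature; in practice a correction for testing many hexamers would also be needed.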
Furthermore, we found that predictive accuracy improved markedly when context features outside the target site were included, suggesting that siRNA-target interaction requires appropriate sequence context not only in the target site but also in a broad region flanking it.

As a potentially new and crucial layer of biological regulation, long noncoding RNAs (lncRNAs) have gained widespread attention in recent years, and more and more functional and mechanistic themes have begun to emerge. For instance, lncRNAs play a key role in genomic imprinting and X-chromosome inactivation. While lncRNAs are a proven component of epigenetic modulation of gene expression, the epigenetic regulation of lncRNAs themselves remains poorly understood. We analyse eleven types of histone modification (H3K4me1, H3K4me2, H3K4me3, H3K9me1, H3K9me3, H3K27me3, H3K36me3, H3K79me2, H4K20me1, H3K9ac, H3K27ac), one histone variant (H2A.Z) and DNase I hypersensitive sites. We observe that the histone marks associated with active transcription (H3K9ac, H3K27ac, H3K79me2, H3K4me2, H3K4me3 and H2A.Z) and the repressive histone marks H3K27me3, H3K9me3 and H4K20me1 have similar distribution patterns around the TSS.
Keywords/Search Tags: siRNA, Random Forest, Support vector machine, Flanking sequence, Long noncoding RNA, Histone modification