
Study On Some Key Problems In SiRNA Design

Posted on: 2014-03-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Chang
Full Text: PDF
GTID: 1260330425465884
Subject: Bioinformatics
Abstract/Summary:
RNA interference degrades homologous mRNA inside the cell by introducing short double-stranded RNA, and can thereby inhibit expression of the target mRNA. An effective approach to RNA interference is the design of small interfering RNA (siRNA). The quality of the siRNA directly influences the effect of RNA interference, so an effective siRNA design method is crucial. Designing siRNA by biological experiment requires considerable manpower and resources, is costly, takes a long time, and is inefficient. Designing siRNA by bioinformatics and computer-aided means has therefore become an effective way of achieving RNA interference.

There are problems in current siRNA design rules. At present the design rules are based on sequence features and do not consider the secondary structure of the target, so the efficiency of the designed siRNA is low.

There are also problems in predicting the efficiency of candidate siRNAs. At present, prediction is based on siRNA sequence features and its accuracy is low, with a correlation coefficient of around 0.63. This leads to an excessive number of candidate siRNA sequences, which creates difficulties for biological experiments. How to improve the accuracy of siRNA efficiency prediction is an urgent problem.

The H1N1 influenza virus is an RNA virus. It is highly infectious, spreads quickly, and poses a serious threat to human health. At present the main methods for preventing and treating flu are vaccination and medication. A vaccine can only be used for prevention and only against matched strains; when a new flu breaks out, a corresponding vaccine cannot be obtained in time, and its safety cannot be guaranteed.
Anti-influenza drugs are mainly M2 ion channel blockers and neuraminidase inhibitors. The former quickly give rise to drug-resistant strains, so their clinical application is limited. The latter are expensive, beyond the means of ordinary people, and production capacity is limited, so supply would be restricted in a large-scale epidemic. Moreover, as these drugs come into wide use, drug resistance is steadily spreading, and the drugs have some side effects on the central nervous system and the digestive system. Influenza A virus poses a serious threat to human health, and traditional methods cannot control a new influenza virus in a timely and effective way. Researchers should therefore consider the various aspects of the influenza virus infection mechanism and look for effective methods of prevention and treatment.

Analyzing the influenza A (H1N1) virus with bioinformatics methods and using RNA interference to inhibit expression of viral genes can control the spread of the virus. Compared with studying the H1N1 influenza virus by traditional experimental methods, this reduces cost and shortens the research cycle. RNA interference has become an effective instrument for inhibiting influenza A virus. Researchers have designed siRNAs targeting the H1N1 influenza virus with traditional siRNA design methods to inhibit expression of its genes, and have had some success. However, current siRNA design methods are mainly based on sequence features and do not consider the influence of target structure on siRNA interference efficacy, so the interference efficacy of the designed siRNAs is low.

The secondary structure of the target mRNA is related to siRNA inhibitory efficacy, so considering structural features of the target mRNA when designing siRNA may improve accuracy.
This study proposes an siRNA design algorithm that combines sequence features and structure features, and applies it to design siRNAs for the 2009 H1N1 influenza virus and the 2008 seasonal H1N1 influenza virus.

Every H1N1 influenza virus strain contains 8 gene fragments, namely PB2, PB1, PA, HA, NP, NA, MP, and NS. The HA and NA genes are prone to mutation, while the NP, MP, PA, and PB1 genes are relatively conserved, so the target genes for RNA interference are mainly NP, MP, PA, and PB1. The PA fragment has polymerase activity, is involved in the entire process of transcription and replication of the virus, and plays the role of a kinase or helicase. It is therefore a good target for the prevention and treatment of H1N1 flu: designing efficient siRNA to inhibit expression of the PA gene can control the spread of the H1N1 influenza virus. In this study, the sequences and structures of the PA fragments of the 2009 H1N1 influenza virus and the 2008 seasonal influenza virus are compared and analyzed. Significant differences are found between them, not only in sequence features but also in RNA secondary structure, which lead to different biological properties. When designing siRNA for the H1N1 influenza virus, the proposed algorithm considers structure features as well as sequence features: it uses a structure coefficient to evaluate the secondary structure of the target, selects the better candidate targets, and then designs the corresponding siRNA sequence for each target. Using this improved siRNA design algorithm, siRNAs are designed for the 2009 H1N1 influenza virus and the 2008 seasonal H1N1 influenza virus respectively. A target is found that differs by only one base between the two viruses, which lays the foundation for finding a common target.

If researchers can find features that are closely related to siRNA interference efficacy, the accuracy of prediction can be improved.
This study proposes considering, in addition to siRNA features, mRNA global features and local features near the siRNA binding site when predicting siRNA efficacy. The local region comprises 20 nucleotides on each side of the binding sequence together with the 21 nt of the siRNA binding region, 61 nt in all, and is referred to as the neighboring nucleotides. Qualitative analysis shows that the higher the siRNA interference efficacy, the lower the mRNA GC content, mRNA stem ratio, neighboring GC content, and neighboring stem ratio. Qualitative analysis can only show the tendency and cannot quantify it, so linear regression analysis is then performed. It finds a strong correlation between siRNA inhibitory efficacy and the average of the mRNA GC content, mRNA stem ratio, neighboring GC content, and neighboring stem ratio, and the P-value is highly significant. The qualitative and quantitative analyses together show a strong correlation between mRNA GC content, mRNA secondary structure features, and RNA interference efficacy, at both the mRNA global level and the neighboring location. Feature selection shows that some mRNA features and neighboring features are important, and that the important mRNA features far outnumber the important siRNA features. mRNA global features and neighboring local features should therefore be considered when predicting siRNA interference efficacy.

Based on the above analysis, this study proposes an siRNA efficacy prediction model based on random forest that uses siRNA features, mRNA features, and features near the siRNA binding site. The correlation coefficient under 10-fold cross-validation increased from 0.63 to 0.7, which confirms that considering mRNA global features and neighboring local features can improve accuracy. Therefore, when designing siRNA, the influence of mRNA global features and local features near the siRNA binding site on interference efficacy should be considered in addition to siRNA features.
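The global and neighboring features described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: the stem ratio is taken to be the fraction of paired nucleotides in a predicted secondary structure, the structure is supplied as a dot-bracket string (the dissertation's structure prediction step is not reproduced), and the 61 nt window is clipped at the ends of the mRNA. All function names and the toy input are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def stem_ratio(dot_bracket: str) -> float:
    """Return the fraction of paired positions in a dot-bracket string.

    Assumption: '(' and ')' mark paired (stem) bases, '.' marks unpaired ones.
    """
    if not dot_bracket:
        raise ValueError("structure must not be empty")
    return sum(symbol in "()" for symbol in dot_bracket) / len(dot_bracket)


def neighboring_window(mrna_length: int, site_start: int,
                       site_length: int = 21, flank: int = 20) -> tuple[int, int]:
    """Return (start, end) of the binding site plus `flank` nt on each side.

    Indices are 0-based and half-open. Clipping at the mRNA ends is an
    assumption; the abstract does not say how sites near the ends are handled.
    """
    start = max(0, site_start - flank)
    end = min(mrna_length, site_start + site_length + flank)
    return start, end


def target_features(mrna: str, structure: str, site_start: int) -> dict[str, float]:
    """Compute the four global and neighboring features for one binding site."""
    if len(mrna) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    start, end = neighboring_window(len(mrna), site_start)
    return {
        "mrna_gc": gc_content(mrna),
        "mrna_stem_ratio": stem_ratio(structure),
        "neighbor_gc": gc_content(mrna[start:end]),
        "neighbor_stem_ratio": stem_ratio(structure[start:end]),
    }


# Toy input, not real data: a 100 nt sequence with a made-up structure.
mrna = "AUGC" * 25
structure = "(" * 40 + "." * 20 + ")" * 40
features = target_features(mrna, structure, site_start=30)
print(features)
```

A feature dictionary of this kind, computed per candidate binding site, is the sort of input that could be passed alongside siRNA sequence features to a regression model.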
The study suggests that, when designing effective siRNAs against mammalian targets, mRNAs with lower GC content and fewer stem secondary structures (in other words, more loop secondary structures) at both the global level and the local flanking regions of the siRNA binding site are preferred. An mRNA GC content and neighboring GC content below 50% are preferred, and an mRNA stem ratio and neighboring stem ratio below 0.6 are preferred. The study provides a new idea for siRNA design and guidance for designing effective siRNA. In addition, the results may also be helpful in understanding the binding efficacy between microRNA and mRNA, because siRNA binding to mRNA and microRNA binding to mRNA have some similarities.

In summary, this paper makes two innovations:

1. This study proposes a multi-feature fusion siRNA design algorithm targeting the influenza virus. According to the theory and practice of pattern recognition, multi-feature fusion is an effective means of improving recognition accuracy. Designing siRNA that targets the influenza virus by fusing multiple features (sequence features and secondary structure) is one way to improve accuracy.

2. This study proposes an siRNA efficacy prediction model based on random forest. When predicting siRNA efficacy, it considers the influence of mRNA global features and local features near the siRNA binding site in addition to siRNA features. The correlation coefficient under 10-fold cross-validation increased from 0.63 to 0.7, which confirms that considering mRNA global features and neighboring local features can improve accuracy.

In future research, we will consider other features related to siRNA inhibitory efficacy, mainly protein features. Protein binding can influence siRNA inhibitory efficacy: if proteins are already bound to the target, it is difficult for the siRNA to bind to it.
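The preference thresholds stated in the abstract (GC content below 50% and stem ratio below 0.6, at both the global and the neighboring level) can be expressed as a simple filter. The sketch below is an illustration only: the class and function names are hypothetical, the feature values are made up, and treating the four conditions as a strict conjunction is an assumption, since the abstract describes them as preferences and not as hard cut-offs.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract.
GC_MAX = 0.50    # GC content should be below 50%
STEM_MAX = 0.60  # stem ratio should be below 0.6


@dataclass(frozen=True)
class TargetFeatures:
    """Global and neighboring features of one candidate binding site."""
    mrna_gc: float
    mrna_stem_ratio: float
    neighbor_gc: float
    neighbor_stem_ratio: float


def meets_preferences(features: TargetFeatures) -> bool:
    """Return True if all four features fall below the stated thresholds.

    Assumption: the preferences are combined with a logical AND.
    """
    return (
        features.mrna_gc < GC_MAX
        and features.neighbor_gc < GC_MAX
        and features.mrna_stem_ratio < STEM_MAX
        and features.neighbor_stem_ratio < STEM_MAX
    )


# Hypothetical candidates, not measured data.
candidates = {
    "site_a": TargetFeatures(0.45, 0.52, 0.41, 0.48),
    "site_b": TargetFeatures(0.45, 0.52, 0.58, 0.48),  # neighboring GC too high
}
preferred = [name for name, feats in candidates.items() if meets_preferences(feats)]
print(preferred)
```

In practice such a filter would only narrow the candidate list before efficacy prediction; it does not replace the prediction model.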
Keywords/Search Tags: Bioinformatics, H1N1 influenza virus, RNA interference, Small interfering RNA, siRNA design, Structure coefficient