Since RNA was identified as a key intermediate between genome and proteome, identifying transcripts and quantifying gene expression have been core activities in molecular biology. High-throughput RNA sequencing (RNA-Sequencing, RNA-seq) has become an important tool in life science research. However, high-throughput sequencing data are high-dimensional and noisy, which makes integrated analysis demanding and difficult for researchers without a bioinformatics background. At present, integrated transcriptome analysis platforms are few, and those available are complex to operate, produce low-quality visualizations and lack interactivity, which greatly limits the application of transcriptomics in life science, medicine and other fields.

To address this, this study designed and built Castor Cloud, a high-throughput data analysis platform, using R, Python, Perl, R Shiny, HTML, CSS, Markdown, JavaScript and Bootstrap in Linux and Windows environments, and developed both a local application and a web server site. The platform provides a complete analysis workflow starting from raw sequencing output: quality control, reference genome alignment, gene expression quantification, Pearson correlation analysis, principal component analysis, differential expression analysis, KEGG annotation and enrichment analysis, GO annotation and enrichment analysis, KEGG pathway mapping, weighted gene co-expression network analysis and customized data visualization. It integrates and encapsulates mainstream algorithms and tools, offers a variety of analysis strategies, and renders publication-quality graphics in real time as parameters are adjusted, giving it significant advantages over general commercial platforms.

Castor plants of the varieties Lvbi No. 1 and Zibi No. 5 were sprayed with the herbicide sulfosulfuron at normal concentration, and samples were collected by tissue and sequenced during the recovery period after injury. Analysis, gene mining and visualization of the raw sequencing data with the platform showed that the RcGSTU8 and P450 genes were involved in the regulation of herbicide detoxification and in anthocyanin transport and accumulation, and regulated more complex detoxification mechanisms in vacuoles. At the same time, the co-expressed RcPYL4 gene cluster regulated lateral root growth in castor, enhancing water and nutrient uptake to cope with herbicide damage. Metabolic pathways such as those of phenylalanine and cysteine may mediate enhanced systemic resistance in castor while redirecting metabolic flux to keep the pathways in dynamic balance, resist stress and maintain normal growth and development. qRT-PCR of candidate genes verified the reliability and accuracy of the platform's analysis.

In summary, this study established an RNA-seq high-throughput sequencing analysis platform featuring multithreaded operation, highly integrated encapsulation and multi-strategy analysis. It gives researchers customizable, easy-to-use and fast data mining functions, together with multi-format, high-quality and interactive graphics. It greatly reduces the learning and operating costs of high-throughput data analysis, strengthens the weakest supporting link in the high-throughput sequencing workflow, and provides an effective tool and practical basis for exploring biological processes at the genetic level and revealing mechanisms of action. In addition, the practical application of the platform identified key genes, offering a new perspective on the herbicide detoxification mechanism in castor and theoretical support for further research and genetic breeding.
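As an illustration of one step in the workflow above, the Pearson correlation analysis between samples can be sketched as follows. This is a minimal sketch and not the platform's own code: the sample names, expression values, the `pseudocount` parameter and all function names are hypothetical, and a log2 transform before correlation is assumed here as a common convention rather than taken from the study.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Log2-transform expression values; the pseudocount (assumed) avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def sample_correlation_matrix(expression):
    """Pairwise Pearson correlation of log2 expression between samples.

    `expression` maps a sample name to its per-gene expression values,
    with genes in the same order for every sample.
    """
    names = list(expression)
    logged = {name: log2_transform(expression[name]) for name in names}
    return {a: {b: pearson(logged[a], logged[b]) for b in names} for a in names}


# Hypothetical expression values for five genes in three samples.
expression = {
    "leaf_rep1": [10.0, 200.0, 35.0, 0.0, 80.0],
    "leaf_rep2": [12.0, 190.0, 30.0, 1.0, 85.0],
    "root_rep1": [150.0, 20.0, 5.0, 60.0, 8.0],
}
matrix = sample_correlation_matrix(expression)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

In such a matrix, replicates of the same tissue are expected to correlate more strongly with each other than with samples from a different tissue, which is why this step is often used as a quick check of sample quality before differential expression analysis.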