CePa: A New Method To Identify Significant Gene Sets And The Construction Of An Online Data Analysis Platform

Posted on: 2013-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: K M Cao
GTID: 2230330371487907
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
The development of microarray technology has brought a revolution to biotechnology, moving cell biology research from single-gene analysis to the whole-genome level. However, the rapid growth of data has left biology researchers overwhelmed. In recent years, a variety of methods have been applied to gene chip data, from traditional simple ranking to modern artificial intelligence, all with the aim of identifying the patterns hidden in these large data sets and revealing previously unknown properties of organisms. Like the microarray experiment itself, microarray data analysis is a systematic process that needs careful design and execution to yield reliable results. Gene expression microarray data analysis typically includes preprocessing, identification of differentially expressed genes, clustering, gene set analysis, and transcriptional regulation and gene interaction network analysis. Gene set analysis extracts gene sets directly from chip data to reflect the function of biological systems, which makes it useful for both biology and microarray technology.

We present two pieces of work: (1) CePa, a new gene set analysis method that finds significant gene sets using the centrality of genes in the topological network of a pathway, and (2) the CePa online analysis platform, an online platform for gene set analysis. Gene set analysis based on over-representation analysis (ORA) is easy to use and widely applied in experimental biology, but the reliability of its results is not yet satisfactory. To address this problem, this work extends the ORA method by introducing network structure into the pathway-level statistic through network centrality measures. The resulting method, CePa, is used to find significantly altered biological pathways.
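To make the contrast between plain ORA and a centrality-weighted extension concrete, the following is a minimal Python sketch, not the thesis implementation. It assumes that ORA is computed as a hypergeometric tail probability, that degree is used as the centrality measure, and that significance of the weighted score is estimated by random resampling of the differential gene list. All function and parameter names are hypothetical, and the abstract does not specify any of these details.

```python
import random
from math import comb


def ora_p_value(n_universe, n_pathway, n_diff, n_overlap):
    """Plain ORA: P(X >= n_overlap) under a hypergeometric null."""
    upper = min(n_pathway, n_diff)
    total = comb(n_universe, n_diff)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_diff - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def degree_centrality(edges):
    """Assumed centrality measure: number of edges touching each gene."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


def weighted_score(pathway_genes, diff_genes, weights):
    """Pathway-level statistic: summed centrality of differential genes."""
    return sum(weights.get(g, 0) for g in pathway_genes if g in diff_genes)


def centrality_p_value(universe, pathway_genes, diff_genes, weights,
                       n_sim=2000, seed=0):
    """Assumed null model: random differential gene lists of the same size."""
    rng = random.Random(seed)
    observed = weighted_score(pathway_genes, diff_genes, weights)
    genes = sorted(universe)  # fixed order keeps the sampling reproducible
    hits = 0
    for _ in range(n_sim):
        simulated = set(rng.sample(genes, len(diff_genes)))
        if weighted_score(pathway_genes, simulated, weights) >= observed:
            hits += 1
    # The +1 correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)
```

In this sketch, ORA treats every gene in the pathway equally, while the weighted score gives more influence to differential genes that sit at highly connected positions in the pathway network.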
Analysis of experimental microarray data shows that CePa is more effective than the ORA method at finding biologically meaningful pathways.

We also provide an online CePa analysis platform. Users only need to fill out a simple form to run an analysis of their microarray data. The platform is built from three modules: a client, a web server, and a computing server. The client handles user data input and format verification. The web server checks that the submitted data are valid and passes valid data to the computing server as tasks. The computing server manages user tasks in a queue and relies on its first-in, first-out order to process them fairly. The computing server also uses a parallel computing strategy: each task is computed by multiple processes running simultaneously, which greatly improves computation speed. Finally, the results are drawn as images using the plotting capabilities of the R language, giving users a convenient view of the output.
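The computing server described above can be sketched roughly as follows in Python. This is an illustrative sketch under stated assumptions, not the platform's actual code: the task payload fields (`n_sim`, `observed`, `seed`), the placeholder work function `count_hits`, and all other names are hypothetical. It only shows how a first-in, first-out queue can be combined with splitting one task across several worker processes.

```python
import random
from collections import deque
from concurrent.futures import ProcessPoolExecutor


def split_evenly(total, parts):
    """Divide `total` units of work into `parts` near-equal chunks."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def count_hits(job):
    """Placeholder unit of work: count random draws >= an observed value."""
    n_sim, observed, seed = job
    rng = random.Random(seed)
    return sum(1 for _ in range(n_sim) if rng.random() >= observed)


class TaskQueue:
    """First-in, first-out queue of (task_id, payload) pairs."""

    def __init__(self):
        self._tasks = deque()

    def submit(self, task_id, payload):
        self._tasks.append((task_id, payload))

    def next_task(self):
        # The oldest task is always served first.
        return self._tasks.popleft() if self._tasks else None

    def __len__(self):
        return len(self._tasks)


def run_task(payload, n_workers=4, executor_cls=ProcessPoolExecutor):
    """Split one task into chunks and compute them in parallel workers."""
    chunks = [c for c in split_evenly(payload["n_sim"], n_workers) if c > 0]
    jobs = [(c, payload["observed"], payload["seed"] + i)
            for i, c in enumerate(chunks)]
    with executor_cls(max_workers=n_workers) as pool:
        hits = sum(pool.map(count_hits, jobs))
    return (hits + 1) / (payload["n_sim"] + 1)


def drain(queue, n_workers=4, executor_cls=ProcessPoolExecutor):
    """Process queued tasks one at a time, in submission order."""
    results = []
    while (item := queue.next_task()) is not None:
        task_id, payload = item
        results.append((task_id, run_task(payload, n_workers, executor_cls)))
    return results
```

A caller would submit payloads such as `{"n_sim": 1000, "observed": 0.9, "seed": 0}` with `submit` and then call `drain`. When using the default process pool from a script, the call should sit under an `if __name__ == "__main__":` guard. Tasks are handled strictly in arrival order, and parallelism is applied within each task, not across tasks, which is one way to read the fairness claim in the abstract.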
Keywords/Search Tags: microarray, data analysis, gene set analysis, CePa, CePa online analysis platform