
The Analysis Of Network Topology Structure Of Biological Data

Posted on: 2015-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
GTID: 2308330473952769
Subject: Biophysics
Abstract/Summary:
The high dimensionality and complexity of data are among the major challenges facing current biological data analysis. This thesis summarizes the advantages and main problems of current methods for analyzing the network topology of biological data, and explores the usefulness of diffusion wavelet analysis for revealing that topology. By analyzing high-throughput microarray gene expression data, we found that this method provides deeper information than other typical analysis techniques.

Gene chip technology is widely used, both for a variety of basic problems in biological science and for the diagnosis and classification of cancer and other diseases. Microarray data is therefore one of the main sources of large-scale biological data. We take it as our research object: we use the diffusion wavelet method to analyze the biological network topology constructed from microarray data, and apply the analysis to cancer subtype classification. However, microarray data sets are so large that the data must be screened sensibly during preprocessing.

First, the relationship between normal and tumor samples is established with the DSGA method, followed by dimension reduction through principal component analysis and effective screening. This step selects the feature data that best meets our requirements and improves processing efficiency. The result is then recorded as the input file for the next step, diffusion wavelet analysis. Further, we propose a diffusion model based on a multi-scale embedding method, and introduce an effective scaling function and wavelet transform. At a given scale, we can determine the type we need, so that the most valuable information is kept and worthless data is discarded. Finally, mapping yields a simple image that vividly presents the topology of the data network.

By processing multiple sets of breast cancer and gastric cancer microarray data from different databases, we found that gastric cancer can be classified into 3 subtypes. Analysis of the deeper genetic information of the three subtypes suggests that they can be summarized as proliferation, metabolism, and interstitial types. In addition, this study found that breast cancer can be classified into 10 subtypes by improving the accuracy parameter. This suggests that breast cancer may comprise at least 10 subtypes, rather than the 4 previously recognized. Consequently, we provide a new method "tailored" to the future treatment of different cancer subtypes.

Although this method can be applied to a wide range of high-throughput data types, this thesis mainly applies it to gene chip microarray data. It provides a new method and theoretical basis that can be extended to magnetic resonance data analysis of brain networks and to the further development of life sciences research.
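
The preprocessing and multi-scale diffusion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the DSGA step is omitted, the data is synthetic, and plain dyadic powers of a Markov diffusion operator stand in for the full diffusion wavelet construction (which additionally compresses each power into scaling and wavelet bases). The function names, the Gaussian kernel, and the median bandwidth heuristic are all assumptions made for illustration.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project samples (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered matrix; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def diffusion_operator(Y, sigma=None):
    """Row-stochastic Markov matrix built from a Gaussian kernel on samples."""
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        # Heuristic bandwidth (an assumption, not taken from the thesis).
        sigma = np.sqrt(np.median(sq[sq > 0]))
    K = np.exp(-sq / (2 * sigma ** 2))
    return K / K.sum(axis=1, keepdims=True)

def dyadic_powers(T, n_scales):
    """Return [T, T^2, T^4, ...]: diffusion at dyadic time scales."""
    powers = [T]
    for _ in range(n_scales - 1):
        powers.append(powers[-1] @ powers[-1])
    return powers

# Synthetic stand-in for expression data: 2 groups of 20 samples, 50 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 50)), rng.normal(3, 1, (20, 50))])

Y = pca_reduce(X, 5)          # dimension reduction
T = diffusion_operator(Y)     # one-step diffusion on the sample graph
scales = dyadic_powers(T, 4)  # coarser views of the same network
```

Each entry of `scales` describes the sample network at a coarser resolution: higher powers smooth out fine detail, which is the sense in which a chosen scale keeps the dominant structure and discards the rest.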
Keywords/Search Tags:Biological Data, Topology Structure, Diffusion Wavelet, Multi-scale Analysis, Gene Chip