Microarray Data Analysis And Applications In Breast Cancer Data Analysis

Posted on: 2007-08-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X S Lv
Full Text: PDF
GTID: 1118360215995347
Subject: Control Science and Engineering
Abstract/Summary:
Microarray technology is an experimental technique developed in the 1990s. Unlike traditional biological experiments, it can measure the expression levels of thousands of genes simultaneously. Because the datasets generated by microarray experiments are too large for traditional biological data analysis methods, many new analysis methods have been proposed as the technology has developed. In this dissertation, we focus on existing problems in probe mapping, outlier detection, feature selection, and reconstruction of gene regulatory networks, and we address them either by proposing new methods or through systematic comparison and investigation. After studying these methods, we demonstrate an application of microarrays by analyzing a breast cancer microarray dataset.

For the probe mapping problem, we first compare the results of several data analysis methods under different probe-set definitions. The comparison shows that changing the definition strongly affects the results of gene-based and gene-set-based analysis but has little effect on sample-based analysis. For outlier detection, we propose a recursive outlier detection strategy. This simple strategy does not depend on any specific classification algorithm and achieves satisfactory results on simulated and real datasets. Since wrapper methods cannot calculate the significance of each feature when performing feature selection, we propose a statistical framework that evaluates the significance of each feature from its ranking. This framework works well on the simulated datasets. For gene regulatory network reconstruction, we study how Bayesian network learning parameters affect the reconstruction results. The comparisons show that prior knowledge and the initial structure strongly affect the reconstruction results, so different parameters should be selected for different applications.

In the breast cancer data analysis, we study the relationship between gene expression data and each of several clinical statuses of breast cancer patients. The clinical statuses include hormone receptors, Bloom-Richardson grade, lymph node metastasis, LVI, and tumor size. The results show that biological features such as hormone receptors and Bloom-Richardson grade can be predicted well from gene expression data, but anatomical features such as lymph node metastasis, LVI, and tumor size cannot. Since lymph node metastasis correlates well with patient outcome, we study its relationship with the other clinical statuses and the effects of correlated features on lymph node metastasis.
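
The abstract does not spell out the recursive outlier detection strategy, so the following is only an illustrative sketch of one way such a classifier-agnostic scheme could look, not the dissertation's algorithm. It assumes that a sample counts as an outlier when a leave-one-out classifier trained on the remaining samples disagrees with its label, and that flagged samples are removed before the next round. The names `nearest_centroid`, `recursive_outliers`, and `max_rounds` are hypothetical, and the nearest-centroid classifier is just a stand-in that any other training function could replace.

```python
def nearest_centroid(train_x, train_y):
    """Stand-in classifier (an assumption): predict the label of the closest class mean."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def predict(x):
        # Pick the label whose centroid has the smallest squared Euclidean distance.
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
        )

    return predict


def recursive_outliers(xs, ys, train=nearest_centroid, max_rounds=10):
    """Return indices of samples flagged as outliers.

    Each round runs leave-one-out prediction on the samples not yet flagged;
    samples whose predicted label differs from their given label are flagged
    and excluded from later rounds. Stops when a round flags nothing.
    """
    outliers = set()
    for _ in range(max_rounds):
        kept = [i for i in range(len(xs)) if i not in outliers]
        if len(kept) < 2:
            break  # not enough samples left to train on
        flagged = set()
        for i in kept:
            rest = [j for j in kept if j != i]
            predict = train([xs[j] for j in rest], [ys[j] for j in rest])
            if predict(xs[i]) != ys[i]:
                flagged.add(i)
        if not flagged:
            break
        outliers |= flagged
    return outliers
```

Because the classifier is passed in as `train`, the loop itself makes no assumption about the classification algorithm, which mirrors the property the abstract claims for the strategy. Calling `recursive_outliers(xs, ys)` on a list of feature vectors and integer labels returns the set of flagged sample indices.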
Keywords/Search Tags: Microarray, data analysis, probe mapping, breast cancer