The Data Analysis And Mining Of Breast Cancer Microarray

Posted on: 2012-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z Jiang
Full Text: PDF
GTID: 2154330335487172
Subject: Biomedical IT
Abstract/Summary:
Objective: Breast cancer has a high morbidity rate and seriously threatens women's health. Understanding its pathogenesis and mechanism at the gene level plays an important role in cancer research. Gene chip technology can measure gene expression levels in tumor tissues and adjacent normal tissues automatically, at large scale, quickly, and easily. Comparing the experimental data with methods from mathematics and computer science can be expected to identify differentially expressed genes and their related genes. A large amount of microarray experimental data has now been published on the Internet and can be shared freely. This research analyzes and mines microarray experimental data downloaded from the Internet in order to verify the feasibility of the mining methods and to identify differentially expressed genes, their related genes, and switch genes associated with breast cancer. The results can provide candidate genes for further study and lay the foundation for establishing a gene regulation network.

Method: In this study, the Significance Analysis of Microarrays (SAM) method and the Top-Scoring Pairs (TSP) method are used together to find differentially expressed genes between tumor tissues and adjacent normal tissues; the association rule method and the collaborative filtering method from data mining are used to select co-regulated related genes and switch genes that follow common or opposite patterns of variation. First, the raw data of microarray experiments are retrieved from the Internet. Second, the necessary preprocessing of the data is completed. Then the data are analyzed and mined with the SAM, TSP, association rule, and collaborative filtering methods to find differentially expressed genes and their related genes.

Result: A number of differentially expressed genes and related genes were selected by analyzing and mining the breast cancer microarray experiments with the above methods. Some of them have been reported in other papers to be strongly correlated with breast cancer. Several genes with similar or opposite patterns of variation, or with switch effects, were found, and some of them can be confirmed to be related genes in the biological sense.

Conclusion: The initial screening of differentially expressed genes by the SAM and TSP methods is effective, because it discovers many differentially expressed genes while maintaining a low false discovery rate. Using association analysis and collaborative filtering for a preliminary selection of related genes is feasible, and the genes found do have co-regulation effects, so they share common patterns of variation in expression level. The differentially expressed genes and related genes can be used for further study and provide a foundation for the gene regulation network.
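
To make the TSP step concrete, here is a minimal sketch of the standard top-scoring pair score, not the thesis's own implementation. For each gene pair (i, j), it estimates how often gene i is expressed below gene j within each class and scores the pair by the absolute difference between the two class frequencies. The function names, gene names, and expression values are hypothetical toy data chosen for illustration only.

```python
from itertools import combinations


def tsp_scores(expr, labels):
    """Score every gene pair by |P(Xi < Xj | class A) - P(Xi < Xj | class B)|.

    expr   -- dict mapping gene name to a list of expression values,
              one value per sample (hypothetical layout)
    labels -- list of class labels, one per sample, with exactly two classes
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("TSP scoring needs exactly two classes")

    # Sample indices belonging to each class.
    members = {c: [k for k, lab in enumerate(labels) if lab == c] for c in classes}

    scores = {}
    for g1, g2 in combinations(sorted(expr), 2):
        freqs = []
        for c in classes:
            idx = members[c]
            below = sum(1 for k in idx if expr[g1][k] < expr[g2][k])
            freqs.append(below / len(idx))
        scores[(g1, g2)] = abs(freqs[0] - freqs[1])
    return scores


def top_scoring_pair(expr, labels):
    """Return (pair, score) for the gene pair with the highest TSP score."""
    scores = tsp_scores(expr, labels)
    return max(scores.items(), key=lambda kv: kv[1])


# Toy data (invented): three tumor and three adjacent normal samples.
toy_labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
toy_expr = {
    "GENE_A": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],
    "GENE_B": [2.0, 2.1, 2.2, 2.0, 1.9, 2.1],
    "GENE_C": [5.0, 5.1, 4.9, 5.2, 5.0, 5.1],
}

if __name__ == "__main__":
    pair, score = top_scoring_pair(toy_expr, toy_labels)
    print(pair, score)
```

In this toy data, GENE_A lies below GENE_B in every tumor sample and above it in every normal sample, so the ordering of that pair reverses between the classes. A pair of this kind is what the TSP method ranks highest; because the score depends only on within-sample orderings, it is insensitive to monotone normalization of each array.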
Keywords/Search Tags: breast cancer, microarray, significance analysis of microarrays, association analysis, collaborative filtering