
Studies On The Biclustering Algorithms For Gene Microarray Data

Posted on: 2009-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 2178360245963591
Subject: Communication and Information System
Abstract/Summary:
Gene chips are high-density probe arrays composed of large numbers of DNA or oligonucleotide probes. The probes on the chip hybridize with fluorescently labelled target samples, and the gene expression data are then obtained using dedicated chip detection systems and supporting software. The technology is used to measure gene expression levels across different developmental stages, body tissues, clinical conditions and organisms. Gene chips are bringing about a revolution in life science research, disease diagnosis, new drug development and food hygiene supervision.

The contributions of this thesis are as follows:

Firstly, a novel neural-network approach is proposed for the plaid model. The plaid model contains both binary and continuous variables, so the traditional optimization methods designed for problems with only continuous variables cannot be applied to it. The method was applied to the yeast data, and the experimental results show that the proposed algorithm significantly improves the accuracy of the biclustering.

Secondly, an improved non-negative matrix factorization (NMF) algorithm is proposed. A data smoothing step is introduced into the iteration of the NMF algorithm to resolve the dithering problem. The improved NMF algorithm was applied to the analysis of leukaemia microarray data, and the experimental results show that it significantly improves the accuracy.

Thirdly, an improved NNMF (New Non-negative Matrix Factorization) algorithm is introduced for the biclustering of microarray data. A data smoothing step is likewise introduced into the iteration of the NNMF algorithm to strengthen the connectivity of the elements. To the author's knowledge, this is the first use of the NNMF algorithm in bioinformatics. Experiments were carried out on the leukaemia microarray data, and the results show that both the accuracy and the speed of convergence are significantly improved.
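
To make the NMF-based biclustering idea concrete, here is a minimal sketch in plain Python. It is not the thesis's algorithm: the abstract gives neither the update rule nor the form of the data smoothing step. The sketch assumes the standard multiplicative updates for the squared-error objective, and it assumes, purely for illustration, that smoothing means blending each new iterate with the previous one. All function names, the `damping` parameter and the toy matrix are hypothetical.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - p) ** 2 for rv, rp in zip(V, WH) for v, p in zip(rv, rp))

def damped_update(M, numer, denom, damping, eps=1e-9):
    """One multiplicative update, blended with the previous iterate.

    damping = 0 gives the plain multiplicative rule; larger values keep
    more of the old iterate. This blending is an ASSUMED stand-in for the
    smoothing step, which the abstract does not specify.
    """
    return [
        [(1 - damping) * m * n / (d + eps) + damping * m
         for m, n, d in zip(rm, rn, rd)]
        for rm, rn, rd in zip(M, numer, denom)
    ]

def nmf(V, rank, iters=200, damping=0.3, seed=0):
    """Factor a non-negative matrix V (genes x samples) as W times H."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is stuck at zero.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(V, W, H)]
    for _ in range(iters):
        Wt = transpose(W)
        H = damped_update(H, matmul(Wt, V), matmul(matmul(Wt, W), H), damping)
        Ht = transpose(H)
        W = damped_update(W, matmul(V, Ht), matmul(W, matmul(H, Ht)), damping)
        errors.append(frobenius_error(V, W, H))
    return W, H, errors

def dominant_factor(rows):
    """Index of the largest entry in each row."""
    return [max(range(len(r)), key=r.__getitem__) for r in rows]

# Toy "expression matrix": 4 genes x 4 samples containing two blocks.
V = [[5, 5, 0, 0],
     [5, 5, 0, 0],
     [0, 0, 3, 3],
     [0, 0, 3, 3]]
W, H, errors = nmf(V, rank=2)
gene_labels = dominant_factor(W)               # one factor index per gene
sample_labels = dominant_factor(transpose(H))  # one factor index per sample
print(errors[0], errors[-1], gene_labels, sample_labels)
```

Genes and samples that share a dominant factor can be read as one candidate bicluster. Assigning each row and column to a single factor is the simplest possible rule and is a design choice of this sketch; it does not allow the overlapping biclusters that models such as the plaid model permit.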
Keywords/Search Tags: gene expression data analysis, biclustering, plaid model, neural network, non-negative matrix factorization, data smoothing