Algorithms For Gene Microarray Data Analysis

Posted on: 2013-10-21; Degree: Master; Type: Thesis
Country: China; Candidate: D C Yan; Full Text: PDF
GTID: 2230330371993489; Subject: Signal and Information Processing
Abstract/Summary:
Gene microarrays (also referred to as gene chips) can be used to measure gene expression levels across different developmental stages, body tissues, clinical conditions and organisms. Gene chips are transforming life science research, disease diagnosis, new drug development and food hygiene supervision. At the same time, the massive volume of gene microarray data poses great challenges to traditional information processing techniques. This thesis therefore focuses on developing algorithms for pre-processing and biclustering of gene microarray data. The contributions of this thesis are as follows.

Firstly, an improved version of robust Lowess normalization is proposed for normalizing gene microarray data. In this algorithm, the data are first smoothed with locally weighted linear regression; the error is then further reduced by estimating the residue of the smoothing estimate in a kernel estimation framework; finally, a scaling operation is performed with respect to each data point on the grid. Experimental results show both the effectiveness and the efficiency of this algorithm.

Secondly, a novel strategy for estimating missing data in the gene expression matrix is presented. The algorithm is based on the James-Stein and kernel estimation principles, where the estimation matrix is obtained with the k-means algorithm. Experimental results show that our algorithm outperforms conventional algorithms at low missing-data rates.

Thirdly, an improved version of the fuzzy spectral biclustering algorithm is presented. Although the fuzzy spectral biclustering algorithm performs well, the FCM (fuzzy c-means) algorithm it relies on is sensitive to data type and has only local search capability, which prevents wide application. Therefore, both the GG algorithm and the genetic algorithm are used to improve the fuzzy spectral biclustering algorithm. Experimental results show excellent performance of our algorithm.

Finally, a new biclustering method for gene expression data is proposed, which is based on extracting related genes and conditions. Before the biclustering process, we propose to remove those genes or conditions that contribute little to the biclusters under consideration. The degree of relatedness between each gene (condition) and the biclusters is computed with a novel measure based on the cosine of the angle between the gene (condition) vector and a vector of all 1's. Biclustering is then performed only on the data set composed of the extracted genes and conditions, which reduces the computational complexity of the algorithm. Experimental results show excellent performance of our algorithm.
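
The abstract names the ingredient of the relatedness measure (the cosine of the angle between a gene or condition vector and an all-ones vector) but does not give the full definition or the selection rule. The following is a minimal sketch of that cosine only, assuming a plain list-of-lists expression matrix with genes as rows and conditions as columns. The function names, the zero-vector convention, and the caller-supplied `keep_row` / `keep_col` predicates are illustrative assumptions, not the thesis's actual procedure.

```python
import math


def cosine_to_ones(vec):
    """Cosine of the angle between vec and the all-ones vector of equal length.

    cos = sum(vec) / (sqrt(n) * ||vec||), since <vec, 1> = sum(vec) and ||1|| = sqrt(n).
    """
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        # Assumption: an all-zero vector gets score 0 (the angle is undefined).
        return 0.0
    return sum(vec) / (math.sqrt(len(vec)) * norm)


def row_and_column_scores(matrix):
    """Score every gene (row) and every condition (column) of the matrix."""
    row_scores = [cosine_to_ones(row) for row in matrix]
    col_scores = [cosine_to_ones(col) for col in zip(*matrix)]
    return row_scores, col_scores


def extract(matrix, keep_row, keep_col):
    """Keep only rows/columns whose score satisfies the given predicates.

    keep_row and keep_col are hypothetical selection rules supplied by the
    caller; the abstract does not state the threshold or its direction.
    Returns the kept row indices, kept column indices, and the sub-matrix.
    """
    row_scores, col_scores = row_and_column_scores(matrix)
    rows = [i for i, s in enumerate(row_scores) if keep_row(s)]
    cols = [j for j, s in enumerate(col_scores) if keep_col(s)]
    sub = [[matrix[i][j] for j in cols] for i in rows]
    return rows, cols, sub
```

For example, `extract(data, lambda s: s > 0.9, lambda s: True)` would keep only genes whose expression vector is nearly parallel to the all-ones vector and leave all conditions in place. Whether high or low scores mark a gene as relevant depends on the bicluster model, so the predicate is left to the caller. A subsequent biclustering step would then run on the smaller sub-matrix, which is where the reduction in computation described above would come from.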
Keywords/Search Tags: Gene Microarray, normalization, James-Stein estimation, kernel estimation, biclustering