The Research On Association Rules Mining Algorithms Of Gene Expression Data

Posted on: 2007-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2178360185465735
Subject: Computer application technology
Abstract/Summary:
As the focus of the Human Genome Project turns to functional genomics, the analysis of gene expression data has become one of the most active problems in bioinformatics. Association rule mining is an important method for this analysis: association rules can reveal biologically relevant associations between different genes, or between environmental effects and gene expression, and can thereby help to identify disease genes. This paper studies algorithms for mining association rules from gene expression data.

Apriori is the classical algorithm for association rule mining. Building on the Apriori principle, we introduce a concept called "absolutely join" and give an effective condition for applying it. When the candidate (2k+1)-itemsets are generated from the set of frequent 2k-itemsets, the candidate 4k-itemsets are built directly by absolutely join at the same time; for the set of frequent (2k+1)-itemsets, only absolutely join is used, to generate the candidate (4k+2)-itemsets. The algorithm reduces the number of iterations and the number of comparisons. Experimental results show that no frequent itemsets are missed and that the speed of mining is effectively improved.

When a traditional algorithm such as Apriori is used to mine gene expression data, the gene expression matrix must first be converted into a Boolean matrix, and the Boolean matrix must then be transformed into the form of business data, which ignores the characteristics of gene expression data. To address these flaws, this paper proposes an "and operation" algorithm that mines rules directly on the Boolean matrix and does not generate candidate itemsets. Its efficiency is further improved by performing the operation in segments.
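The abstract does not give the details of the "and operation" algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: each column of the Boolean matrix is packed into an integer bit vector, and the support of an itemset is the number of set bits in the bitwise AND of its columns. The function names, the depth-first extension strategy, and the `min_count` threshold are illustrative assumptions, not the thesis's method, and the segmented variant is not shown.

```python
from typing import Dict, List, Tuple


def to_bitmasks(matrix: List[List[int]]) -> List[int]:
    """Pack each column of a non-empty 0/1 matrix (rows = samples) into one int."""
    masks = [0] * len(matrix[0])
    for row_idx, row in enumerate(matrix):
        for col_idx, value in enumerate(row):
            if value:
                masks[col_idx] |= 1 << row_idx
    return masks


def frequent_itemsets(matrix: List[List[int]],
                      min_count: int) -> Dict[Tuple[int, ...], int]:
    """Return {itemset (column indices): support count} for all frequent itemsets.

    Assumption: itemsets are grown depth-first, and support is obtained by
    ANDing bit vectors rather than by rescanning the rows.
    """
    masks = to_bitmasks(matrix)
    result: Dict[Tuple[int, ...], int] = {}

    def extend(prefix: Tuple[int, ...], prefix_mask: int, start: int) -> None:
        for col in range(start, len(masks)):
            mask = prefix_mask & masks[col]   # the AND step
            count = bin(mask).count("1")      # rows containing every item
            if count >= min_count:
                itemset = prefix + (col,)
                result[itemset] = count
                extend(itemset, mask, col + 1)

    all_rows = (1 << len(matrix)) - 1         # empty itemset matches every row
    extend((), all_rows, 0)
    return result


# Hypothetical toy data: 4 samples (rows) by 3 Boolean items (columns).
toy_matrix = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 1, 0],
]
toy_result = frequent_itemsets(toy_matrix, min_count=2)
```

Because an itemset is only extended when its AND result already meets the threshold, infrequent prefixes are pruned without a separate candidate-join step.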
Experimental results show that the "and operation" algorithm mines frequent itemsets more effectively and faster.

At present, association rule mining of gene expression data is carried out by converting the problem into Boolean association rule mining, which ignores the fact that expression data are numerical. For this reason, this paper introduces a fuzzy method for mining quantitative association rules from gene expression data: fuzzy C-means is used in place of a membership function to partition each gene into three clusters according to its expression values, which helps ensure that the associations found between genes are accurate.
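As an illustration of the fuzzy partitioning step, here is a minimal one-dimensional fuzzy C-means sketch in plain Python. It is not the thesis's implementation: the function names, the fuzzifier `m = 2.0`, the evenly spaced initial centres, and the reading of the three clusters as low, medium, and high expression are all assumptions made for the example.

```python
from typing import List, Tuple


def _memberships(values: List[float], centres: List[float],
                 exponent: float) -> List[List[float]]:
    """Membership of every value in every cluster; each row sums to 1."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centres]
        if min(dists) == 0.0:
            # x sits exactly on a centre: full membership there (split on ties).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** -exponent for d in dists]
            rows.append([v / sum(inv) for v in inv])
    return rows


def fuzzy_c_means_1d(values: List[float], n_clusters: int = 3, m: float = 2.0,
                     n_iter: int = 100, tol: float = 1e-6
                     ) -> Tuple[List[float], List[List[float]]]:
    """Cluster one gene's expression values; requires n_clusters >= 2, m > 1."""
    lo, hi = min(values), max(values)
    # Assumed initialisation: centres evenly spaced over the value range.
    centres = [lo + (hi - lo) * k / (n_clusters - 1) for k in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        u = _memberships(values, centres, exponent)
        new_centres = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centres.append(centres[k])   # empty cluster: keep centre
            else:
                weighted = sum(w * x for w, x in zip(weights, values))
                new_centres.append(weighted / total)
        shift = max(abs(a - b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, _memberships(values, centres, exponent)


# Hypothetical expression values for a single gene across nine samples.
expression = [0.1, 0.2, 0.15, 1.0, 1.1, 0.9, 2.0, 2.1, 1.9]
centres, memberships = fuzzy_c_means_1d(expression)
# Index of the strongest cluster per sample (0 = low, 1 = medium, 2 = high
# under the assumed initialisation, which keeps the centres in ascending order).
labels = [row.index(max(row)) for row in memberships]
```

The membership rows, rather than the hard labels, are what a fuzzy association rule miner would consume, since each sample contributes partially to every expression level.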
Keywords/Search Tags:gene expression data, association rules, Apriori, And operation, mining fuzzy association rules