
Incremental Updating Techniques For Association Mining In Food Safety Supervision

Posted on: 2009-03-16, Degree: Master, Type: Thesis
Country: China, Candidate: Y W Xu, Full Text: PDF
GTID: 2178360242992086, Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Applying data mining techniques to food safety enables early discovery of regularities and exceptions in a food database, and supports understanding and fast analysis of a large, frequently updated data set. Association rule mining, which expresses correlations among the attributes of a data set in an easily understood form, is a fundamental approach to finding potentially interesting patterns. This thesis discusses how to apply association rule mining to a food safety database, and how to maintain the mined rules as the database is updated. The work can be summarized as follows.

1) The thesis first analyses the features of the food detection data set, namely its multiple dimensions, categorical attribute values, unbalanced distribution and, above all, sparsity, while taking into account the major difficulties of the large item-set mining problem. It then shows how these characteristics relate to algorithm complexity.

2) An algorithm called Selective Sketching Filter (SSF) is proposed to filter out the redundant information contained in attribute values. SSF supplies a much smaller set of candidate 1-item-sets to association mining or incremental updating algorithms, which greatly improves their efficiency. It also filters out spurious rules caused by redundant attributes, and so speeds up rule selection by narrowing the set of alternatives.

3) For a food safety database into which data is inserted, an algorithm called UWP (Update With Portions) is proposed. UWP divides the new large item-sets of the updated database into three subsets and handles each of them separately, without recalculating all the candidate item-sets. UWP also uses the negative border to reduce I/O cost.

An experiment was designed and carried out to compare the efficiency of UWP as an incremental updating method with that of rerunning the Apriori algorithm to recompute all the new large item-sets. The results show that UWP is not only efficient but also feasible and compatible.
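The abstract does not give the details of UWP, so the following Python sketch is not the thesis's algorithm. It only illustrates the general idea of incremental updating that the abstract contrasts with full recomputation: item-sets that were already large need only a scan of the inserted records, and an item-set that was not large before can become large only if it is large within the inserted records. The function names, the brute-force counting, the `max_size` limit and the sample inspection records are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every item-set of up to max_size items occurring in a transaction."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return counts


def frequent(transactions, min_sup, max_size=2):
    """Full recomputation: large item-sets and their support counts."""
    n = len(transactions)
    counts = count_itemsets(transactions, max_size)
    return {s: c for s, c in counts.items() if c >= min_sup * n}


def incremental_update(old_db, old_frequent, increment, min_sup, max_size=2):
    """Update the large item-sets after `increment` is appended to `old_db`."""
    n_total = len(old_db) + len(increment)
    inc_counts = count_itemsets(increment, max_size)
    updated = {}

    # Previously large item-sets: the old count is stored, so only the
    # increment has to be scanned to decide whether they stay large.
    for s, old_count in old_frequent.items():
        total = old_count + inc_counts.get(s, 0)
        if total >= min_sup * n_total:
            updated[s] = total

    # Possible newly large item-sets: they must be large in the increment,
    # so only these few candidates are counted against the old records.
    candidates = [
        s for s, c in inc_counts.items()
        if s not in old_frequent and c >= min_sup * len(increment)
    ]
    for s in candidates:
        old_count = sum(1 for t in old_db if s <= set(t))
        total = old_count + inc_counts[s]
        if total >= min_sup * n_total:
            updated[s] = total
    return updated


# Hypothetical inspection records, each a set of attribute=value items.
old_db = [
    {"type=dairy", "result=pass"},
    {"type=dairy", "result=pass"},
    {"type=meat", "result=fail"},
    {"type=dairy", "result=fail"},
]
increment = [
    {"type=meat", "result=fail"},
    {"type=meat", "result=fail"},
]

old_frequent = frequent(old_db, min_sup=0.5)
updated = incremental_update(old_db, old_frequent, increment, min_sup=0.5)
print(sorted((sorted(s), c) for s, c in updated.items()))
```

On these sample records the incremental result equals what `frequent(old_db + increment, 0.5)` returns, while the old records are revisited only for the two candidates that are large in the increment.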
Keywords/Search Tags: data mining, food safety, association rule, frequent item-sets, incremental updating, sparse data set