Incremental And Decremental Maintenance Of Frequent Patterns In Dynamic Databases

Posted on: 2007-06-25 | Degree: Master | Type: Thesis
Country: China | Candidate: J L Zhang | Full Text: PDF
GTID: 2178360212973181 | Subject: Computer software and theory
Abstract/Summary:
Over the past two decades, the rapid development of database technology and the widespread adoption of DBMSs have greatly improved our ability to collect data, producing huge volumes of it. Many important patterns and much interesting knowledge are hidden in this data. People therefore want to perform high-level analysis on the collected data to discover useful knowledge, so that they can make better use of the data and give decision-makers strong support. Traditional statistical methods cannot meet these demands, which led to the emergence of data mining (DM) techniques.

Data mining, also known as Knowledge Discovery in Databases (KDD), is the process of extracting potential, valid, novel, useful, and ultimately understandable and applicable knowledge from vast, incomplete, and noisy data. It is an interdisciplinary field that draws on database systems, computation theory, artificial intelligence, statistics, and cognitive science, and it supports association analysis, classification, clustering, forecasting, outlier detection, and evolution analysis. Although data mining has a short history, its appeal and broad application prospects have attracted great enthusiasm and wide attention from researchers and industry experts around the world.

Association analysis is one of the most important research topics in data mining. In 1993, Agrawal et al. first proposed the problem of mining association rules from customer transaction databases. Since then, many researchers have studied the problem extensively; their work includes optimizing existing algorithms (for example, adopting random sampling or parallel techniques to improve the effectiveness of the mining algorithms) and applying association rules. Generally speaking, the main task of association analysis is to find the frequent patterns in a database, because generating association rules from frequent patterns is a straightforward computation.

Motion is an eternal truth in nature, and collected data also changes continuously. The knowledge obtained from data with DM techniques must therefore be updated accordingly to reflect new trends in dynamic databases. An incremental algorithm processes only the newly added data to revise and extend the obsolete knowledge, which avoids rerunning the mining process on the whole database. In 1989, Paul Utgoff et al. proposed the incremental decision tree algorithm ID5R, which enables the well-known decision tree inducer ID3 to restructure the whole tree based on dynamically added instances. While...
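
The incremental and decremental idea described in the abstract can be illustrated with a deliberately naive Python sketch. It is not the algorithm developed in the thesis: the class name `SupportCounter`, its methods, and the `max_size` cap are all hypothetical, and the sketch simply keeps a count for every itemset up to a fixed size so that added or deleted transactions can be processed without rescanning the whole database.

```python
from collections import Counter
from itertools import combinations


class SupportCounter:
    """Naive illustration only (hypothetical, not the thesis's method)."""

    def __init__(self, max_size=2):
        self.max_size = max_size      # largest itemset size that is tracked
        self.counts = Counter()       # itemset (sorted tuple) -> support count
        self.n_transactions = 0       # current database size

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_size in one transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_size + 1):
            yield from combinations(items, k)

    def add(self, transactions):
        # Incremental step: only the new transactions are scanned.
        for t in transactions:
            self.counts.update(self._itemsets(t))
            self.n_transactions += 1

    def remove(self, transactions):
        # Decremental step: only the deleted transactions are scanned.
        for t in transactions:
            self.counts.subtract(self._itemsets(t))
            self.n_transactions -= 1

    def frequent(self, min_support):
        # Itemsets whose relative support reaches min_support (0..1).
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.counts.items()
                if c > 0 and c >= threshold}


sc = SupportCounter(max_size=2)
sc.add([["a", "b"], ["a", "c"], ["a", "b", "c"]])
before = sc.frequent(0.6)
sc.add([["b", "c"]])        # new data arrives
after_add = sc.frequent(0.6)
sc.remove([["a", "b"]])     # old data is deleted
after_remove = sc.frequent(0.6)
```

Storing every candidate count is what makes the update trivial here, at a memory cost that grows quickly with `max_size`; practical incremental miners avoid that cost by tracking only a carefully chosen subset of itemsets.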
Keywords/Search Tags: association rules, incremental mining, decremental mining, itemsets distribution, Zipf distribution, K-S test