The Research Of Mining Algorithms About Association Rules Based On Relational Database

Posted on: 2007-02-01  Degree: Master  Type: Thesis
Country: China  Candidate: Q Zhang
GTID: 2178360185951584  Subject: Computer application technology
Abstract/Summary:
With the growth of database technology, the spread of networking, and advances in computer hardware, the ability to collect data has improved rapidly, and the volume of stored data worldwide has grown enormously. Data mining technology has developed quickly in response, as a way to make sense of this data. Relational databases are an important means of storing large amounts of information about production, management and scientific research, and the amount of data they hold is growing fast. Efficient techniques for mining association rules in relational databases therefore have broad prospects.

Association rule mining is one of the important parts of data mining. It was proposed by Agrawal et al. in 1993, initially to analyze the relationships among items in a transaction database. Researchers have since refined and extended the original problem, and association rule mining has become an active research area of data mining. The task is usually decomposed into two steps: (1) generate all itemsets whose support is at least a given minimum support, referred to as frequent itemsets; (2) extract all rules from the frequent itemsets. The more important of the two steps is frequent itemset generation.

This thesis analyzes typical algorithms for mining Boolean association rules in transaction databases and, on that basis, derives an algorithm for generating frequent itemsets in a relational database. The core of the algorithm is to operate on the relational database with SQL aggregation, selection and join statements in order to select the frequent predicate sets and the valid rules. Because SQL operates on relational databases efficiently and the algorithm is tightly integrated with the database management system, the algorithm mines efficiently.

For rule generation, most existing work has focused on mining positive association rules.
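
As a rough illustration of the SQL-based frequent itemset step described in this abstract, the sketch below counts frequent 1- and 2-itemsets with aggregation (GROUP BY), selection (HAVING) and a self-join. It is not the thesis's algorithm, whose schema and statements are not given here: the table `trans`, its columns `tid` and `item`, the function name `frequent_itemsets_sql`, and the use of SQLite through Python's `sqlite3` module are all assumptions made for the example.

```python
import sqlite3


def frequent_itemsets_sql(rows, min_support_count):
    """Return (frequent 1-itemsets, frequent 2-itemsets) with support counts.

    rows: iterable of (tid, item) pairs, one row per item in a transaction.
    The table layout is hypothetical; the abstract does not specify one.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trans (tid INTEGER, item TEXT)")
    conn.executemany("INSERT INTO trans VALUES (?, ?)", rows)

    # Frequent 1-itemsets: aggregate per item, then keep those that
    # occur in at least min_support_count distinct transactions.
    f1 = conn.execute(
        "SELECT item, COUNT(DISTINCT tid) FROM trans "
        "GROUP BY item HAVING COUNT(DISTINCT tid) >= ? "
        "ORDER BY item",
        (min_support_count,),
    ).fetchall()

    # Frequent 2-itemsets: self-join on the transaction id.
    # The condition a.item < b.item lists each unordered pair once.
    f2 = conn.execute(
        "SELECT a.item, b.item, COUNT(DISTINCT a.tid) "
        "FROM trans a JOIN trans b "
        "ON a.tid = b.tid AND a.item < b.item "
        "GROUP BY a.item, b.item HAVING COUNT(DISTINCT a.tid) >= ? "
        "ORDER BY a.item, b.item",
        (min_support_count,),
    ).fetchall()

    conn.close()
    return f1, f2


# Made-up sample data: four transactions over three items.
sample_rows = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "milk"), (2, "beer"),
    (3, "milk"),
    (4, "bread"), (4, "beer"),
]
f1, f2 = frequent_itemsets_sql(sample_rows, min_support_count=2)
```

Larger itemsets could be found the same way by joining further copies of the table, although a practical implementation would normally prune candidates between passes, as Apriori-style algorithms do.
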
In fact, it is equally important to mine negative association rules, which are needed to give a complete picture of the relationships in the data. Furthermore, one of the important problems in association rule mining is how to measure the uncertainty of a rule. One of the most popular models is the support-confidence model, which uses two values, sup(X→Y) and conf(X→Y), to measure the uncertainty of an association rule. However, this model can extract a rule X→Y even when X and Y are independent, which means that conf(X→Y) alone is insufficient for measuring rules of interest. The PR_NR algorithm, based on the correlation coefficient from statistics, is presented; it can mine both positive and negative association rules. Experimental results demonstrate that the algorithm is efficient.
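
To make the point about confidence and independence concrete, here is a minimal sketch of a correlation test on a candidate rule. It uses the standard correlation (phi) coefficient for two binary events, computed from supports. The details of PR_NR are not given in this abstract, so the function names, the `min_corr` threshold and the three-way labelling are assumptions for illustration only.

```python
import math


def correlation(sup_x, sup_y, sup_xy):
    """Correlation (phi) coefficient of X and Y from their supports.

    sup_x, sup_y: fraction of transactions containing X, Y.
    sup_xy: fraction of transactions containing both.
    """
    denom = math.sqrt(sup_x * (1 - sup_x) * sup_y * (1 - sup_y))
    if denom == 0:
        # X or Y occurs in all or no transactions: correlation is undefined,
        # so this sketch treats the pair as uncorrelated.
        return 0.0
    return (sup_xy - sup_x * sup_y) / denom


def classify_rule(sup_x, sup_y, sup_xy, min_corr=0.1):
    """Label a candidate rule (hypothetical scheme, not taken from PR_NR)."""
    rho = correlation(sup_x, sup_y, sup_xy)
    if rho >= min_corr:
        return "positive"      # candidate for X -> Y
    if rho <= -min_corr:
        return "negative"      # candidate for X -> not Y (or not X -> Y)
    return "independent"       # no rule, whatever the confidence


# conf(X -> Y) = 0.4 / 0.5 = 0.8 looks strong, but sup(XY) = sup(X) * sup(Y),
# so X and Y are independent and the rule carries no information.
confidence = 0.4 / 0.5
label = classify_rule(sup_x=0.5, sup_y=0.8, sup_xy=0.4)
```

In this example a confidence threshold alone would accept the rule, while the correlation test rejects it, which is the weakness of the support-confidence model noted above.
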
Keywords/Search Tags: Data Mining, Relational Database, Association Rules, SQL Language, Negative Association Rules, Correlation Coefficient