
Algorithm And Implementation Of Data Mining Based On Rough Set

Posted on: 2009-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Chen
Full Text: PDF
GTID: 2208360245461307
Subject: Software engineering
Abstract/Summary:
Data mining is the process of finding hidden and potentially useful information in large volumes of data. It is the critical step of Knowledge Discovery in Databases and has become a popular research field. Rough Set methodology is one of the useful tools of data mining. Rough Set theory, a new and effective mathematical theory developed in recent years, can deal with imprecise, uncertain, and vague information, and it can find valid and potentially useful knowledge in large amounts of data. For this reason, Rough Set theory has become increasingly popular and has been successfully applied in fields such as machine learning, data mining, and intelligent data analysis.

This thesis is based on the Sichuan science and technology key research program project "Telecommunication Business Intelligent Data-Mining Engine Research" and mainly studies several critical technical problems in data mining based on Rough Set theory. The thesis is divided into four sections as follows:

1. The first section introduces the concepts, background, research content, main methods, and current hot topics of data mining. It reviews the development of Rough Set theory and introduces in detail its preliminary knowledge and present research status.

2. The second section focuses on reduction algorithms, which consist of attribute reduction and attribute value reduction. Attribute reduction plays a key role in the data mining model based on Rough Set theory. Basic attribute reduction and attribute value reduction algorithms are introduced in this section. To address the deficiencies of several prevalent algorithms, the thesis proposes corresponding improved algorithms and then analyses and compares them with the prevalent algorithms through experiments. The algorithms improved and compared include the heuristic MIBARK algorithm and the MIBARK-NC algorithm, and the HORAFA algorithm and an improved HORAFA algorithm.

3. The third section puts forward an incremental reduction algorithm, a dynamic reduction algorithm for dealing with dynamic data. The section mainly covers incremental data mining technology and the principle of incremental reduction. Based on this principle, an extended definition of the feature matrix is given and an incremental reduction algorithm based on the feature matrix is put forward. The ASRAI algorithm is then introduced and an improved version of it is presented.

4. The fourth section puts forward an e-mail filtering model based on Rough Set theory as an application of the Rough Set reduction algorithms. A real-time e-mail filtering system for personal users based on Rough Set theory is designed, and the system is then tested through experiments.
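As a rough illustration of what attribute reduction means in Rough Set theory, the sketch below greedily selects condition attributes until the positive region of the decision attribute is as large as it is under the full attribute set, then drops any attribute that turns out to be redundant. It is a generic textbook-style heuristic, not the MIBARK, HORAFA, or ASRAI algorithms studied in the thesis. The function names and the toy decision table are assumptions made only for this example.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region_size(rows, cond_attrs, decision):
    """Count rows whose equivalence class on cond_attrs has one decision value."""
    size = 0
    for block in partition(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            size += len(block)
    return size


def greedy_reduct(rows, cond_attrs, decision):
    """Hypothetical greedy heuristic: add the attribute that enlarges the
    positive region most, then remove attributes that are not needed."""
    target = positive_region_size(rows, cond_attrs, decision)
    reduct, remaining = [], list(cond_attrs)
    while positive_region_size(rows, reduct, decision) < target:
        best = max(
            remaining,
            key=lambda a: positive_region_size(rows, reduct + [a], decision),
        )
        reduct.append(best)
        remaining.remove(best)
    # Backward pass: drop any attribute whose removal keeps the positive region.
    for a in list(reduct):
        trial = [b for b in reduct if b != a]
        if positive_region_size(rows, trial, decision) == target:
            reduct = trial
    return reduct


# Toy decision table (illustrative data only); rows 2 and 5 are inconsistent.
rows = [
    {"headache": "no", "muscle": "yes", "temp": "high", "flu": "yes"},
    {"headache": "yes", "muscle": "no", "temp": "high", "flu": "yes"},
    {"headache": "yes", "muscle": "yes", "temp": "very_high", "flu": "yes"},
    {"headache": "no", "muscle": "yes", "temp": "normal", "flu": "no"},
    {"headache": "yes", "muscle": "no", "temp": "high", "flu": "no"},
    {"headache": "no", "muscle": "yes", "temp": "very_high", "flu": "yes"},
]
conditions = ["headache", "muscle", "temp"]
reduct = greedy_reduct(rows, conditions, "flu")
print(reduct)
```

The returned subset preserves the positive region of the full condition set, which is the usual criterion for a reduct. An incremental algorithm of the kind described in the third section would aim to update such a result as new rows arrive instead of recomputing it from scratch.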
Keywords/Search Tags:Data Mining, Rough Set, Attribute Reduction, Incremental Attribute Reduction