Data Mining Method Research Based On Rough Set Theory

Posted on:2009-12-28

Degree:Master

Type:Thesis

Country:China

Candidate:P Fu

Full Text:PDF

GTID:2178360242492779

Subject:Computer software and theory

Abstract/Summary:

With the development of database technology and the Internet, data mining has attracted great attention in the information industry. The main reason is that large amounts of existing data can be widely used, and there is an urgent need to convert these data into useful information and knowledge. Traditional information processing techniques are no longer adequate for practical applications. People need more powerful and more efficient techniques that can discover interesting knowledge in massive amounts of information and use it to guide decision making. Rough set theory, put forward by the Polish mathematician Z. Pawlak in 1982, is a new tool for processing vague and ill-defined knowledge. Within the overall data mining process, rough sets are applied mainly in the preprocessing stage. Starting from this point, and taking the role of rough set theory in the data preparation stage of data mining as its thread, this thesis studies several problems in the theory. The main points discussed in the thesis are as follows:

(1) Discretization of continuous attributes. Rough sets handle discrete attributes very well, but they cannot process continuous attributes directly. In practical data mining applications, continuous attributes therefore have to be discretized first. This thesis proposes a discretization method for continuous attributes based on a genetic algorithm (GA). The method avoids the locally optimal results that can arise when using discretization methods based on support and importance. Experiments show that the algorithm balances global search with accuracy when discretizing attributes.

(2) Attribute reduction. After analyzing the shortcomings of current reduction algorithms, this thesis proposes a genetic attribute reduction algorithm based on the information entropy core set. Information entropy theory is introduced to preprocess the data to be reduced, which speeds up the convergence of the algorithm and improves the efficiency of the reduction. Experiments show that the algorithm performs well and can find the best reduct of an information system.

(3) Rule extraction. In practical applications, the data in a database are always growing incrementally, so incremental reduction of rules is a topic of general interest in the field of knowledge discovery. This thesis proposes an incremental learning method based on rough set theory and decision tree techniques. The method is then compared with the classical algorithm and the RRIA algorithm, and the results show that the proposed method performs better.
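The abstract does not give the details of the entropy-based core computation, so the following is only a minimal sketch of one common textbook formulation, not the thesis's algorithm: a condition attribute is treated as a core attribute if removing it increases the conditional entropy H(D | B) of the decision attribute. The function names, the toy decision table, and the tolerance value are all assumptions made for illustration, and the genetic search that the thesis builds on top of the core is omitted.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, cond_idx, dec_idx):
    """H(D | B): entropy of the decision column inside each block of rows
    that are indiscernible on the condition columns listed in cond_idx."""
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in cond_idx)
        blocks[key].append(row[dec_idx])

    n = len(rows)
    h = 0.0
    for decisions in blocks.values():
        size = len(decisions)
        for count in Counter(decisions).values():
            p = count / size
            h -= (size / n) * p * log2(p)
    return h


def entropy_core(rows, cond_idx, dec_idx, tol=1e-12):
    """Condition attributes whose removal raises H(D | B) (hypothetical helper)."""
    base = conditional_entropy(rows, cond_idx, dec_idx)
    core = []
    for a in cond_idx:
        rest = [i for i in cond_idx if i != a]
        if conditional_entropy(rows, rest, dec_idx) > base + tol:
            core.append(a)
    return core


# Toy decision table (invented for illustration): columns 0-2 are condition
# attributes, column 3 is the decision. Column 2 duplicates column 0.
ROWS = [
    (0, 0, 0, "no"),
    (0, 1, 0, "yes"),
    (1, 0, 1, "yes"),
    (1, 1, 1, "no"),
]

if __name__ == "__main__":
    print(entropy_core(ROWS, [0, 1, 2], 3))
```

In this toy table, columns 0 and 2 carry the same information, so either one can be dropped without losing any ability to predict the decision, and neither belongs to the core. Column 1 cannot be dropped. A search procedure such as a genetic algorithm could then start from the core and look for a full reduct, which is the general idea the abstract describes.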

Keywords/Search Tags:

Data Mining, Rough Set, Discretization of Continuous Attributes, Attribute Reduction, Rules Pick-up

Related items

1	Discretization Of Continuous Attributes In Information Systems And Rules Extraction
2	Research On Rough Set Theory Based Data Mining Algorithm
3	Research On Continuous Attributes Discretization And Rules Extracting Based On Rough Set
4	VPRS Based Approaches For Discretization Of Continuous Attributes And Data Preprocessing
5	Researched For Continuous Valued Attribute Reduction Algorithm Based On Rough Theory
6	The Study On Approaches Of Mining Classification Rules Based On Rough Sets Theory And Intelligent Computing
7	A Study For Discretization Of Real Value Attributes Based On Rough Set Theory
8	Research And Application Of Classification Algorithm Based On Rough Set
9	The Research On The Application Of Rough Set Theory In The Web Information Filtering
10	Research On Discretization Methods For Quantitative Attributes