The Research On Data Preprocessing Based On Rough Sets Theory

Posted on: 2009-05-27 | Degree: Master | Type: Thesis
Country: China | Candidate: Y F Li | Full Text: PDF
GTID: 2178360245486490 | Subject: Control theory and control engineering
Abstract/Summary:
With the rapid development of database techniques and computer networks, large amounts of data are being stored. The fast-growing demand for extracting, understanding and assimilating useful knowledge from these growing mountains of data outpaces traditional methods of data analysis, which has led to the emerging field of knowledge discovery in databases (KDD) and data mining. Rough set theory is a relatively new mathematical tool whose defining characteristic is that it needs no information beyond the data itself. This lets it overcome shortcomings of other methods and avoid the influence of subjective factors on data mining results, and it has become one of the primary methods of KDD. Because data preprocessing has an important influence on both KDD and rough sets, solving preprocessing problems efficiently can improve the efficiency, accuracy and usability of the patterns obtained when rough sets are applied. This thesis studies data preprocessing based on rough sets in depth. First, it gives an overview of the current state of research on rough sets and details the main issues related to incomplete data and the commonly used methods for handling them. On this theoretical basis, the characteristics and shortcomings of the primary algorithms for completing null values are analyzed, and the principles and goals of completion are stated. To address the shortcomings of rough-set-based data completion algorithms, a completion strategy based on a valued similarity relation is put forward to improve the completion results. Second, several primary discretization algorithms are introduced and discussed, and the direction and goals of discretization are analyzed. To obtain reasonable cut points, a method for discretizing continuous attributes based on rough entropy is proposed. Finally, the new ideas of the thesis are briefly summarized and some problems that still need improvement are pointed out.
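As background to the claim that rough sets need no information beyond the data, the following is a minimal sketch of the standard lower and upper approximations of a target set under an indiscernibility relation. It is not the thesis's completion or discretization algorithm. The toy table, the attribute names `temp` and `headache`, and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(table, attributes):
    """Group object ids whose values agree on every attribute in `attributes`."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target` w.r.t. `attributes`."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(table, attributes):
        if cls <= target:   # class lies entirely inside the target set
            lower |= cls
        if cls & target:    # class overlaps the target set
            upper |= cls
    return lower, upper


# Hypothetical toy information table (illustration only).
table = {
    "x1": {"temp": "high", "headache": "yes"},
    "x2": {"temp": "high", "headache": "yes"},
    "x3": {"temp": "normal", "headache": "no"},
    "x4": {"temp": "high", "headache": "no"},
}
target = {"x1", "x4"}  # hypothetical concept to approximate

lower, upper = approximations(table, ["temp", "headache"], target)
```

Here `x1` and `x2` are indiscernible, so `x1` cannot be placed in the lower approximation even though it belongs to the target set. Both approximations are computed from the table alone, with no prior probabilities or membership functions supplied.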
Keywords/Search Tags: rough set, data mining, data completion, attribute discretization, rough entropy