Study On Data Preprocessing Based On Rough Set And Its Application

Posted on: 2004-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Liu
Full Text: PDF
GTID: 2168360095956637
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of database techniques and computer networks, large amounts of data are being stored. The rapidly growing demand for extracting, understanding, and assimilating useful knowledge from these growing mountains of data outpaces traditional methods of data analysis, which has led to the emerging field of knowledge discovery in databases (KDD) and data mining. Rough set theory is a new mathematical tool whose defining characteristic is that it needs no prior information beyond the data itself. This lets it overcome shortcomings of other methods and avoid the influence of subjective factors on data mining results, and it has become one of the primary methods of KDD. Because data preprocessing has an important influence on both KDD and rough sets, solving preprocessing problems efficiently can improve the efficiency, accuracy, and usability of the patterns obtained when rough sets are applied. This thesis studies data preprocessing based on rough sets in depth. First, the characteristics and shortcomings of the main algorithms for handling null values are analyzed, and the principles and goals of data completion are stated. To address the shortcomings of existing rough-set-based data completion algorithms, a completion strategy based on the valued similarity relation and a strategy based on the limited similarity relation are proposed to improve the quality of completion.
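As a rough illustration of completing null values from similar objects, the sketch below uses the basic tolerance relation for incomplete information tables (two rows are tolerant if they agree on every attribute where both are known). This is an assumption made for illustration only: the abstract does not define the valued or limited similarity relations, so this is not the thesis's method. The names `MISSING`, `tolerant`, and `complete` are hypothetical.

```python
from collections import Counter

MISSING = None  # assumed marker for a null value in the table


def tolerant(x, y):
    """True if rows x and y agree on every attribute where both are known."""
    return all(a is MISSING or b is MISSING or a == b for a, b in zip(x, y))


def complete(table):
    """Return a copy of table with null values filled from tolerant rows.

    Each null is replaced by the most common known value of that attribute
    among rows tolerant with the current row. If no tolerant row has a
    known value, the null is left in place.
    """
    filled = [list(row) for row in table]
    for i, row in enumerate(table):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Known values of attribute j from rows tolerant with row i.
            candidates = [
                other[j]
                for k, other in enumerate(table)
                if k != i and other[j] is not MISSING and tolerant(row, other)
            ]
            if candidates:
                filled[i][j] = Counter(candidates).most_common(1)[0][0]
    return filled


table = [
    ("a", 1, "x"),
    ("a", None, "x"),
    ("b", 2, "y"),
    ("a", 1, None),
]
completed = complete(table)
```

Tolerance is always checked against the original table rather than the partially filled copy, so the result does not depend on the order in which rows are processed.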
Moreover, a strategy for imputing null values based on the variable precision model is proposed to improve the model's robustness to noise. Second, several main discretization algorithms are introduced and discussed, and the direction and goals of discretization are analyzed. To obtain reasonable cuts, a method for discretizing continuous attributes based on rough entropy is proposed. Finally, a potential-client data mining system for electronic commerce is presented; all of the proposed algorithms are applied to this system and compared on test results. The new ideas of the thesis are briefly summarized, and some problems that still need improvement are pointed out...
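To make the idea of entropy-guided cut selection concrete, the sketch below picks a single cut point for one continuous attribute by minimizing the weighted Shannon entropy of the decision labels on either side of the cut. Note the substitution: the abstract does not give the definition of rough entropy, so plain class entropy is used here as a stand-in, and the names `entropy` and `best_cut` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return the midpoint cut that minimizes weighted class entropy.

    Candidate cuts are midpoints between consecutive distinct sorted
    values. Returns None if all values are equal (no cut is possible).
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_score, best_point = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_score = score
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point


values = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
labels = ["low", "low", "low", "high", "high", "high"]
cut = best_cut(values, labels)
```

Applying `best_cut` recursively to each resulting interval, with a stopping criterion, would yield a multi-interval discretization; that extension is left out to keep the sketch short.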
Keywords/Search Tags: Rough Set, Similarity Relation, Data Completion, Discretization, Rough Entropy