Applied Research on Rough Set Theory for Knowledge Discovery in Relational Databases

Posted on: 2012-02-05    Degree: Master    Type: Thesis
Country: China    Candidate: X F Jin    Full Text: PDF
GTID: 2208330332486669    Subject: Software engineering
Abstract/Summary:
With database technology maturing, data applications spreading rapidly, and the Internet developing quickly, the amount of data accumulated by humankind is growing exponentially. Knowledge discovery in databases is a recently developed technology that processes large quantities of stored data to uncover deeper knowledge and more information for decision making. From large amounts of incomplete, noisy, random, and fuzzy data, data mining extracts implicit, previously unknown, and valuable knowledge. Data preprocessing is a key step in the data mining process and must be completed before mining begins; it includes attribute reduction and data standardization.

Rough set theory was proposed by the Polish mathematician Pawlak in the early 1980s as a mathematical tool for dealing with vague and uncertain knowledge. Attribute reduction is the core of rough set theory: it deletes redundant attributes while keeping the classification ability unchanged. Traditional attribute reduction algorithms operate on data held in main memory. Attribute reduction algorithms based on relational database operations instead use database operators and SQL, and are more efficient than the traditional algorithms. How to carry out attribute reduction inside a relational database so that data mining is as efficient as possible is the central question of this thesis.

This thesis mainly addresses data preprocessing with rough sets and databases, namely deleting redundant attributes and handling incomplete and noisy data. First, it discusses the basic concepts and process of data mining and the development of rough set theory. Second, it introduces the basic notions of rough sets and their extension models, analyzes the traditional attribute reduction algorithms with examples, and proposes an improved algorithm based on attribute significance. Third, it implements the improved algorithm in combination with the extension models, so that the algorithm can handle noisy and incomplete data. Because part of the improved algorithm still works in main memory, its efficiency on massive data suffers considerably; database operations are therefore used in place of the main-memory steps. Fourth, it presents an application to an e-commerce customer information system.
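
The idea of attribute reduction driven by database operations can be illustrated with a minimal sketch. It is not the algorithm of the thesis, whose details the abstract does not give; it is a generic greedy reduction that uses the size of the positive region as the significance measure and computes it with a SQL GROUP BY query. The table `customers`, its columns, the sample rows, and the function names `positive_region_size` and `greedy_reduct` are all assumptions made for illustration.

```python
import sqlite3


def positive_region_size(conn, table, attrs, decision):
    """Count rows whose equivalence class under `attrs` has a single decision value."""
    if not attrs:
        # With no condition attributes, all rows form one equivalence class.
        total, classes = conn.execute(
            f'SELECT COUNT(*), COUNT(DISTINCT "{decision}") FROM "{table}"'
        ).fetchone()
        return total if classes <= 1 else 0
    cols = ", ".join(f'"{a}"' for a in attrs)
    sql = (
        "SELECT COALESCE(SUM(cnt), 0) FROM ("
        f'SELECT COUNT(*) AS cnt FROM "{table}" GROUP BY {cols} '
        f'HAVING COUNT(DISTINCT "{decision}") = 1)'
    )
    return conn.execute(sql).fetchone()[0]


def greedy_reduct(conn, table, conditions, decision):
    """Greedy attribute reduction (a heuristic; the result need not be a minimum reduct)."""
    target = positive_region_size(conn, table, conditions, decision)
    reduct = []
    # Forward pass: add the attribute that enlarges the positive region the most.
    while positive_region_size(conn, table, reduct, decision) < target:
        candidates = [a for a in conditions if a not in reduct]
        best = max(
            candidates,
            key=lambda a: positive_region_size(conn, table, reduct + [a], decision),
        )
        reduct.append(best)
    # Backward pass: drop attributes that turned out to be redundant.
    for a in list(reduct):
        rest = [x for x in reduct if x != a]
        if positive_region_size(conn, table, rest, decision) == target:
            reduct = rest
    return reduct


# Hypothetical decision table: three condition attributes and one decision attribute.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (age TEXT, income TEXT, region TEXT, buys TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        ("young", "high", "north", "yes"),
        ("young", "low", "north", "no"),
        ("old", "high", "south", "yes"),
        ("old", "low", "south", "no"),
        ("young", "high", "south", "yes"),
        ("old", "low", "north", "no"),
    ],
)

reduct = greedy_reduct(conn, "customers", ["age", "income", "region"], "buys")
print(reduct)
```

In this toy table `income` alone determines `buys`, so the other two attributes are redundant. The grouping and counting are done by the database engine, so the rows never have to be loaded into main memory by the application, which is the motivation the abstract gives for database-based reduction.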
Keywords/Search Tags: Rough set, attribute reduction, Variable Precision Rough Set, Compatible Relation Rough Set, database