
Research On Data Mining Methods Based On Rough Set

Posted on: 2005-10-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G C Cui
Full Text: PDF
GTID: 1118360152956683
Subject: Computer application technology
Abstract/Summary:
Data mining attracts great attention in the information industry. The main reason is that large amounts of existing data are widely available, and there is an urgent need to convert these data into useful information and knowledge. Rough set theory is very important for artificial intelligence and cognitive science. Since it first appeared, it has been emphasized and highly praised by Zadeh, the founder of fuzzy mathematics, and it is listed among the basic theories of soft computing that Zadeh has more recently advocated. Applying rough set theory to data mining can improve the ability to analyze and learn from incomplete data in large databases, so it has broad application prospects and practical value. Attribute reduction is a significant topic in rough set theory. A large database usually contains many attributes that are redundant or unnecessary for rule discovery. Researchers have found that eliminating the redundant attributes makes the knowledge latent in the system clearer, lowers the time complexity of rule discovery, and raises discovery efficiency. For the mass data held in a large database, it is advisable to update the data mining results incrementally rather than to mine the whole updated database again after each change. An incremental algorithm works together with database updating, so it is not necessary to mine all the data over again. Such an algorithm updates the knowledge incrementally, modifying and strengthening the knowledge already discovered. Incremental algorithms are one of the main ways to raise learning efficiency. Applying them to data mining can reduce complexity and improve the revision of rules as new instances arrive.
To solve the problems mentioned above, this dissertation investigates data mining based on rough sets and genetic algorithms. It studies the principles and current state of data mining, and integrates research results from databases, artificial intelligence, statistics, pattern recognition, machine learning, data analysis, and related fields. It discusses the concepts, working steps, and key technologies of data mining from the viewpoints of data mining and knowledge classification. Data mining (DM) is a nontrivial process of identifying hidden, previously undiscovered, and potentially useful knowledge in mass raw data. In brief, it is a process that transforms data into knowledge. In a data mining system the database is divided into two parts, a training set and a testing set. The learning process runs on the training set and yields the corresponding knowledge model. The major working steps are data preparation, the mining itself, and rule description. On this basis the dissertation compares data mining with knowledge discovery and online analysis, and characterizes data mining as a process of mining interesting knowledge from mass data in databases, data warehouses, and other information repositories. From the viewpoint of data analysis, OLAP sits at the shallow layer and DM at the deeper layer. The main data mining methods include decision trees, neural networks, fuzzy theory, genetic algorithms, and rough sets. A data mining system model is obtained by summarizing these methods.
The dissertation also analyzes in depth the essential theory of rough sets and genetic algorithms, and the basic methods and algorithms of attribute reduction. Rough set theory is a relatively new mathematical tool for handling vague and uncertain knowledge. Its main idea is to derive the decision or classification rules of a problem through knowledge reduction, on the premise that classification ability is kept unchanged.
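The idea of classifying with vague or uncertain knowledge can be illustrated with the standard rough set notions of lower and upper approximation. The following is a minimal sketch, not the dissertation's implementation: the toy decision table, the attribute names (`headache`, `temp`, `flu`), and the function names are all assumptions made for illustration.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group object ids that agree on every attribute in `attributes`."""
    blocks = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        blocks[key].add(obj_id)
    return list(blocks.values())


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical toy table: objects 3 and 4 are indiscernible on the
# condition attributes but carry different decisions.
TABLE = {
    1: {"headache": "yes", "temp": "high", "flu": "yes"},
    2: {"headache": "yes", "temp": "high", "flu": "yes"},
    3: {"headache": "no", "temp": "high", "flu": "yes"},
    4: {"headache": "no", "temp": "high", "flu": "no"},
    5: {"headache": "no", "temp": "normal", "flu": "no"},
}

flu_cases = {obj for obj, row in TABLE.items() if row["flu"] == "yes"}
lower, upper = approximations(TABLE, ["headache", "temp"], flu_cases)
print(sorted(lower), sorted(upper))
```

Objects in the lower approximation certainly belong to the concept, while objects that are in the upper approximation but not the lower one form the boundary region, where the available attributes cannot decide membership.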
The core of rough set theory is knowledge reduction and discovery, which is supported by a series of algorithms covering equivalence relations, upper and lower approximations, attribute significance estimation, core computation, and attribute reduction. Among these, attribute reduction is the main method of data analysis in rough set theory, so the design and implementation of reduction algorithms is one of the most important topics in rough set research. This paper discusse...
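Core computation and attribute reduction can be sketched with the textbook dependency degree, the fraction of objects whose condition-attribute class falls inside a single decision class. This is an illustrative sketch only. It uses a simple greedy forward selection, not the genetic-algorithm approach the dissertation investigates, and the table, attribute names, and function names are hypothetical.

```python
from collections import defaultdict


def partition(table, attributes):
    """Equivalence classes of objects that agree on `attributes`."""
    blocks = defaultdict(set)
    for obj_id, row in table.items():
        blocks[tuple(row[a] for a in attributes)].add(obj_id)
    return list(blocks.values())


def dependency(table, condition, decision):
    """Fraction of objects in the positive region of `decision`."""
    decision_blocks = partition(table, [decision])
    positive = set()
    for block in partition(table, condition):
        if any(block <= d for d in decision_blocks):
            positive |= block
    return len(positive) / len(table)


def core(table, condition, decision):
    """Attributes whose removal lowers the dependency degree."""
    full = dependency(table, condition, decision)
    return [a for a in condition
            if dependency(table, [c for c in condition if c != a], decision) < full]


def greedy_reduct(table, condition, decision):
    """Start from the core and add the most helpful attribute until the
    dependency degree of the full attribute set is restored."""
    full = dependency(table, condition, decision)
    reduct = core(table, condition, decision)
    remaining = [a for a in condition if a not in reduct]
    while dependency(table, reduct, decision) < full:
        best = max(remaining,
                   key=lambda a: dependency(table, reduct + [a], decision))
        reduct.append(best)
        remaining.remove(best)
    return reduct


# Hypothetical toy table in which `temp` alone determines `flu`.
TABLE = {
    1: {"headache": "yes", "muscle": "yes", "temp": "high", "flu": "yes"},
    2: {"headache": "yes", "muscle": "yes", "temp": "normal", "flu": "no"},
    3: {"headache": "no", "muscle": "yes", "temp": "high", "flu": "yes"},
    4: {"headache": "no", "muscle": "no", "temp": "normal", "flu": "no"},
    5: {"headache": "no", "muscle": "no", "temp": "high", "flu": "yes"},
}
CONDITION = ["headache", "muscle", "temp"]

print(core(TABLE, CONDITION, "flu"))
print(greedy_reduct(TABLE, CONDITION, "flu"))
```

The greedy search returns one attribute subset that preserves the dependency degree; it is not guaranteed to be a minimal reduct in general, which is one reason heuristic search methods are studied for this problem.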
Keywords/Search Tags: Research