Application And Research Of Large Database Mining Based On Rough Set And Genetic Algorithm

Posted on: 2008-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
GTID: 2178360242958952
Subject: Computer software and theory
Abstract/Summary:
Data mining is the process of extracting previously unknown but useful information and knowledge from large, incomplete, fuzzy, and random data stored in databases, data warehouses, or other information repositories. Rough set (RS) theory was proposed by Zdzisław Pawlak in 1982. After about twenty years of development, it has produced substantial results in both theory and application. RS requires no additional information beyond the data set itself, which makes it a powerful tool for handling vague, imprecise, incomplete, and uncertain data, and a relatively new technique in data mining. RS theory is mainly used for knowledge reduction and for analyzing knowledge dependencies, and it is also widely applied in medical diagnosis, pattern recognition, expert systems, machine learning, and data mining.
A genetic algorithm (GA) is a search method based on randomization. Its search starts from a population of initial points rather than from a single point. This mechanism lets the search escape local extrema: it can refine a solution near an extremum while also exploring the whole problem space, so the probability of finding the optimal solution is greatly improved.
This thesis combines the ability of rough set theory to classify knowledge with the evolutionary mechanism of the genetic algorithm for extracting the best rules from a large table, and introduces a new data mining model. The system covers the basic steps of data mining: data preprocessing, data discretization, knowledge reduction, and rule extraction. Because a large table contains many fields and much redundant information, the thesis processes it with rough sets: after preprocessing and discretization, the condition fields are reduced. Field reduction is a core step in data mining; here the rough set reduction algorithm works by judging whether the table is consistent. Reduction alone is not enough for mining a large table, because a large number of candidate rules must still be filtered.
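The consistency-based field reduction described above can be illustrated with a minimal sketch. This is not the thesis's implementation (which was written in Visual C++); it assumes a small, already discretized decision table, and the field names `signal`, `period`, `terminal`, and `delivered`, as well as the greedy drop-one-field strategy, are hypothetical choices made for illustration only.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical discretized decision table (not data from the thesis).
ROWS: List[Row] = [
    {"signal": "strong", "period": "day",   "terminal": "A", "delivered": "yes"},
    {"signal": "strong", "period": "night", "terminal": "B", "delivered": "yes"},
    {"signal": "weak",   "period": "day",   "terminal": "A", "delivered": "no"},
    {"signal": "weak",   "period": "night", "terminal": "B", "delivered": "no"},
]
ATTRS = ["signal", "period", "terminal"]  # condition fields
DECISION = "delivered"                    # decision field


def is_consistent(rows: List[Row], attrs: List[str], decision: str) -> bool:
    """True if no two rows agree on all `attrs` but differ on `decision`."""
    seen: Dict[tuple, str] = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True


def reduce_attributes(rows: List[Row], attrs: List[str], decision: str) -> List[str]:
    """Greedily drop each field whose removal keeps the table consistent."""
    kept = list(attrs)
    for attr in attrs:
        trial = [a for a in kept if a != attr]
        if is_consistent(rows, trial, decision):
            kept = trial
    return kept


print(reduce_attributes(ROWS, ATTRS, DECISION))  # prints ['signal']
```

In this toy table `signal` alone determines `delivered`, so the other two fields are dropped. A greedy pass like this yields one reduct that depends on the order in which fields are tried; it does not enumerate all reducts.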
Rule selection is performed by a genetic algorithm: through selection, crossover, and mutation, the best rules are obtained from the large table. The system is built with Visual C++ and a SQL Server database, with rough set theory and the genetic algorithm as its core algorithms. Finally, the thesis presents an application of the model to the PHS short message system of the Taiyuan network communications corporation, where it extracts rules describing whether short messages can be sent and received successfully. Validation shows that the system is reliable and that the results help administrators analyze the causes of problems. The short message query and analysis module has been deployed in the monitoring system and has run for over a year, during which it has found many problems and saved a considerable amount of money. The results show that the model helps improve system efficiency and network operating quality, and that it is a useful study of combining multiple methods in data mining.
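The rule selection step (selection, crossover, mutation) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis's algorithm: the rule encoding (one gene per condition field, `None` as a wildcard, last gene as the predicted decision), the fitness function (confidence squared times support), tournament selection, elitism, and the toy table are all hypothetical.

```python
import random
from typing import List, Optional, Tuple

Rule = Tuple[Optional[str], ...]

# Hypothetical discretized table: (signal, period, delivered).
TABLE = [
    ("strong", "day", "yes"),
    ("strong", "night", "yes"),
    ("weak", "day", "no"),
    ("weak", "night", "no"),
    ("weak", "night", "yes"),
]
# Allowed values per gene; None means "any value" for a condition field.
DOMAINS = [("strong", "weak", None), ("day", "night", None), ("yes", "no")]


def fitness(rule: Rule) -> float:
    """Assumed fitness: confidence^2 * support of the rule on TABLE."""
    *conds, decision = rule
    covered = [r for r in TABLE
               if all(c is None or c == v for c, v in zip(conds, r))]
    if not covered:
        return 0.0
    confidence = sum(r[-1] == decision for r in covered) / len(covered)
    support = len(covered) / len(TABLE)
    return confidence ** 2 * support


def tournament(pop: List[Rule], rng: random.Random) -> Rule:
    """Selection: the fitter of two randomly drawn rules."""
    return max(rng.sample(pop, 2), key=fitness)


def crossover(a: Rule, b: Rule, rng: random.Random) -> Rule:
    """Single-point crossover of two parent rules."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(rule: Rule, rng: random.Random, rate: float = 0.1) -> Rule:
    """Mutation: resample each gene from its domain with probability `rate`."""
    return tuple(rng.choice(DOMAINS[i]) if rng.random() < rate else g
                 for i, g in enumerate(rule))


def evolve(generations: int = 30, size: int = 8, seed: int = 0) -> Rule:
    rng = random.Random(seed)
    pop = [tuple(rng.choice(d) for d in DOMAINS) for _ in range(size)]
    for _ in range(generations):
        elite = max(pop, key=fitness)  # elitism: keep the current best rule
        children = [mutate(crossover(tournament(pop, rng),
                                     tournament(pop, rng), rng), rng)
                    for _ in range(size - 1)]
        pop = [elite] + children
    return max(pop, key=fitness)


best = evolve()
print(best, fitness(best))
```

Squaring the confidence is one simple way to keep the all-wildcard rule from dominating; a real system would choose the fitness function to match its own notion of rule quality.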
Keywords/Search Tags: Rough Sets, Data Mining, Attribute Reduction, Genetic Algorithms