
Rough Set Attribute Reduction Algorithm Based On GA And MapReduce

Posted on: 2015-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Fu
Full Text: PDF
GTID: 2268330428964085
Subject: Software engineering
Abstract/Summary:
The era of big data has arrived, driven by the rapid development of the Internet and the emergence of new technologies such as the Internet of Things and cloud computing. Because big data holds enormous economic, scientific and social value, data mining has become an important research topic worldwide. Rough set theory, proposed by the Polish mathematician Z. Pawlak, is a tool for reasoning about data. It is well suited to studying vagueness, uncertainty, learning and induction, and is especially useful in knowledge classification and knowledge discovery. Attribute reduction is one of the core problems of rough set theory and has been studied extensively. Many algorithms have been proposed; most are heuristic, adding or removing attributes according to a measure of attribute importance. However, algorithms of this kind cannot find a reduct for some information systems. Some researchers have therefore applied genetic algorithms to attribute reduction, which can handle both the cases that heuristic algorithms solve and those they cannot. Attribute reduction based on a genetic algorithm is nevertheless prone to premature local convergence, a defect inherited from the genetic algorithm itself.
To solve this problem, this thesis proposes an attribute reduction algorithm based on a genetic algorithm and MapReduce. The algorithm evolves multiple populations in parallel on MapReduce, a simple but powerful distributed parallel processing framework. It retains the advantage of biologically inspired intelligent algorithms, solving problems that the classic algorithms cannot, while also addressing the premature local convergence of the genetic algorithm.
The main research work of this thesis is as follows. First, the concepts of rough sets and genetic algorithms are studied and introduced, together with their design philosophy, algorithm steps and working principles. Then an intelligent attribute reduction algorithm based on a genetic algorithm is studied systematically. Building on these classical theories and algorithm models, the MapReduce model and its implementation platform Hadoop are introduced, and the attribute reduction algorithm based on a genetic algorithm and MapReduce is proposed. Its main design idea is the parallel evolution of multiple populations on top of the genetic attribute reduction algorithm. The classical genetic algorithm evolves a single population, which makes it liable to premature local convergence. Evolving multiple populations addresses this problem, which the thesis proves using probability and statistics. The parallel platform also saves considerable running time. Based on this analysis, the thesis designs experiments for the algorithm and describes the parallelization in detail. The experimental results show that the proposed algorithm outperforms the classic one, with better accuracy and more effective attribute reduction.
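The multi-population idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the decision table, the names `dependency`, `fitness`, `evolve` and `best_reduct`, the GA parameters, and the lexicographic fitness are all assumptions chosen for illustration, and plain Python function calls stand in for Hadoop map and reduce tasks.

```python
import random
from collections import defaultdict

# Hypothetical toy decision table: four condition attributes, last column is the decision.
# The decision equals a0 XOR a1, and a2 is the complement of a0.
TABLE = [
    (0, 0, 1, 0, 0),
    (0, 1, 1, 0, 1),
    (1, 0, 0, 1, 1),
    (1, 1, 0, 1, 0),
    (0, 0, 1, 1, 0),
    (1, 1, 0, 0, 0),
]
N_ATTRS = 4


def dependency(attrs):
    """Rough-set dependency degree: fraction of rows in the positive region."""
    blocks = defaultdict(list)
    for row in TABLE:
        blocks[tuple(row[i] for i in attrs)].append(row[-1])
    # An equivalence class is in the positive region if all its decisions agree.
    consistent = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return consistent / len(TABLE)


def fitness(chrom):
    """Assumed fitness: prefer higher dependency, then fewer selected attributes."""
    attrs = [i for i, bit in enumerate(chrom) if bit]
    return (dependency(attrs), -len(attrs))


def evolve(seed, pop_size=8, generations=20, p_mut=0.1):
    """'Map' step: evolve one population independently, return its best chromosome."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_ATTRS)] for _ in range(pop_size)]
    for _ in range(generations):
        nxt = [max(pop, key=fitness)]  # elitism: carry the best individual over
        while len(nxt) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, N_ATTRS)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)


def best_reduct(seeds):
    """'Reduce' step: keep the fittest result across all populations."""
    candidates = [evolve(s) for s in seeds]  # each call could run on its own worker
    return max(candidates, key=fitness)


if __name__ == "__main__":
    best = best_reduct(range(4))
    print(best, fitness(best))
```

Because each call to `evolve` depends only on its seed, the populations share no state and could be distributed across workers. As an informal intuition only, and under an independence assumption the abstract does not state: if one population stalls prematurely with probability p, then k independent populations all stall with probability p^k.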
Keywords/Search Tags: Rough Sets, Attribute Reduction, Genetic Algorithm, MapReduce, Parallel