| The world has entered the Internet age. The rapid development of computer and Internet technology has made the data and information in every field grow explosively (the information explosion), and human participation makes the uncertainty of data and information systems more pronounced. How to extract the underlying, valuable information (usable knowledge) poses an unprecedented challenge for intelligent information processing. A new field of artificial intelligence, Data Mining (DM) and Knowledge Discovery in Databases (KDD), has emerged, and rough set theory is one of its techniques. Among the many lines of research on rough set theory, the reduction of an information system and the search for a minimum reduction are both the most basic and the most pivotal and intractable problems. Most traditional rough set techniques are not linked to a database, which makes it hard for rough set theory to cope with database sizes met in practice; an efficient, database-based algorithm for attribute reduction is therefore one of the practical problems of the theory. After summarizing and building on domestic and international research results on rough set theory, and combining relational databases with SQL, this paper proposes an attribute reduction algorithm based on relational databases.
The main contributions are: ① analyzing and deriving the conditions under which proposition 1, "Card(∏(C - Ci + D)) > Card(∏(C - Ci))", is a sufficient condition for proposition 2, "the attribute Ci is a core attribute", and the special conditions under which proposition 1 is not a sufficient condition for proposition 2; ② building on the traditional discernibility-matrix algorithm for computing the core, an algorithm that makes the decision attribute of conflicting objects fuzzy, so that proposition 1 is a sufficient condition for proposition 2 under all conditions; this algorithm, together with another that merges identical objects, makes up the pretreatment algorithm; ③ on this foundation, a core-computing algorithm that obtains the core attributes exactly; ④ a dynamic attribute information value, Merit(Cj), which strongly guides the search toward a minimum attribute reduction; ⑤ an attribute reduction algorithm built on Merit(Cj), with a proof that Merit(Cj) = 1 correctly signals the end of the algorithm (a reduction has been obtained). Based on these algorithms, the author developed a rough set attribute reduction tool named DBReduct using Visual C++ 6.0, ActiveX Data Objects, and related technologies. The tool extracts the database structure automatically, so users can freely choose the tables and table attributes they want in order to construct their decision tables.
The tool also works with several popular relational databases through OLE DB. The author tested DBReduct with dedicated experimental data for interference resistance, tolerance of repeated data, efficiency, accuracy, and the rate of minimum reduction. For interference resistance, a simple interference data set and a large-scale interference data set were used; the result for the former is the same as that of the discernibility matrix, and the result for the latter is the same as that of Rosetta. For repeated data, data sets with 1-800% repeated data were used; the time of the reduction step stays constant and the pretreatment time grows slowly. For efficiency, 16 data sets containing 1380 to 22080 objects were used; the run time ranges from 1.454 to 40.344 seconds, and the relationship between the number of objects and the time consumed is nearly linear. For accuracy and the rate of minimum reduction, 11 data sets from the Rosetta software were used; the DBReduct solution is not only among Rosetta's solutions but is also the shortest one. These results show that DBReduct performs well in all of these respects. The advantages of DBReduct are as follows: ① it operates on the database directly, without exporting the data, which is laborious for large database tables; ② it is a general-purpose tool that can compute attribute reductions in several popular relational databases; ③ it supports all data types, without data conversion; ④ with the help of the database management system's own operations, it is highly efficient at projection, statistics, and locating operations. |
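
The projection test in proposition 1 can be illustrated with a small sketch. This is not DBReduct's implementation (which the abstract says is written in Visual C++ 6.0 with ActiveX Data Objects); it is a minimal Python example using the standard `sqlite3` module. The table name `decision_table`, the attribute names `a`, `b`, `c`, `d`, and the helper functions `projection_card` and `core_attributes` are all hypothetical. The sketch assumes a consistent decision table, that is, one with no conflicting objects, which is the case the pretreatment step is meant to guarantee.

```python
import sqlite3


def projection_card(conn, table, attrs):
    """Card(∏(attrs)): number of distinct rows of `table` projected onto `attrs`."""
    if not attrs:
        # Projecting onto no attributes leaves a single empty tuple if any row exists.
        has_rows = conn.execute(f'SELECT EXISTS(SELECT 1 FROM "{table}")').fetchone()[0]
        return 1 if has_rows else 0
    cols = ", ".join(f'"{a}"' for a in attrs)
    sql = f'SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM "{table}")'
    return conn.execute(sql).fetchone()[0]


def core_attributes(conn, table, condition, decision):
    """Return every Ci with Card(∏(C - Ci + D)) > Card(∏(C - Ci)).

    Assumption: the table is consistent, so the inequality identifies core attributes.
    """
    core = []
    for ci in condition:
        rest = [a for a in condition if a != ci]
        with_decision = projection_card(conn, table, rest + [decision])
        without_decision = projection_card(conn, table, rest)
        if with_decision > without_decision:
            core.append(ci)
    return core


# Hypothetical consistent decision table: d depends only on b, and c duplicates a.
rows = [
    (0, 0, 0, "no"),
    (0, 1, 0, "yes"),
    (1, 0, 1, "no"),
    (1, 1, 1, "yes"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE decision_table (a INTEGER, b INTEGER, c INTEGER, d TEXT)")
conn.executemany("INSERT INTO decision_table VALUES (?, ?, ?, ?)", rows)

core = core_attributes(conn, "decision_table", ["a", "b", "c"], "d")
print(core)  # ['b']
```

Dropping `b` leaves two distinct `(a, c)` combinations but four distinct `(a, c, d)` combinations, so objects that agree on the remaining condition attributes now disagree on the decision, and `b` is reported as a core attribute. Dropping `a` or `c` leaves both counts at four, so neither is in the core. Because the counting is done by the database engine with `SELECT DISTINCT`, the data never has to be exported, which is the advantage the abstract claims for the database-based approach.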