Knowledge Discovery in Databases (KDD) was first proposed in the late 1980s. KDD is the nontrivial process of identifying valid, potentially useful, and ultimately understandable patterns in large volumes of data. Rough Sets Theory, proposed by Pawlak in the early 1980s, is an effective method in KDD and an excellent data analysis tool for handling uncertain information such as imprecise, inconsistent, or incomplete data. It has attracted growing attention from researchers worldwide and has been applied successfully in many areas.

Knowledge reduction is not only one of the basic topics in Rough Sets Theory but also a key technology and an important research subject in knowledge discovery. Efficient and effective knowledge reduction algorithms are the foundation of applications of Rough Sets Theory and the guarantee that it can be applied on a large scale, so knowledge reduction has become one of the most active research fields. Knowledge reduction is introduced here mainly in terms of attribute reduction and value reduction. The existing attribute reduction algorithms, including those based on the discernibility matrix, on the importance of attributes, and on the frequency of attributes, are introduced and analyzed in detail. Attribute reduction simplifies the knowledge representation system, but the result is not optimal because some redundant information may remain. To solve this problem, value reduction is introduced and studied: the process of deleting the redundant attribute values from every record. Two general kinds of value reduction algorithms, one based on the core value and a heuristic one, are introduced and discussed. In view of the deficiencies of the algorithm based on the value core, several suggestions are proposed to improve its efficiency and rationality. To examine the limits of the algorithm based on the
discernibility matrix, an example is given and analyzed. The analysis shows that redundant attributes may be included in the core when the decision table is inconsistent and has the following two characteristics: (1) the inconsistent elements (or records) in the decision table take different values on some condition attributes, and these attributes are collected into a set C'; (2) for a condition attribute X_i in C', if the corresponding decision attribute takes all of its possible values when X_i takes a particular value, then X_i is added to a set N. If, after this process, N is not empty, that is, N has at least one element, the traditional algorithms cannot obtain the simplest reduction because of the redundant elements contained in N.

To make up for the shortcomings of the algorithms above and delete the redundant attributes, two improved attribute reduction algorithms are proposed in this paper, which delete the redundant attributes from the decision table and from the discernibility matrix, respectively. (1) When the data or records being processed are prone to inconsistency and the decision attribute takes all of its values once some condition attribute takes a certain value, the redundant condition attributes should be found before computing the discernibility matrix. These attributes are deleted from the decision table to optimize it, and the discernibility matrix, whose entries may consist of a single attribute, is then computed from the optimized table. This is the first improved algorithm. (2) Conversely, when the data or records being processed are not prone to inconsistency and the decision attribute does not take all of its values once a condition attribute takes a certain value, the entries consisting of a single attribute should be examined after computing the discernibility matrix; if such an entry is redundant, its value is changed to zero (0). The procedure described
above is the main idea of the second improved algorithm. The paper proves theoretically that the two improved algorithms are correct, and the results of the examples as well as the experimental simulations show their rationality and effectiveness.

Finally, a new clustering algorithm, CRSWSN for short, is proposed by applying Rough Sets Theory to Wireless Sensor Networks. The factors important to cluster-head selection are identified and the clusters are designated by attribute reduction, the members of every cluster are decided using the relevant knowledge of Rough Sets Theory, and effective decision rules are then formed by value reduction to maintain the clusters.
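The notions discussed above can be illustrated with a minimal sketch. The following Python fragment is not the thesis's exact algorithm: it computes a classical discernibility matrix and its core (the single-attribute entries) for a small, invented decision table, and flags the "suspect" condition attributes described in the abstract, namely those for which some fixed attribute value co-occurs with every possible decision value. All table contents and function names are illustrative assumptions.

```python
from itertools import combinations

# Invented decision table: condition attributes a, b, c and decision d.
table = [
    {"a": 1, "b": 0, "c": 0, "d": 0},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 0, "b": 1, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 1, "d": 0},
]
conds = ["a", "b", "c"]

def discernibility_matrix(rows, conds, dec="d"):
    """Entry (i, j): the condition attributes on which rows i and j
    differ, recorded only for pairs with different decision values."""
    m = {}
    for i, j in combinations(range(len(rows)), 2):
        if rows[i][dec] != rows[j][dec]:
            m[(i, j)] = {x for x in conds if rows[i][x] != rows[j][x]}
    return m

def core(matrix):
    """The singleton entries of the matrix form the classical core."""
    return {next(iter(e)) for e in matrix.values() if len(e) == 1}

def suspect_attributes(rows, conds, dec="d"):
    """Simplified rendering of the condition in the abstract: an
    attribute is suspect (a removal candidate) if, for some fixed
    value it takes, the decision attribute ranges over all of its
    possible values."""
    dec_vals = {r[dec] for r in rows}
    suspects = set()
    for x in conds:
        for v in {r[x] for r in rows}:
            if {r[dec] for r in rows if r[x] == v} == dec_vals:
                suspects.add(x)
                break
    return suspects

m = discernibility_matrix(table, conds)
print(sorted(core(m)))                    # -> ['b']
print(sorted(suspect_attributes(table, conds)))  # -> ['a', 'c']
```

On this toy table the core is {b}, while a and c are flagged as suspect: for a = 1 (rows 0 and 1) the decision already takes both of its values, and likewise for c = 0, which matches the situation the improved algorithms are designed to detect.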