Font Size: a A A

The Research Of Ordering Problems And Data Reduction Based On Rough Sets

Posted on:2007-03-27Degree:MasterType:Thesis
Country:ChinaCandidate:L M ZhangFull Text:PDF
GTID:2178360182996025Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Rough Set Theory, introduced by Z.Pawlak early 1980's, is a newmathematical tool to deal with vagueness and uncertainty whose basic idea isto derive classification rules of conception by knowledge reduction with theability of classification unchanged. In recent years, rough set theory hasbecome one of the most active research fields of artificial intelligence andinformation science, and has been successfully applied to many areas such asdata mining, pattern recognition, machine learning, knowledge discovery,decision analysis, and so on.The ordering problems and data reduction in rough sets have become animportant research topic. Finding out the relationship between the wholeordering of the object and ordering of various attributes is the key of theordering problems. But data reduction may cause the knowledge expressionof the information system succinctly, cause the knowledge hidden in datatables more clear, cause the expansion ability of the rules in decision tablesmore stronger, cause the adaptiveness more better. This paper is mainly theresearch of ordering problems and data reduction based on rough sets, themain research of this paper is as follow:(1) Firstly, introduce the research area of the data mining and rough sets,and the rise of rough sets, and point out the characteristics of the rough sets.Secondly, introduce the basic theory of the rough sets, and the rough sets isan imprecise concept that using lower approximation and upperapproximation based on indiscernibility relation.(2) Because of the different definitions of the importance of attribute, weintroduce the reduction algorithm based on dependence, the reductionalgorithm based on quantity of the classification, the reduction algorithmbased on attribute frequency in discernibility matrix. And compare andanalyze these algorithms. To the same decision table or information system,the reduction algorithm based on dependence and the reduction algorithmbased on quantity of the classification at most only miss a constant factor. Sowe can infer that the value of the same attributes in the same decision table orinformation system may be different, but the orderings about the importanceof attributes are the same. Therefore, the two algorithms get the samereduction in the same decision table or information system, and the reductionalgorithm based on dependence and reduction algorithm based on quantity ofclassification are fully equivalence. In addition, we analyze the computationalcomplexity of these algorithms.(3) Ordering objects is a basic problem in decision and plays animportant roles in designing the intelligence information system. Miningordering rules is to find out the relationship between the whole ordering ofthe object and ordering of various attributes. Because the whole ordering ofthe object is not necessarily consistent with the ordering of various attributes.Under such circumstances, we would like to know which attributes indetermining the whole ordering played a more important role, whichattributes have no effect, which attribute sets are the full to determine thewhole ordering and the information of dependence in attributes. Therefore forthe ordering problems, we first introduced the concept of the orderinginformation table, and the ordering information table can be seen as theexpansion of a general information table adding the semantic. Secondly, weintroduced a data analysis method. This method profited from the concept ofreduction and core in rough sets theory, through analyzing the dependence ofattributes in ordering information tables, defined the concept reduction andcore in ordering information tables. Thirdly, we introduced and formalize theproblems of mining ordering rules in ordering information table, and gave amethod of mining ordering rules. Because the method expanded the scope ofdata mining, finally we proposed the reduction algorithm based ondiscernibility matrix of ordering information table, and certificated that thereductions between this algorithm and the algorithm of mining orderinginformation table are the same and this algorithm hasn't expand the scope ofdata mining by example.(4) In the same decision system, we can make various differentapplication knowledge. But not all knowledge is necessary for a user. Howgive the knowledge which the user wants according to the user's requirementis always one of the importance of algorithm concerned. We find outreduction that user satisfied according user's requirement. Firstly, weintroduced the problem of the reduction of meeting user's requirement, andthe reduction algorithm based on free attributes by WangJue proposed. Itsviewpoint is to retain the attributes that user wants as many as possible. Thealgorithm is that we retain the attribute from beginning to ending by attributeordering. After retaining a attribute, delete one or several attribute fromdiscernibility matrix. These attributes deleted named free attributes are thatthe user has no interest and reserve the attributes retained can be the part ofone reduction. Secondly, we proposed a reduction algorithm meeting user'srequirement named reduction algorithm based on attributes deleting. Thepurpose of this algorithm is to delete the attributes that user don't want asmany as possible. The algorithm is that we delete attributes from ending tobeginning by attribute ordering. This algorithm can get different reductionsby different user's requirements, and we certified the algorithm is completefor reduction and analyze the computational complexity. The computationalcomplexity of reduction algorithm based on attributes deleting is O (C M ), andCM is the number of discernibility elements of discernibility matrix. Thereduction algorithm based on free attributes uses absorption anddiscernibility elements absorb each other, so the computational complexity isO (C M 2). From the contrast, the result illustrated that the computationalcomplexity of reduction algorithm based on free attributes is lower, and thecomputational complexity is from the square to linear. Through the algorithmtesting, it certified the reduction algorithm based on attributes deleting canget better reduction than reduction algorithm based on free attributes...
Keywords/Search Tags:Reduction
PDF Full Text Request
Related items