Study On Methods For Uncertainty Measure And Attribute Reduction Based On Rough Set Theory

Posted on: 2011-06-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S H Teng
Full Text: PDF
GTID: 1118330341951710
Subject: Electronic Science and Technology
Abstract/Summary:
With the rapid development of data acquisition technology, it has become difficult for people to cope with the rapidly expanding volume of data. To solve the problem of being data rich but information poor, the question of how to acquire new, potentially useful, correct, and valuable knowledge from very large, messy, and noisy databases has become one of the key research fields in intelligent information processing. As a relatively new method of knowledge discovery, rough set theory (RST) has been widely used in many areas, and one of its essential applications is attribute reduction. After almost 30 years of development, the theory and methods of attribute reduction have advanced rapidly and matured. However, some problems remain. First, uncertainty measures of rough sets are very important in attribute reduction, but the existing uncertainty measures cannot evaluate attribute importance well, so finding more reasonable uncertainty measures is a fundamental issue. Second, there is no universal and efficient algorithm for attribute reduction, which limits the application of rough sets. Based on these considerations, this dissertation presents a systematic study of uncertainty measures and attribute reduction in information systems. The main work and innovations are as follows:

(1) Some well-justified measures of uncertainty based on the discernibility capability of attributes are put forward under a general binary relation, and an intuitive Venn diagram representation gives the new measures an explicit rough set theoretic meaning.
These results help clarify the essence of uncertainty measures in RST, enrich the connotation of RST, and provide a theoretical basis for further attribute reduction algorithms.

(2) New weighted uncertainty measures, called α entropy, α conditional entropy, and α mutual information, are presented by considering the subjective weights of data under a general binary relation, and the differences between various uncertainty measures are analyzed by changing the parameter α. Further, the existing definitions of uncertainty measures become special cases of the weighted uncertainty measures, so the proposed measures unify the existing uncertainty measures under a general binary relation. In particular, the weighted uncertainty measures provide a useful tool for incorporating factors such as the decision maker's preferences and prior knowledge, and therefore they accord with practice.

(3) A well-justified weighted integration measure of uncertainty is proposed under a general binary relation, which is more accurate and has wide applicability. Theoretical analysis and examples demonstrate that the weighted integration uncertainty measure can completely reflect two factors of uncertainty and is consistent with human cognition. It therefore overcomes the limitations of existing uncertainty measures.

(4) To accelerate the process of attribute reduction, the discernibility ability of attributes is chosen as the heuristic function. First, a new efficient heuristic reduction algorithm based on the indiscernibility degree is proposed for general information systems, which is useful for dealing with noise. Second, a discernibility-view-based attribute reduction algorithm is constructed for decision information systems, and we make a comparative study of the quantitative relationships among the concepts of attribute reduction from the algebra viewpoint, the information viewpoint, and the discernibility viewpoint.
Finally, we test our algorithms against other algorithms on simulated datasets and UCI datasets. The experimental results show that, in most cases, the proposed reduction algorithms select the fewest attributes in the shortest time, and that they are feasible for large data sets.

(5) The relationship between attribute reduction from the discernibility viewpoint and the existing reduction algorithms for inconsistent decision information systems is presented. Then, to simplify the decision table, some simplified consistent decision tables are defined, based on which an efficient attribute reduction algorithm is designed for inconsistent decision information systems. Experimental results show the effectiveness and practicability of this algorithm on large inconsistent data sets.

(6) Considering the noise in decision information systems, we propose two approximate attribute reduction algorithms, the AAR-DV algorithm and the AAR-WαA algorithm, which can deal with noise and are applicable to many extended models of rough sets. In particular, prior knowledge of the data is taken into account in the AAR-WαA algorithm. Experimental results demonstrate that the proposed approximate attribute reduction algorithms can effectively reduce sensitivity to noise, yield more compact reducts, and simultaneously improve classification performance.

(7) Considering that combining multiple reducts provides more complementary information, we propose a new classification algorithm that combines multiple reducts based on weighted α accuracy under a general binary relation. The experimental results show that the proposed classification algorithm does not increase the time complexity and also improves the classification accuracy with fewer features.

In summary, the proposed uncertainty measures and reduction algorithms have a clear rough set theoretic meaning and better adaptability, and they are simple and easy to understand.
They therefore have considerable theoretical and practical value.
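
As a rough illustration of the general idea behind item (4), using the discernibility ability of attributes as a heuristic for attribute reduction, the following Python sketch may help. It is not the dissertation's algorithm: it assumes the classical equivalence (indiscernibility) relation rather than a general binary relation, it uses a standard greedy "discern the most remaining object pairs" heuristic, and the function names and the toy decision table are hypothetical. The greedy result discerns every pair that the full attribute set discerns, but it is not guaranteed to be a minimal reduct.

```python
from itertools import combinations


def partition(rows, attrs):
    """Group object indices into classes that agree on every attribute in attrs."""
    classes = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(i)
    return list(classes.values())


def approximations(rows, attrs, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in partition(rows, attrs):
        block = set(block)
        if block <= target:   # class lies entirely inside the target set
            lower |= block
        if block & target:    # class overlaps the target set
            upper |= block
    return lower, upper


def pairs_to_discern(rows, cond_attrs, decision):
    """Object pairs with different decisions that the full condition set can separate."""
    return {
        (i, j)
        for i, j in combinations(range(len(rows)), 2)
        if rows[i][decision] != rows[j][decision]
        and any(rows[i][a] != rows[j][a] for a in cond_attrs)
    }


def greedy_reduct(rows, cond_attrs, decision):
    """Greedily add the attribute that separates the most still-unseparated pairs."""
    remaining = pairs_to_discern(rows, cond_attrs, decision)
    chosen = []
    while remaining:
        def gain(a):
            return sum(1 for i, j in remaining if rows[i][a] != rows[j][a])

        best = max((a for a in cond_attrs if a not in chosen), key=gain)
        chosen.append(best)
        # Keep only the pairs that the chosen attribute still cannot separate.
        remaining = {(i, j) for i, j in remaining if rows[i][best] == rows[j][best]}
    return chosen


# Hypothetical toy decision table; objects 2 and 4 are inconsistent
# (same condition values, different decisions).
rows = [
    {"headache": "yes", "temp": "high",   "muscle_pain": "yes", "flu": "yes"},
    {"headache": "yes", "temp": "normal", "muscle_pain": "yes", "flu": "no"},
    {"headache": "no",  "temp": "high",   "muscle_pain": "no",  "flu": "yes"},
    {"headache": "no",  "temp": "normal", "muscle_pain": "no",  "flu": "no"},
    {"headache": "no",  "temp": "high",   "muscle_pain": "no",  "flu": "no"},
]
cond = ["headache", "temp", "muscle_pain"]
flu_yes = {i for i, r in enumerate(rows) if r["flu"] == "yes"}

lower, upper = approximations(rows, cond, flu_yes)
reduct = greedy_reduct(rows, cond, "flu")
```

On this toy table the lower and upper approximations of the "flu = yes" set differ because of the inconsistent pair, and the greedy pass drops `muscle_pain`, which duplicates `headache`.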
Keywords/Search Tags: Rough set, Uncertainty measure, Attribute reduction, Discernibility matrix, Inconsistent decision table, General binary relation