
Research On Knowledge Reduction Based On Rough Set Theory For The Inconsistent Decision Systems

Posted on: 2017-04-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Ge
GTID: 1108330485464103
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, cloud computing, the Internet of Things, smart cities, social networking, and related technologies, every sector, including industry, agriculture, health, education, and scientific research, generates large amounts of data. The global volume of data is increasing rapidly, and as data grow in size and dimension, large-scale, high-dimensional data sets have emerged. These data sets contain a great deal of uncertain information, and knowledge discovery is the process of extracting valuable and meaningful knowledge from such uncertain data. Rough set theory (RST) is a knowledge discovery tool that can effectively deal with imprecise and incomplete information. It has been developed and applied widely in artificial intelligence, pattern recognition, machine learning, decision analysis, and other fields.

The most distinctive characteristic of rough set theory is that it does not depend on any prior knowledge and can find potentially valuable knowledge in uncertain data. The diversity of data acquisition and the uncertainty of data discrimination produce many inconsistent data in data sets. Data are inconsistent when at least two objects in the universe have the same condition attribute values but different decision attribute values. Inconsistency in data also reflects conflict within a knowledge system. Knowledge representation and knowledge discovery for inconsistent knowledge systems form a significant research domain.

Knowledge reduction is one of the key topics of rough set theory. Reduction can reduce data dimensions, simplify data representation, and improve classification accuracy and data processing efficiency.
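As an illustration of what inconsistency means in a decision table, the following minimal Python sketch groups objects by their condition attribute values and reports the groups whose members disagree on the decision. The function name, the tuple-based table layout, and the toy data are assumptions made for this example only; they are not taken from the thesis.

```python
from collections import defaultdict

def inconsistent_classes(rows, cond_idx, dec_idx):
    """Return {condition values: [object indices]} for every group of objects
    that share all condition values but carry more than one decision value."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        # Objects with identical condition values fall into the same group.
        groups[tuple(row[j] for j in cond_idx)].append(i)
    return {
        key: members
        for key, members in groups.items()
        if len({rows[i][dec_idx] for i in members}) > 1
    }

# Hypothetical toy table: two condition attributes, then the decision.
TOY_TABLE = [
    (1, 0, "yes"),
    (1, 0, "no"),   # same conditions as object 0, different decision
    (0, 1, "yes"),
    (0, 1, "yes"),
    (1, 1, "no"),
]

print(inconsistent_classes(TOY_TABLE, cond_idx=[0, 1], dec_idx=2))
```

In this toy table only objects 0 and 1 conflict, so the table is inconsistent even though every other group agrees on its decision.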
In this thesis, knowledge reduction for inconsistent decision systems is investigated in depth from the viewpoints of the discernibility matrix and the relative discernibility ability. The main work and innovations are as follows:

(1) We discuss the descriptions of different types of discernibility matrices and the realization of the corresponding reducts. According to the amount of distinguishing information that a discernibility matrix contains, the concepts and representations of the Hu discernibility matrix, the Boolean discernibility matrix, the structure discernibility matrix, and the potential of the structure discernibility matrix are proposed, and the relations among these concepts are studied. The definitions of the H-reduct, S-reduct, B-reduct, and F-reduct are given. Next, for these four reducts, general reduction methods based on the discernibility matrix are presented for two reduction strategies, addition and deletion. Using UCI data sets, we test the performance and results of the different reduction algorithms. The experimental results show that the reduction results agree under the same reduction strategy, but time and space performance differ.

(2) Based on the discernibility matrix, we study general reduction methods for the inconsistent decision table. Preserving different properties of the original decision table leads to different types of reducts. For five representative reducts, the form of the generalized decision table is presented. Based on the generalized decision table, the generalized discernibility matrix and the generalized discernibility function are defined. The relations among the different types of reducts based on the discernibility matrix are analyzed, and the transformation principles between different discernibility matrices are studied. Based on an arbitrary given reduct and the corresponding discernibility matrix, theory and methods are studied for obtaining other reducts.
In addition, based on the H-reduct and the Hu discernibility matrix, theory and methods are also studied for obtaining other reducts.

(3) We study general reduction methods from the viewpoint of the relative discernibility in inconsistent decision tables. When dealing with large data sets, reduction methods based on the discernibility matrix have some shortcomings. For inconsistent decision tables, based on the generalized decision table, the concepts, properties, and reduction definitions of the relative discernibility are given. The equivalence between the relative discernibility reduct and the discernibility matrix reduct is analyzed. Then two general reduction algorithms based on the addition and deletion strategies (GARA-FSA and GARA-BSA) are designed. A series of experiments on UCI data sets is carried out to evaluate the effectiveness and performance of the proposed reduction algorithms. The experimental results show that, on inconsistent decision tables, GARA-FSA and GARA-BSA improve greatly on the discernibility matrix-based reduction algorithms.

(4) We study the positive region reduct from the viewpoint of the relative discernibility. The relationship between the Hu discernibility matrix reduct and the relative discernibility reduct, and the relationship between the Yang discernibility matrix reduct and the positive region reduct, are studied. The relation between the Hu discernibility matrix and the Yang discernibility matrix is discussed. The transformation of the Hu discernibility matrix into the Yang discernibility matrix is mapped to the transformation from H-relative discernibility to P-relative discernibility. Two strategies for computing P-relative discernibility, the rectifiable computational strategy (RCS) and the direct computational strategy (DCS), are studied, and two corresponding positive region reduction algorithms (RCSRA and DCSRA) based on P-relative discernibility are designed.
Examples and experiments are used to verify the performance and effectiveness of the two algorithms (RCSRA and DCSRA). Both algorithms can effectively obtain the positive region reduct, improve the efficiency of reduction, and avoid the shortcomings of the discernibility matrix-based reduction method.

(5) We study an acceleration strategy for the general reduction methods based on the relative discernibility. For large-scale, high-dimensional data sets, the laws and properties of dividing equivalence classes in decision information systems are studied, and an acceleration strategy is proposed that computes equivalence classes with fewer radix sorts during the reduction process. Using this acceleration strategy, the thesis improves the two general reduction algorithms (GARA-FSA and GARA-BSA) and presents two quick general reduction algorithms (QGARA-FSA and QGARA-BSA). The experimental results show that the proposed quick general reduction algorithms are effective and feasible for large, high-dimensional data sets.
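To make the ideas of a positive region reduct and an addition strategy concrete, the sketch below shows a generic textbook-style greedy procedure in Python. It is not the thesis's RCSRA, DCSRA, GARA-FSA, or GARA-BSA: it does not use relative discernibility, and it builds equivalence classes by hashing rather than by radix sort. All function names and the toy table are assumptions made for this example.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Equivalence classes of object indices induced by the attributes in attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def positive_region(rows, attrs, dec):
    """Objects whose equivalence class under attrs has a single decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][dec] for i in block}) == 1:
            pos.update(block)
    return pos

def forward_reduct(rows, cond, dec):
    """Greedy addition followed by a deletion pass; preserves the positive region."""
    target = positive_region(rows, cond, dec)
    chosen = []
    # Addition: repeatedly add the attribute that enlarges the positive region most.
    while positive_region(rows, chosen, dec) != target:
        remaining = [a for a in cond if a not in chosen]
        best = max(remaining,
                   key=lambda a: len(positive_region(rows, chosen + [a], dec)))
        chosen.append(best)
    # Deletion: drop any attribute that turns out to be redundant.
    for a in list(chosen):
        rest = [b for b in chosen if b != a]
        if positive_region(rows, rest, dec) == target:
            chosen = rest
    return chosen

# Hypothetical toy table: attributes 0..2 are conditions, attribute 3 is the decision.
# Attribute 2 duplicates attribute 0, and objects 3 and 4 are inconsistent.
TOY_ROWS = [
    (0, 0, 0, "a"),
    (0, 1, 0, "b"),
    (1, 0, 1, "a"),
    (1, 1, 1, "b"),
    (1, 1, 1, "a"),
]

print(sorted(positive_region(TOY_ROWS, [0, 1, 2], 3)))
print(forward_reduct(TOY_ROWS, [0, 1, 2], 3))
```

On this toy table the positive region is objects 0, 1, and 2, and the procedure keeps attributes 1 and 0 while discarding the duplicate attribute 2. Each call to positive_region recomputes the partition from scratch, which is the kind of repeated equivalence-class computation that an acceleration strategy would aim to reduce.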
Keywords/Search Tags: rough set theory, inconsistent decision system, generalized decision table, knowledge reduction, discernibility matrix, relative discernibility