Data Analysis Methods For Inconsistent Decision Tables

Posted on: 2014-11-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Z Yin
Full Text: PDF
GTID: 1268330401979300
Subject: Control Science and Engineering
Abstract: Rough set theory is a mathematical tool for dealing with uncertain information. It extracts the effective knowledge from an original data set in three steps, which calculate a core set, a reduct, and a rule set, respectively. However, rough set theory suffers from several important problems when it is applied to an inconsistent decision table, such as the inconsistency of core sets, the strategy for selecting a data analysis process, and the NP-hardness of finding a minimal reduct. In this dissertation, we study solutions to these problems and propose a systematic data analysis method for inconsistent decision tables. The main work and contributions are as follows:

1. There are many core calculation methods for inconsistent decision tables, but their results are inconsistent with one another. This raises an important question: how many effective core sets does an inconsistent decision table have? To answer it, we propose a method based on partitions of knowledge granules that judges the effectiveness of the existing core calculation methods and calculates all the effective core sets. First, the information types of the granules are analyzed based on the classical Pawlak model. Next, partitions of knowledge granules are defined to represent the effective information of inconsistent decision tables. On this basis, it is proved that there are only three types of partition of knowledge granules for any inconsistent decision table. Finally, a method based on the discernibility matrices corresponding to these three partitions is proposed to calculate all the effective core sets.

2. To select a proper data analysis process in a practical application, an effective data analysis model and a related selection strategy are proposed based on partitions of knowledge granules. First, three types of rules are defined to match the partitions of knowledge granules. The relationship among the defined rules, the three discernibility matrices, and the partitions of knowledge granules is then proved, and it is used to form a rule-based strategy for selecting the proper data analysis process for inconsistent decision tables. On this basis, an intelligible data analysis model for inconsistent decision tables is suggested. It ensures that the knowledge in the core set, the reduct, and the rule set is meaningful to users.

3. To calculate a minimal reduct with a heuristic reduction algorithm, an attribute repulsion matrix is proposed to optimize the classical heuristic reduction algorithms. First, the attribute repulsion property related to minimal reducts is analyzed and an attribute repulsion matrix is presented. Several attribute heuristic strategies are then proposed based on the repulsion matrix. On this basis, two heuristic reduction algorithms are suggested that combine the proposed strategies with classical addition and deletion methods. Experimental results on several UCI data sets show that the proposed attribute repulsion matrix can improve the quality of the reduct and helps a heuristic algorithm to calculate a minimal reduct.

4. An attribute-correlation-based heuristic reduction method is proposed to calculate a minimal reduct. First, we define the attraction property between attributes and show that it is related to the repulsion property. Based on this correlation, we define a new attribute significance measure and propose an attribute-correlation-based heuristic reduction algorithm that integrates the discernibility ability of single attributes with the correlations among attributes. It is more effective at obtaining a minimal reduct.

5. To resolve the problem that the confidence of the existing heuristic strategies cannot be estimated, several new strategies with related confidence models are proposed and integrated into a reduction algorithm. First, based on the repulsion property, we propose a related heuristic strategy and design its confidence model. Next, we define the mutex properties and design the related confidence models. According to the defined confidence degrees, we suggest an integrated high-confidence strategy, which is used in a new reduction algorithm. The experimental results show that the proposed confidence models are effective and that the integrated strategy obtains a minimal reduct with high confidence.

6. In view of the NP-hardness of optimal discretization and reduction in rough set theory, a hierarchical reduction of rules is proposed, in which rule reduction replaces attribute reduction. First, a hierarchical rule-set attraction method is proposed based on the lower approximation of a single attribute. Then, we analyze the cluster properties of the hierarchical rule sets, which are used to simplify the rule sets. At the same time, based on the rule reduction and the cluster properties, an optimal discretization coding scheme is proposed that maps different discrete intervals to the same value. On this basis, we define the equivalent decision table to simplify the traditional data analysis process based on rough set theory.
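
Item 1 above notes that different core calculation methods disagree on inconsistent decision tables. The following is a minimal Python sketch of that disagreement, using two standard textbook definitions rather than the partition-of-knowledge-granules method of the dissertation, which the abstract does not spell out. The toy table, the attribute names a and b, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy decision table (not from the dissertation).
# Each row is ((a, b), d): two condition attributes and one decision.
ATTRS = ("a", "b")
ROWS = [
    ((0, 0), 0),
    ((0, 0), 1),  # same conditions as row 0, different decision: inconsistent
    ((0, 1), 0),
    ((0, 1), 1),  # same conditions as row 2, different decision: inconsistent
    ((1, 0), 1),
]


def positive_region(rows, attr_idx):
    """Indices of objects whose equivalence class under attr_idx has a single decision."""
    classes = defaultdict(list)
    for i, (cond, _) in enumerate(rows):
        classes[tuple(cond[k] for k in attr_idx)].append(i)
    pos = set()
    for members in classes.values():
        if len({rows[i][1] for i in members}) == 1:
            pos.update(members)
    return pos


def core_positive_region(rows, attrs):
    """Core as the attributes whose removal changes the positive region."""
    all_idx = tuple(range(len(attrs)))
    full = positive_region(rows, all_idx)
    core = set()
    for k in all_idx:
        rest = tuple(j for j in all_idx if j != k)
        if positive_region(rows, rest) != full:
            core.add(attrs[k])
    return core


def core_discernibility(rows, attrs):
    """Core as the singleton entries of a plain discernibility matrix.

    An entry exists for every pair of objects with different decisions and
    lists the condition attributes on which the two objects differ.
    """
    core = set()
    for (c1, d1), (c2, d2) in combinations(rows, 2):
        if d1 == d2:
            continue
        diff = [attrs[k] for k in range(len(attrs)) if c1[k] != c2[k]]
        if len(diff) == 1:
            core.add(diff[0])
    return core


if __name__ == "__main__":
    print("positive-region core:", sorted(core_positive_region(ROWS, ATTRS)))
    print("discernibility-matrix core:", sorted(core_discernibility(ROWS, ATTRS)))
```

On this toy table the positive-region definition keeps only a, while the plain discernibility-matrix rule also keeps b, because it compares objects that belong to inconsistent classes. This is one concrete instance of the kind of disagreement that motivates asking which core sets are effective.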
Keywords/Search Tags: core attribute set, partition of knowledge granule, attribute repulsion matrix, minimal reduct, hierarchical reduction of rules, equivalent decision table