Research And Application Of Rough Set Attribute Discretization And Attribute Reduction

Posted on: 2020-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: J Song
Full Text: PDF
GTID: 2438330620955603
Subject: Software engineering
Abstract/Summary:
With the advancement of technology, the scale of data that people can collect is constantly increasing, and how to extract valuable information from these data has become a key concern. Rough set theory is an information processing tool that has been researched and developed for decades and is widely used in fields such as machine learning and data mining. Classical rough sets can only handle discrete data, so continuous data must be discretized first. Discretization methods are commonly classified as supervised or unsupervised. Unsupervised algorithms do not consider the class attribute, so they are efficient, but the classification accuracy of the discretized data is poor. Supervised algorithms use the class attribute to guide the discretization process and can therefore obtain better results. At present, however, such methods mostly discretize one attribute at a time and do not take the relations between attributes into account, which makes it difficult to obtain a smaller and better set of breakpoints.

Attribute reduction is essential in rough set theory. Researchers have proposed many methods based on different theories, among which the method based on the difference matrix (more commonly called the discernibility matrix) has drawn extensive attention because of its simplicity. However, the matrix contains a large number of redundant difference elements, which not only affect the experimental results but also incur a high storage cost. Researchers have tried different data structures to solve this problem; for example, linked lists have been used to store these elements during attribute reduction. This reduces the space cost to a certain extent, but it only deletes the difference elements that contain core attributes; elements that do not contain core attributes, as well as duplicate elements, are not removed.

The main work of this thesis is as follows. First, a discretization method based on a binary ant colony algorithm and variable precision rough sets is proposed. A binary ant network is built from the candidate breakpoints, and the globally optimal breakpoints are then searched for in this network. The fitness function is defined in terms of the number of breakpoints and the approximation classification accuracy of the variable precision rough set. Experimental results show that, compared with other methods, the proposed discretization method reduces the number of breakpoints and improves classification accuracy. Second, a binary linked list and a new attribute reduction method are proposed to address the problems of the linked-list approach. Several data sets are used to verify the validity of these methods, and the proposed attribute reduction algorithm obtains the reduction set faster than other algorithms. Finally, these algorithms are used to implement a wine quality prediction system.
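To make the redundancy problem in the difference (discernibility) matrix concrete, here is a minimal Python sketch of the textbook construction. It is not the binary linked list method proposed in the thesis, whose details the abstract does not give. The function names and the toy decision table are hypothetical, chosen only for illustration. Each element is the set of condition attributes on which two objects with different class labels differ; singleton elements give the core, and any element that is a proper superset of another carries no extra information.

```python
from itertools import combinations


def discernibility_elements(rows, decisions):
    """Collect the non-empty discernibility elements of a decision table.

    rows: list of tuples of discrete condition-attribute values.
    decisions: list of class labels, one per row.
    Each element is a frozenset of attribute indices on which two
    objects with different class labels differ.
    """
    elements = set()  # using a set drops duplicate elements automatically
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue  # only objects from different classes must be told apart
        diff = frozenset(
            k for k, (a, b) in enumerate(zip(rows[i], rows[j])) if a != b
        )
        if diff:
            elements.add(diff)
    return elements


def core(elements):
    """Attributes that occur as singleton elements form the core."""
    return {next(iter(e)) for e in elements if len(e) == 1}


def absorb(elements):
    """Drop every element that is a proper superset of another element."""
    return {e for e in elements if not any(o < e for o in elements)}


# Hypothetical toy table: three condition attributes, two classes.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 1)]
decisions = ["a", "b", "b", "a"]

elements = discernibility_elements(rows, decisions)
core_attributes = core(elements)
reduced_elements = absorb(elements)
```

In this toy table four elements are generated, two of which are supersets of singleton elements and are therefore removed by `absorb`. Storing only the deduplicated, absorbed elements is one common way to cut the storage cost the abstract refers to.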
Keywords/Search Tags: Rough Set, Attribute Reduction, Discretization, Binary Ant Colony, Binary List