
Research On DTRS Attribute Reduction Considering Classification Cost And Precision

Posted on: 2019-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Chen
GTID: 2428330545467619
Subject: Computer software and theory
Abstract/Summary:
Every day, the world produces a huge amount of data. These data differ from earlier data not only in size and complexity but also in being uncertain, inconsistent, and ambiguous, so obtaining knowledge from them efficiently has become an urgent problem. Decision-theoretic rough sets (DTRS), a classification technique in data mining, handle such problems effectively and have therefore attracted the attention of many researchers. Current research on decision-theoretic rough sets falls mainly into two categories: (1) work that does not consider classification cost; and (2) cost-sensitive work, which considers the cost and seeks to minimize it. In the first category, the goal is to obtain the attribute set with the highest classification precision. In the second, the goal is to obtain the attribute set with the lowest classification cost. Reduction of the second kind therefore leaves fewer attributes, but it also brings the problem of low classification precision. In practical applications it is necessary to reduce the cost and the number of attributes appropriately, but classification precision is undoubtedly more important. This thesis therefore focuses on the balance between classification cost and classification precision in attribute reduction for decision-theoretic rough sets. The main research work completed is as follows:

1. A thorough literature survey was made of the research status and development trends of rough set attribute reduction in China and abroad, in order to understand the research frontier of the field and determine the research topic of the thesis.

2. Attribute reduction under a classification cost constraint is studied. The classification cost in cost-sensitive attribute reduction mainly comprises the misclassification cost, the test cost, or a total cost that includes both. Cost-sensitive reduction yields the attribute subset with the lowest cost, but the classification precision of such a subset is often not high. In view of this, the thesis considers the balance between classification cost and classification accuracy and proposes an attribute reduction algorithm (abbreviated ARAIM) based on attribute importance degree and risk decision rough sets under a classification cost constraint. The algorithm follows a greedy strategy: at each step it selects the attribute with the highest importance, and if adding that attribute still satisfies the classification cost constraint and improves the approximate classification quality, the attribute is added to the reduced attribute set. Experiments show that, under the cost constraint, the algorithm finds an attribute set with better approximate classification quality, and that this quality differs only slightly from that of the attribute reduction set obtained without considering cost.

3. The problem of attribute reduction with the highest classification precision under a classification cost constraint is studied. The ARAIM algorithm above yields an attribute set with good classification precision under the cost constraint, but it cannot guarantee the attribute set with the highest classification precision. To solve this problem, the thesis considers classification cost and precision together, uses simulated annealing for search and optimization, and proposes a decision-theoretic rough set attribute reduction algorithm (ARACOQ) based on cost sensitivity and approximate classification quality. The algorithm uses simulated annealing to explore random combinations of attributes, searching for attribute reduction sets that satisfy the constraint and have the highest classification precision. Experimental results show that the ARACOQ algorithm can find, in polynomial time, attribute reduction sets with the highest classification precision that satisfy the classification cost constraint.
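
The two search strategies described in items 2 and 3 can be illustrated with a minimal Python sketch. It is not the thesis's ARAIM or ARACOQ implementation. It makes several simplifying assumptions: the approximate classification quality is the classic Pawlak measure (the fraction of objects whose equivalence class carries a single decision label) rather than a DTRS probabilistic measure; the cost is a per-attribute test cost with a fixed budget; and attribute importance is taken to be the quality obtained after adding the attribute. All names (`quality`, `greedy_reduce`, `anneal_reduce`, `ROWS`, `COSTS`) and the toy data are hypothetical.

```python
import math
import random
from collections import defaultdict

# Toy decision table (assumed data): three condition attributes, one decision label.
ROWS = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1), (1, 1, 0), (0, 0, 1)]
LABELS = [0, 0, 1, 1, 0, 0]
COSTS = [2, 1, 2]  # assumed test cost of each attribute


def quality(rows, labels, attrs):
    """Fraction of objects whose equivalence class under `attrs` has one label."""
    seen_labels = defaultdict(set)
    sizes = defaultdict(int)
    for row, y in zip(rows, labels):
        key = tuple(row[a] for a in sorted(attrs))
        seen_labels[key].add(y)
        sizes[key] += 1
    consistent = sum(sizes[k] for k, ys in seen_labels.items() if len(ys) == 1)
    return consistent / len(rows)


def total_cost(attrs, costs):
    return sum(costs[a] for a in attrs)


def greedy_reduce(rows, labels, costs, budget):
    """Greedy selection: try the most important attribute first, keep it only
    if the budget still holds and the quality improves."""
    selected, remaining = [], set(range(len(costs)))
    current = quality(rows, labels, selected)
    while remaining:
        # Highest importance among remaining attributes; ties go to the lowest index.
        best = max(sorted(remaining),
                   key=lambda a: quality(rows, labels, selected + [a]))
        remaining.remove(best)
        gained = quality(rows, labels, selected + [best])
        within_budget = total_cost(selected, costs) + costs[best] <= budget
        if within_budget and gained > current:
            selected.append(best)
            current = gained
    return selected, current


def anneal_reduce(rows, labels, costs, budget,
                  steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over attribute subsets; infeasible subsets are rejected."""
    rng = random.Random(seed)
    state = set()
    q_state = quality(rows, labels, state)
    best, q_best = set(state), q_state
    temp = t0
    for _ in range(steps):
        # Neighbour: flip the membership of one randomly chosen attribute.
        candidate = state ^ {rng.randrange(len(costs))}
        if total_cost(candidate, costs) <= budget:
            q_cand = quality(rows, labels, candidate)
            delta = q_cand - q_state
            # Always accept improvements; accept worse moves with Boltzmann probability.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                state, q_state = candidate, q_cand
                if q_state > q_best:
                    best, q_best = set(state), q_state
        temp *= cooling
    return sorted(best), q_best
```

Calling `greedy_reduce(ROWS, LABELS, COSTS, budget)` and `anneal_reduce(ROWS, LABELS, COSTS, budget)` with the same budget lets the two results be compared. The greedy routine commits to each attribute permanently, whereas the annealing routine can drop an attribute again, which is the property that lets it escape subsets that no single addition improves.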
Keywords/Search Tags: Decision theoretic rough sets (DTRS), Classification cost, Attribute reduction, Uncertainty measure, Precision, Approximate classification quality, Simulated annealing algorithm