
Study On Cost-sensitive Based Attribute Reduction Of Decision-theoretic Rough Sets

Posted on: 2018-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: C Liu
Full Text: PDF
GTID: 2348330542983635
Subject: Computer software and theory
Abstract/Summary:
The key techniques in classification with decision-theoretic rough sets (DTRS) include data preprocessing, attribute reduction, and the classification algorithm. Among these, attribute reduction is especially important, because its quality largely determines the final classification result. This dissertation therefore focuses on two key techniques in classification: attribute reduction for decision-theoretic rough sets and for fuzzy decision-theoretic rough sets, and the cost-sensitive method, which has practical significance.

Drawing on the theory of cost sensitivity, a cost-sensitive method covering both test cost and error cost is integrated into positive region reduction for decision-theoretic rough sets (PRDTRS) and combined with the simulated annealing algorithm, yielding the proposed TCSPR (Test-Cost Sensitive Positive Region-based reduction algorithm for DTRS) algorithm. The experimental results show that TCSPR can find a positive region reduct with fewer attributes and lower cost in polynomial time.

The classical PRDTRS algorithm for attribute reduction in DTRS can only handle discrete data. The QuickReduct algorithm for positive region attribute reduction in fuzzy sets has shortcomings such as a long running time, no consideration of attribute costs, and a purely mathematical view of reduction. To address these shortcomings, a fuzzy decision-theoretic rough set attribute reduction algorithm named COSAR is proposed, which combines cost sensitivity with fuzzy theory. In COSAR, a similarity membership function is introduced into the computation of the upper and lower approximation sets of decision-theoretic rough sets, and the cost-sensitive method is then applied to perform cost-sensitive attribute reduction for fuzzy decision-theoretic rough sets. Simulation experiments verify that a reduced attribute set with lower total cost can be found in a shorter time.

Building on a study of email classification techniques and on the TCSPR algorithm, the TCSPR email classification method is proposed. Using the Spambase email dataset from the UCI database, a simulation experiment is run on training and test samples drawn at random from Spambase. The results show that the TCSPR email classification algorithm not only effectively reduces the test cost of acquiring the characteristic words, but also saves considerable time while still classifying email correctly. The experiment further verifies that the TCSPR email classification algorithm is a novel and effective method that improves classification efficiency.
Keywords/Search Tags: Decision Theoretic Rough Sets (DTRS), Fuzzy decision theoretic rough sets (FDTRS), Cost-sensitive, Attribute reduction
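
Illustrative sketch (not taken from the thesis): the abstract describes searching for a positive region reduct with low test cost by combining positive region reduction with simulated annealing. The Python code below is a minimal sketch of that general idea on a toy discrete decision table, and it should not be read as the TCSPR algorithm itself. The function names (`positive_region`, `anneal_reduct`), the toy data, the threshold `alpha`, the penalty weight, and the cooling schedule are all assumptions made for illustration. Error cost is not modeled; only a per-attribute test cost is.

```python
import math
import random
from collections import defaultdict


def positive_region(rows, decisions, attrs, alpha=1.0):
    """Objects whose equivalence class under `attrs` has a decision
    with conditional probability >= alpha (alpha=1.0 is the classical case)."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in sorted(attrs))].append(i)
    pos = set()
    for members in blocks.values():
        counts = defaultdict(int)
        for i in members:
            counts[decisions[i]] += 1
        if max(counts.values()) / len(members) >= alpha:
            pos.update(members)
    return pos


def anneal_reduct(rows, decisions, costs, alpha=1.0, steps=2000,
                  t0=5.0, cooling=0.995, penalty=100.0, seed=0):
    """Simulated annealing over attribute subsets (hypothetical settings).
    Energy = total test cost + penalty * number of lost positive-region objects."""
    rng = random.Random(seed)
    n = len(costs)
    full = frozenset(range(n))
    target = len(positive_region(rows, decisions, full, alpha))

    def cost(subset):
        return sum(costs[a] for a in subset)

    def shortfall(subset):
        kept = len(positive_region(rows, decisions, subset, alpha))
        return max(0, target - kept)

    def energy(subset):
        return cost(subset) + penalty * shortfall(subset)

    current, best = full, full  # the full attribute set is always feasible
    e_cur = energy(current)
    temp = t0
    for _ in range(steps):
        # Neighbour: add or drop one randomly chosen attribute.
        candidate = current ^ frozenset({rng.randrange(n)})
        e_new = energy(candidate)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if e_new <= e_cur or rng.random() < math.exp((e_cur - e_new) / temp):
            current, e_cur = candidate, e_new
            if shortfall(current) == 0 and cost(current) < cost(best):
                best = current
        temp *= cooling
    return best


# Toy table (assumed data): decision = a0 XOR a1, a2 duplicates the decision
# but is expensive to test, a3 is noise.
ROWS = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 0),
        (1, 1, 0, 1), (0, 0, 0, 1), (1, 0, 1, 1)]
DECISIONS = [0, 1, 1, 0, 0, 1]
COSTS = [2, 3, 10, 1]

reduct = anneal_reduct(ROWS, DECISIONS, COSTS)
print(sorted(reduct), sum(COSTS[a] for a in reduct))
```

The search only ever records subsets that keep the positive region as large as that of the full attribute set, so the returned subset is feasible by construction; whether it is the cheapest one depends on the random walk and the assumed schedule.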