Research Of Categorization Algorithm Based On Rough Sets Theory Attributes Reduction

Posted on: 2013-04-15  Degree: Master  Type: Thesis
Country: China  Candidate: Y H Du  Full Text: PDF
GTID: 2248330371497323  Subject: Computer application technology
Abstract/Summary:
Data mining provides an effective means of analysing large amounts of information, and classification is one of its important tasks. In recent years, rough sets (RS) theory and its applications have received considerable attention and have been applied in categorization algorithms. Rough sets theory, introduced by Professor Pawlak in the early 1980s, has proved to be an excellent mathematical tool for dealing with uncertain and vague descriptions of objects. Its basic idea is to derive categorization rules for concepts through knowledge reduction while keeping the classification ability unchanged. This paper focuses on attribute reduction in RS theory, and we prove that calculating a reduction and the minimum reduction is an NP-hard problem. First, we analyse the basic algorithms and the heuristic reduction algorithm based on the discernibility matrix, and present an improved heuristic algorithm based on the discernibility matrix, which experimental analysis shows to be effective. In addition, building on the traditional discernibility matrix, we present an improved discernibility matrix: the decision table is reduced before the discernibility matrix is constructed, which avoids blank elements in the matrix and thereby reduces memory use and running time. On the basis of the improved discernibility matrix, we present a new attribute reduction algorithm. By calculating the minimum indistinguishable attribute object instead of the attribute object with strong distinguishing ability, we reduce the computational complexity.
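As an illustration of the discernibility-matrix idea discussed above, the following is a minimal Python sketch of a generic greedy heuristic. It is not the improved algorithm proposed in the thesis, whose details are not given here. The function names `discernibility_entries` and `greedy_reduct` and the toy decision table are assumptions made only for this example.

```python
from itertools import combinations


def discernibility_entries(rows, decisions):
    """Return the non-empty entries of a discernibility matrix.

    For every pair of objects with different decision values, the entry is
    the set of condition-attribute indices on which the two objects differ.
    """
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue  # same decision class: nothing to discern
        diff = frozenset(
            a for a, (x, y) in enumerate(zip(rows[i], rows[j])) if x != y
        )
        if diff:  # an empty set would mean two inconsistent objects; skip it
            entries.append(diff)
    return entries


def greedy_reduct(entries):
    """Greedy heuristic: repeatedly pick the attribute that appears in the
    most remaining entries until every entry contains a chosen attribute.

    The result discerns every pair, but it is not guaranteed to be minimal.
    """
    remaining = list(entries)
    reduct = set()
    while remaining:
        counts = {}
        for entry in remaining:
            for attribute in entry:
                counts[attribute] = counts.get(attribute, 0) + 1
        # sorted() makes tie-breaking deterministic (lowest index wins)
        best = max(sorted(counts), key=lambda a: counts[a])
        reduct.add(best)
        remaining = [entry for entry in remaining if best not in entry]
    return reduct


# Hypothetical toy decision table: three condition attributes per object.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 1, 0, 1]

print(greedy_reduct(discernibility_entries(rows, decisions)))  # prints {1}
```

In this toy table the decision depends only on the attribute with index 1, so the heuristic keeps that single attribute. Because finding a minimum reduction is NP-hard, a greedy choice of this kind trades optimality for speed.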
Experiments show that the improved calculation method enhances classification precision. A hybrid text categorization recognition model is presented that combines RS theory with a BP neural network. Classification of the Chinese text corpus provided by Ronglu LI of Fudan University shows that the text classification algorithm combining RS theory with a BP neural network achieves a high precision rate, recall rate and F1 value.
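For reference, the precision rate, recall rate and F1 value mentioned above are commonly computed per category as in the following minimal Python sketch. The function name `precision_recall_f1` and the example labels are hypothetical and are not taken from the thesis or its corpus.

```python
def precision_recall_f1(true_labels, predicted_labels, target):
    """Compute precision, recall and F1 for one target category."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)

    # Guard against division by zero when a category is never predicted
    # or never occurs in the true labels.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels for four documents.
true_labels = ["sports", "sports", "economy", "sports"]
predicted_labels = ["sports", "economy", "economy", "sports"]

precision, recall, f1 = precision_recall_f1(true_labels, predicted_labels, "sports")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so it is high only when both are high.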
Keywords/Search Tags: Rough sets, Discernibility Matrix, Attributes Reduction, Neural network, Text categorization recognition